Abstract
After a U.S. Coast Guard (USCG) search and rescue (SAR) case, USCG personnel write an after-action report containing a textual narrative of the situation and the Coast Guard response. Data analysts explored how to identify reports describing cases with a verified person in the water. Because restricted access to compute resources and restrictive policy ruled out large language models (LLMs), statistical (‘classical’, non-neural) methods were used to train a classification model that identifies SAR case outcomes from report text. The dataset was severely imbalanced toward the negative class, and the texts were extremely noisy, with many typos and abbreviations, so an extensive text-cleaning pipeline was developed and evaluated for its effect on classification performance. The Iterative Token Elimination Algorithm (iTEA) was developed to increase vocabulary differences between classes, and model improvement was further explored by augmenting the feature space with non-text data. The best model, an XGBoost classifier, achieved 0.762 recall and precision (and 0.959 accuracy). Errors on the test set are analyzed to guide future improvements until LLMs can be adopted, which is expected to improve performance and reduce text-cleaning requirements.
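The abstract does not specify iTEA's mechanics, so the stdlib-only sketch below illustrates just one plausible reading of "increasing vocabulary differences between classes": dropping tokens whose relative frequency is similar in both classes, keeping only discriminative vocabulary. The function names (`eliminate_common_tokens`, `filter_doc`) and the `ratio_band` threshold are hypothetical, not taken from the paper.

```python
from collections import Counter

def eliminate_common_tokens(pos_docs, neg_docs, ratio_band=(0.5, 2.0)):
    """Return the set of tokens whose per-class relative frequencies
    differ enough to discriminate between classes; tokens with a
    frequency ratio inside ratio_band are eliminated."""
    pos_counts = Counter(tok for d in pos_docs for tok in d.split())
    neg_counts = Counter(tok for d in neg_docs for tok in d.split())
    pos_total = sum(pos_counts.values())
    neg_total = sum(neg_counts.values())
    keep = set()
    for tok in set(pos_counts) | set(neg_counts):
        # Add-one smoothing so tokens unseen in one class get a finite ratio.
        p = (pos_counts[tok] + 1) / (pos_total + 1)
        n = (neg_counts[tok] + 1) / (neg_total + 1)
        ratio = p / n
        if not (ratio_band[0] <= ratio <= ratio_band[1]):
            keep.add(tok)
    return keep

def filter_doc(doc, keep):
    """Remove eliminated tokens from a document before vectorization."""
    return " ".join(tok for tok in doc.split() if tok in keep)
```

On this view, the surviving vocabulary would then feed a bag-of-words or TF-IDF representation for the downstream XGBoost classifier; the actual iTEA criterion and its iteration scheme may differ.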
