Sage Journals: Discover world-class research

Abstract

Named entity recognition (NER) is a core task in natural language processing that identifies and classifies entities such as people, organizations, and locations in text. It has traditionally been applied in areas such as text summarization, machine translation, and question answering. In recent years, NER has become increasingly important in health care, where electronic clinical records and online platforms generate large volumes of unstructured medical data. Applying NER in clinical contexts, however, poses distinct challenges because medical terminology is complex and high accuracy is required. In this study, we focus on developing a real-time, low-latency NER system for cross-lingual speech-to-text applications, with particular emphasis on clinical records related to cancer therapy and on traditional Chinese medicine (TCM). We explore deep learning (DL) architectures optimized for low-latency neural processing to extract structured information from multilingual spoken content in medical settings, particularly in multimodal environments. We evaluate DL-based methods and propose a semi-supervised approach that combines TCM-specific corpora with biomedical resources to improve recognition accuracy. The findings offer both a systematic review of current methods and practical insights for building real-time clinical applications that support decision-making and information management in health care.
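To make the idea of combining TCM-specific and biomedical resources concrete, the following is a minimal sketch of dictionary-based weak labeling, a common way to seed semi-supervised NER training. It is not the method described in the article: the lexicon entries, the labels (`HERB`, `DISEASE`, `TREATMENT`), and the function name `weak_bio_tags` are all illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicons; entries and labels are illustrative assumptions,
# not resources named in the article.
TCM_LEXICON: Dict[Tuple[str, ...], str] = {
    ("hedyotis", "diffusa"): "HERB",
    ("ginseng",): "HERB",
}
BIOMEDICAL_LEXICON: Dict[Tuple[str, ...], str] = {
    ("gastric", "cancer"): "DISEASE",
    ("chemotherapy",): "TREATMENT",
}


def weak_bio_tags(tokens: List[str], lexicon: Dict[Tuple[str, ...], str]) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup against the lexicon."""
    lowered = [t.lower() for t in tokens]
    max_len = max((len(key) for key in lexicon), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-word terms win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = lexicon.get(tuple(lowered[i:i + n]))
            if label is not None:
                tags[i] = f"B-{label}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{label}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Merge both resources into a single lookup table.
combined = {**BIOMEDICAL_LEXICON, **TCM_LEXICON}
tokens = "Patient with gastric cancer received chemotherapy and Hedyotis diffusa".split()
tags = weak_bio_tags(tokens, combined)
```

In a semi-supervised setting, tags produced this way would typically serve as noisy training labels for a neural tagger rather than as final predictions.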

Keywords

deep learning; machine translation; neural processing; real-time NER; textual electronic clinical records; therapies in cancer



Real-Time Named Entity Recognition from Textual Electronic Clinical Records in Cancer Therapy Using Low-Latency Neural Networks
