
Abstract

Background

At present, artificial intelligence (AI) plays a significant role in the diagnosis, treatment, and prognosis of lymphoma. This study quantitatively analyzed the research hotspots and future trends in this field by using bibliometric software. The aim is to offer researchers a full understanding of the current research along with suggestions for future directions.

Methods

All relevant articles of AI in lymphoma research were retrieved from the Web of Science Core Collection database from 2010 to 2024. Bibliometric visualization analysis of all retrieved data was conducted by using the “bibliometrix” package in R (version 4.4.1), CiteSpace (version 6.4.R1), and VOSviewer (version 1.6.20).

Results

Analysis of 662 publications shows that AI in lymphoma research is developing rapidly. The United States and China are ahead of other countries in terms of the number of articles and citations. Jiang Huiyan from Northeastern University (China) and Michel Meignan from Assistance Publique-Hôpitaux de Paris (France) are, respectively, the most prolific and the most highly cited authors in this field. Recent hotspots focus on molecular expression and imaging histology, while the emerging directions are molecular mechanisms and chemotherapeutic strategies in lymphoma.

Conclusions

This study comprehensively analyzes the hotspots and trends of AI research in lymphoma, which is shifting from radiomics toward molecular mechanisms and AI-optimized chemotherapy. Future work should address practical clinical demands and promote the integration of AI into clinical practice.

Keywords

Lymphoma, artificial intelligence, machine learning, bibliometric analysis, VOSviewer, CiteSpace

Introduction

Lymphoma is a category of malignant tumors that originate from lymphatic tissues, primarily comprising Hodgkin lymphoma (HL) and non-Hodgkin lymphoma (NHL). Non-Hodgkin lymphoma can further be classified into B-cell lymphoma, T-cell lymphoma, and NK-cell lymphoma based on cellular origin.¹ Different subtypes of lymphoma exhibit distinct biological behaviors and require different clinical treatments.² In recent years, the global incidence of lymphoma has shown an upward trend, particularly among the elderly.³ According to the 2022 GLOBOCAN statistics, NHL is the 10th most common cancer worldwide and the 11th leading cause of cancer-related deaths.⁴ If the mortality of all lymphoma subtypes were aggregated, the ranking would be even higher. Early diagnosis and effective treatment are equally important for improving patient prognosis. Due to the complex pathological classification and clinical manifestation of lymphoma, traditional diagnostic methods face several challenges. For instance, pathological diagnosis relies on the microscopic judgment of pathologists, which sometimes needs to be combined with immunohistochemistry and fluorescence in situ hybridization (FISH). This reliance on manual interpretation may lead to subjective judgments, thereby impacting the accuracy and reproducibility of diagnosis.⁵

The rapid development of artificial intelligence (AI) has created new opportunities for lymphoma research.⁶ Broadly speaking, AI encompasses machine learning (ML), deep learning (DL), natural language processing (NLP), and other subfields. For example, AI can predict treatment responses by analyzing 18F-FDG PET/CT imaging data of lymphoma patients, which could contribute to optimizing therapeutic strategies.⁷ Researchers have used ML integrated with gene transcriptional profiles and single-cell RNA sequencing data to capture the clinical heterogeneity within existing subtypes of diffuse large B-cell lymphoma (DLBCL). They identified the cellular origins and genetic subtypes of DLBCL, providing opportunities for precise targeted therapies.⁸ Additionally, DL applied in digital pathology can automatically analyze high-resolution tissue slides to assist pathologists in classifying and diagnosing lymphoma.⁹ In clinical decision support, NLP assists researchers in leveraging electronic health records to support studies on lymphoma as well as healthcare decision-making.¹⁰

The first quantitative analysis of anatomical literature was conducted by British scholars F. J. Cole and N. B. Eales in 1917, which is now acknowledged as an early form of bibliometrics. However, it was not until 1969 that Alan Pritchard formally introduced the comprehensive concept of bibliometrics.¹¹ Bibliometrics enables the objective quantification of various aspects of literature, including the interconnections among countries, institutions, authors, journals, and keywords. This approach can identify research hotspots and future trends in a targeted area. As a result, bibliometrics overcomes the limitations of traditional review papers in terms of objectivity.¹² Nevertheless, it was not until the twenty-first century, with the widespread adoption of computer technology and academic databases such as Web of Science (WOS), that large-scale quantitative bibliometric analysis became feasible. Meanwhile, the use of visualization tools, including R, VOSviewer, and CiteSpace, has promoted research into complex networks, such as collaboration networks and co-citation networks.¹³ This paper presents the first bibliometric analysis of AI in lymphoma research. The objective is to comprehensively analyze the current research landscape in this field, identify future developments, and provide valuable references for clinicians and researchers.

Materials and methods

Data source and search strategy

The WOS database is widely recognized as one of the most comprehensive and authoritative databases.¹⁴ The original data for this bibliometric analysis were retrieved from the Web of Science Core Collection (WOSCC). To ensure the comprehensive inclusion of studies related to AI and lymphoma, the data extraction process was conducted as follows: TS = ((“Artificial intelligence” OR “Machine learning” OR “Deep learning” OR “Neural network” OR “Intelligent Systems” OR “Reinforcement Learning” OR “Cognitive Computing”) AND (“lymphoma” OR “Hodgkin disease” OR “non-Hodgkin disease” OR “Hodgkin's disease” OR “non-Hodgkin's disease”)). Database: WOSCC (SCI-Expanded). All the data were exported in plain text format (including title, authors, keywords, abstract, citations, and others). This process was conducted in January 2025 to avoid discrepancies in the results due to updates of this database.
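The topic search above is a conjunction of two OR-groups, one for AI terms and one for lymphoma terms. A minimal sketch of how such a query string could be assembled programmatically is shown below; the helper names `or_group` and `build_topic_query` are hypothetical and are not part of any Web of Science tooling, and straight quotes stand in for the typographic quotes printed above.

```python
# Search terms copied from the search strategy described in the text.
AI_TERMS = [
    "Artificial intelligence", "Machine learning", "Deep learning",
    "Neural network", "Intelligent Systems", "Reinforcement Learning",
    "Cognitive Computing",
]
LYMPHOMA_TERMS = [
    "lymphoma", "Hodgkin disease", "non-Hodgkin disease",
    "Hodgkin's disease", "non-Hodgkin's disease",
]


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_topic_query(group_a, group_b):
    """Combine two OR-groups with AND under the TS (topic) field tag."""
    return f"TS = ({or_group(group_a)} AND {or_group(group_b)})"


query = build_topic_query(AI_TERMS, LYMPHOMA_TERMS)
print(query)
```

Keeping the two term lists separate makes it easy to add or drop a synonym and regenerate the exact string that was submitted, which helps when documenting the retrieval date and strategy.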

To ensure the relevance and quality of the data, we established the following inclusion and exclusion criteria for the literature search:

Inclusion Criteria: (1) Original articles or reviews focusing on the application of AI or ML in lymphoma; (2) Publication date between 1 January 2010 and 31 December 2024; (3) Published in English.

Exclusion Criteria: (1) Studies not related to lymphoma (e.g., other cancer types); (2) Studies where AI/ML was not a core methodology (e.g., only mentioned in introduction); (3) Non-research publications such as conference abstracts, editorials, letters, patents, and book chapters.

Data extraction and analysis

We imported the downloaded data into CiteSpace (6.4.R1), where all duplicate records were removed. The filtered data were then imported into Microsoft Excel (2021), R (4.4.1), CiteSpace (6.4.R1), and VOSviewer (1.6.20) for visualization analysis. We used Microsoft Excel to analyze the annual number of publications (NP) and average citations. Then, we utilized the R package “bibliometrix” to conduct visual analysis through Biblioshiny. Biblioshiny is an R tool developed by Massimo Aria of the University of Naples and Corrado Cuccurullo of the University of Campania.¹⁵ This tool enabled us to investigate national collaborations, core journals, author publication timelines, trending topics, and keyword thematic maps. We further used CiteSpace and VOSviewer to take advantage of their strengths in visualizing network analyses.¹⁶⁻¹⁸ These tools were employed to better visualize the co-occurrence networks of institutions, journals, authors, and clustered keywords. Besides commonly used bibliometric indicators such as total publications (TP) and total citations (TC), we used the H-index to evaluate the scientific output and impact of a researcher; a researcher has an H-index of H when H of their papers have each been cited at least H times, and H is the largest number for which this holds.¹⁹ Multiple country publications (MCP) were used for measuring the level of international cooperation,²⁰ total link strength (TLS) for assessing links between institutions, and Journal Citation Reports (JCR) quartile and journal impact factor (JIF) for assessing publication quality.²¹
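The H-index mentioned above can be computed directly from a list of per-paper citation counts. The following is a minimal sketch of the standard definition (the largest H such that H papers have at least H citations each); the function name and the example counts are illustrative and are not taken from the study's data.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Hypothetical citation counts for one author's papers (not from the study).
example = [25, 8, 5, 3, 3, 1]
print(h_index(example))
```

Because the counts are sorted in descending order, the loop can stop at the first paper whose citation count falls below its rank.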

Results

Annual publication and research trends

After the literature processing, as shown in Figure 1A, 662 publications were included in this bibliometric analysis, comprising 594 research articles and 68 reviews. Figure 1B illustrates the annual number of publications (NP) and the average citations within this field. Based on these two indices, research on this topic can be roughly divided into three phases:

Figure 1.

(A) The process of data extraction and selection. (B) The annual number of publications (the left side) and the yearly average citation (the right side) in the field of artificial intelligence in lymphoma research.

Initial phase (2010–2014): During this period, fewer than five papers were published annually, with citations also being relatively low. The average citation over these years was approximately 2.16.

Steady growth phase (2015–2019): This period showed a rapid increase in the NP, indicating an influx of new researchers into the field. The average citation rose to 3.48, suggesting an improvement in the quality and impact of the literature compared to the initial phase.

Rapid expansion phase (2020–2024): The annual publication number surged from 55 to a peak of 166, demonstrating significant acceleration in research about AI in lymphoma. Meanwhile, the average citation reached its peak at 5.08 in 2020, which highlights the substantial influence of several key publications that year.

Analysis of countries/regions

A total of 66 countries published papers about AI research in lymphoma during this study period. The top 10 countries in terms of NP and citations are listed in Table 1. The top three countries by the NP were China (196), the United States (115), and Germany (41). The countries with the highest citation counts are the United States (2415), China (1688), and Germany (786). Figure 2A shows a chord diagram of international collaborations. Both the United States and China produce far more publications than other countries and show a wide range of international collaborations. The most frequent collaborations are observed between China, the United States, France, and Germany. Among the top 10 countries in terms of the NP, China has a low proportion of MCP (14.8%), while Germany, France, and the United Kingdom show higher proportions of MCP (46.3%, 52.9%, 52.4%, respectively), as shown in Figure 2B. Figure 2C shows the trends of national scientific output. The United States was far ahead of other countries in the early stage, but China's output has since grown sharply, surpassing the United States in 2023 and leading since then, which is in line with the growth of China's AI technology in recent years. Germany, France, Italy, and other countries have maintained steady but slow growth in publications, with an average annual growth rate of 2–3%.

Figure 2.

Analysis of countries in this research field. (A) This is a collaboration map between countries regarding their publication outputs. The larger the area, the greater the number of publications. The thicker the lines connecting the countries, the higher the intensity of collaboration between them. (B) SCP and MCP for different countries. SCP refers to Single-Country Publications, whereas MCP stands for Multi-Country Publications. (C) This chart illustrates the trend of document output in this research field among different countries. China and the United States are the fastest-growing nations among all the countries.
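The MCP proportion reported for each country can be read as the share of that country's publications that are multi-country. A minimal sketch follows, assuming the proportion is computed as MCP / (SCP + MCP); the SCP/MCP split in the example is hypothetical, chosen only so that the total equals 196, since the text reports the proportion but not the underlying counts.

```python
def mcp_ratio(scp, mcp):
    """Share of a country's publications that are multi-country publications."""
    total = scp + mcp
    if total == 0:
        raise ValueError("country has no publications")
    return mcp / total


# Hypothetical split of 196 publications into SCP and MCP (not from the study).
print(f"{mcp_ratio(scp=167, mcp=29):.1%}")
```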

Table 1.

Top 10 countries in this field.

Country	TP	Rank	Country	TC
China	196	1	USA	2415
USA	115	2	China	1688
Germany	41	3	Germany	786
France	34	4	France	490
Japan	33	5	Japan	470
Italy	26	6	Korea	405
India	21	7	Italy	300
United Kingdom	21	8	Sweden	267
Korea	20	9	United Kingdom	235
Spain	18	10	Spain	225

TP: total publications; TC: total citations.
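The TP and TC columns of Table 1 can be combined into a rough citations-per-publication figure for the countries that appear in both rankings. The sketch below simply divides TC by TP; it assumes both columns count the same set of papers per country, which the table does not state explicitly, so the results are indicative only.

```python
# TP and TC values copied from Table 1; India and Sweden appear in only one
# ranking and are therefore omitted.
TP = {"China": 196, "USA": 115, "Germany": 41, "France": 34, "Japan": 33,
      "Italy": 26, "United Kingdom": 21, "Korea": 20, "Spain": 18}
TC = {"USA": 2415, "China": 1688, "Germany": 786, "France": 490, "Japan": 470,
      "Korea": 405, "Italy": 300, "United Kingdom": 235, "Spain": 225}


def citations_per_publication(tp, tc):
    """Return {country: TC / TP} for countries present in both tables."""
    return {c: tc[c] / tp[c] for c in tp if c in tc}


ratios = citations_per_publication(TP, TC)
for country, value in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country:15s} {value:6.2f}")
```

Under these figures the United States averages about 21 citations per publication and China about 8.6.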

Analysis of institutions

A total of 1584 institutions are involved in AI and lymphoma research. Table 2 lists the top 10 institutions in terms of the NP. The Chinese Academy of Sciences and Northeastern University in China are the most productive institutions (both TP = 18), followed by the University of Texas MD Anderson Cancer Center in the United States (TP = 15). In terms of TC, Stanford University (TC = 463), the Chinese Academy of Sciences (TC = 349), and the University of Texas MD Anderson Cancer Center (TC = 242) rank as the top three. A co-occurrence network (Figure 3A), generated by VOSviewer, is used to analyze institutions with six or more publications. The network consists of 38 nodes, which are classified into eight different clusters. Northeastern University (TLS = 81), Shanghai Jiao Tong University (TLS = 65), and the Chinese Academy of Sciences (TLS = 51) have the three highest TLS values. Figure 3B depicts the timeline of institutional publications, where the color gradient indicates the time of publication. It is clear that most of the institutions’ research findings were published after 2018, which may be related to the fact that some remarkable breakthroughs in AI have only been made in recent years. Columbia University and the University of Toronto are regarded as early pioneers in the exploration of AI in lymphoma research. The Chinese Academy of Sciences and Stanford University are emerging as leading institutions, indicating changes in research outputs among different institutions over time.

Figure 3.

Analysis of institution collaboration. (A) This figure illustrates the collaboration and influence among different institutions. The size of the node represents the total citation count of articles published by the institution, while the thickness of the edges indicates the total link strength (TLS) of the collaborative relationships between institutions. (B) Building on the previous figure, publication time information has been incorporated. The color of the nodes indicates the active publication time of institutions, with redder colors representing more recent publications.

Table 2.

Top 10 contributing institutions in this field.

Rank	Institution	Country	TP	TC	TLS
1	Chinese Academy of Sciences	China	18	349	51
2	Northeastern University	China	18	164	81
3	The University of Texas MD Anderson Cancer Center	USA	15	242	45
4	Shanghai Jiao Tong University	China	13	183	65
5	Stanford University	USA	13	463	30
6	Sun Yat-sen University	China	13	85	19
7	Mayo Clinic	USA	12	124	22
8	Memorial Sloan Kettering Cancer Center	USA	12	178	36
9	Fudan University	China	11	116	49
10	Sichuan University	China	11	180	38

TP: total publications; TC: total citations; TLS: total link strength (TLS measures the strength of collaborative relationships between institutions in the co-occurrence network. It is calculated by summing the link strengths of all connections between an institution and other institutions in the network.).
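The TLS definition in the table note (the sum of the strengths of all links attached to an institution) can be sketched in a few lines. The institution names and link strengths below are hypothetical placeholders rather than values from the study, and the function is an illustration of the definition, not VOSviewer's own implementation.

```python
from collections import defaultdict


def total_link_strength(links):
    """Sum, for each node, the strengths of all links that touch it."""
    tls = defaultdict(int)
    for (a, b), strength in links.items():
        tls[a] += strength
        tls[b] += strength
    return dict(tls)


# Hypothetical links: (institution, institution) -> number of shared papers.
links = {
    ("Inst A", "Inst B"): 3,
    ("Inst A", "Inst C"): 1,
    ("Inst B", "Inst C"): 2,
}
print(total_link_strength(links))
```

Each link contributes its strength to both endpoints, so an institution with many moderately strong partnerships can have a higher TLS than one with a single strong partnership.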

Journal analysis

Normalized citation (Norm.Citations) refers to the citation value of a journal that has been normalized across disciplines and over time. Norm.Citations serve as a better tool for cross-field comparison of journal impact than TC. A total of 311 journals were ranked based on Norm.Citations, and the top 10 journals published 143 papers, accounting for about 21.6% of the TP (Table 3). Among these ranked journals, the top-ranked one is Cancers, although its 2024 JIF is only 4.4, indicating a relatively modest general reputation despite its prominence in this research area. It is closely followed by the European Journal of Nuclear Medicine and Molecular Imaging (JIF2024 = 7.6) and the Journal of Nuclear Medicine (JIF2024 = 9.1), both of which focus on nuclear medicine and molecular imaging. These two journals mainly publish high-quality articles on basic research and technological innovations in radiology. In addition, 24 journals with at least five publications were analyzed through a co-occurrence network of academic journals, which were clustered into five different groups with a total of 95 links, as shown in Figure 4A. The red cluster primarily comprises journals at the intersection of oncology and AI, the blue cluster focuses on computer science and diagnostic radiology research, and the green cluster encompasses the interdisciplinary field of radiology. Figure 4B illustrates the timeline of journal publications, with the color gradient representing the publication time. It clearly shows that most journals publishing research on AI and lymphoma are concentrated in the period after 2020, which is consistent with the findings in Figure 1B.

Figure 4.

Analysis of contributing journals. (A) Each node represents a journal, and the size of the node indicates the normalized citation (norm. citation) of that journal within this research field. (B) Building on the previous graph, temporal information about journal activity has been incorporated. The redder the color, the closer the year is to the present.

Table 3.

Top 10 influential journals in this field.

Rank	Journal	Norm.Citations	TP	JCR	JIF2024
1	Cancers	32.23	32	Q2	4.4
2	European Journal of Nuclear Medicine and Molecular Imaging	26.54	13	Q1	7.6
3	Journal of Nuclear Medicine	23.08	8	Q1	9.1
4	Nature Communications	18.67	6	Q1	15.7
5	Frontiers in Oncology	17.67	30	Q2	3.3
6	European Radiology	16.73	11	Q1	4.7
7	Scientific Reports	11.43	13	Q1	3.9
8	Computer Methods and Programs in Biomedicine	9.72	9	Q2	4.8
9	Computers in Biology and Medicine	9.31	11	Q1	6.3
10	PLOS ONE	7.05	10	Q2	2.6

TP: total publications; JCR: Journal Citation Reports; JIF2024: Journal Impact Factor of 2024.
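The share of the corpus published in the top 10 journals can be checked against the TP column of Table 3. The short sketch below sums that column and divides by the 662 publications analyzed; the variable names are illustrative.

```python
# TP column of Table 3 (top 10 journals by normalized citations).
top10_tp = [32, 13, 8, 6, 30, 11, 13, 9, 11, 10]
TOTAL_PUBLICATIONS = 662  # size of the analyzed corpus

top10_total = sum(top10_tp)
share = top10_total / TOTAL_PUBLICATIONS
print(top10_total, f"{share:.1%}")
```

The sum and the percentage agree with the 143 papers and roughly 21.6% stated in the text.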

Analysis of authors

A total of 4548 authors contributed to the findings. We evaluated the most influential authors in the field over the past 15 years based on the NP, local citations (LCS), and H-index (Table 4). Professor Jiang Huiyan from Northeastern University, China, who has published 11 papers, ranks first both in the NP and the H-index. Her primary research area is the application of AI technology in medicine. Professor Michel Meignan of Assistance Publique-Hôpitaux de Paris, France, ranks first in terms of LCS. His research mainly focuses on medical imaging and tumor therapy. Figure 5A illustrates the publication output of several high-impact authors over time. Professor Hans Binder of Germany has been working at the intersection of computer science and medicine since 2015, continuing his work through 2022, making him a preeminent leader in AI and lymphoma research. Over the past five years, a group of authors including Jiang Huiyan and Catherine Thieblemont have continued to publish influential literature, gradually consolidating their prominent positions in this area. The clustered network analysis of collaborating authors is shown in Figure 5B. Highly productive authors collaborate closely: Catherine Thieblemont and Michel Meignan, as well as Ding Chongyang and Jiang Chong, have established strong partnerships and published influential articles in their respective research areas.

Figure 5.

Key author analysis in this research field. (A) Research activity and article number of different authors within this period. The size of the dots represents the number of articles published by the author in a specific year, while the color of the dots indicates the average citation count for that year. (B) Cooperation map of authors. The size of the nodes reflects the number of papers published by the researchers, and the curves indicate collaborations between researchers. Nodes of the same color represent researchers who have collaborated on multiple projects or papers.

Table 4.

The most influential authors in the field.

Rank	Author	NP	Author	LCS	Author	H-index
1	Jiang Huiyan	11	Michel Meignan	85	Jiang Huiyan	6
2	Wang Lei	6	Ludovic Sibille	68	Wang Lei	6
3	You Zhu-Hong	6	Bruce Spottiswoode	62	You Zhu-Hong	6
4	Catherine Thieblemont	6	Sven Zuehlsdorff	62	Catherine Thieblemont	5
5	Ding Chongyang	6	Catherine Thieblemont	51	Hans Binder	5
6	Jiang Chong	6	Olivier Casasnovas	49	Li Biao	5
7	Hans Binder	5	Steven Le Gouill	46	Henry Loeffler-Wirth	5
8	Li Biao	5	Laetitia Vercellino	45	Michel Meignan	5
9	Henry Loeffler-Wirth	5	Irène Buvat	44	Heiko Schoeder	5
10	Michel Meignan	5	Anne-Ségolène Cottereau	44	Ding Chongyang	4

NP: number of publications; LCS: local citations.

Analysis of significant publications

We conducted an analysis of the top 10 most frequently cited publications to deepen our understanding of this research area (Table 5). Among these most cited works, an article by Andrew Janowczyk published in the Journal of Pathology Informatics titled “Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases” shows the highest number of citations (TC = 791, TC per Year = 87.89) in this field. Although this article was published in a less prominent journal, it has been widely accepted and cited for its quality. It is followed by “Precision medicine for human cancers with Notch signaling dysregulation” by Masuko Katoh in International Journal of Molecular Medicine (2020), and “The landscape of tumor cell states and ecosystems in diffuse large B cell lymphoma” by Chloé B Steen in Cancer Cell (2021), with average annual citations of 25.8 and 28.6, respectively. Most of these highly cited articles were published after 2020, indicating that the interdisciplinary cooperation between oncology and AI has been a research hotspot in recent years. It can also be observed that AI, radiomics, and molecular typing are the main research topics at present, and most of the highly cited papers are published in top journals such as Radiology and Cancer Cell.

Table 5.

Top 10 highest citation publications in the field.

Paper	Year	Journal	First author	TC	TC per Year
Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases	2016	Journal of Pathology Informatics	Andrew Janowczyk	791	87.89
Precision medicine for human cancers with Notch signaling dysregulation	2020	International Journal of Molecular Medicine	Masuko Katoh	155	25.83
The landscape of tumor cell states and ecosystems in diffuse large B cell lymphoma	2021	Cancer Cell	Chloé B Steen	143	28.60
18F-FDG PET/CT Uptake Classification in Lymphoma and Lung Cancer by Using Deep Convolutional Neural Networks	2020	Radiology	Ludovic Sibille	142	23.67
Co-Learning Feature Fusion Maps from PET-CT Images of Lung Cancer	2020	IEEE Transactions on Medical Imaging	Ashnil Kumar	137	22.83
Pretreatment gut microbiome predicts chemotherapy-related bloodstream infection	2016	Genome Medicine	Emmanuel Montassier	135	13.50
Diffusion radiomics as a diagnostic model for atypical manifestation of primary central nervous system lymphoma: development and multicenter external validation	2018	Neuro-Oncology	Daesung Kang	108	13.50
Primary central nervous system lymphoma and atypical glioblastoma: Differentiation using radiomics approach	2018	European Radiology	Hie Bum Suh	103	12.88
Novel Human miRNA-Disease Association Inference Based on Random Forest	2018	Molecular Therapy-Nucleic Acids	Xing Chen	97	12.13
Molecular subtyping of cancer: current status and moving toward clinical applications	2019	Briefings in Bioinformatics	Lan Zhao	92	13.14
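The "TC per Year" column can be approximated as total citations divided by the number of years since publication. The sketch below assumes an inclusive window from the publication year through 2025 (the retrieval year); this convention is an assumption rather than something stated in the text. It reproduces the reported values for rows 2 to 10 of Table 5 to within rounding. The first row (791 citations, 2016) matches the reported 87.89 only with a nine-year window, so it is handled separately.

```python
REFERENCE_YEAR = 2025  # assumed: data were retrieved in January 2025


def tc_per_year(tc, year, reference_year=REFERENCE_YEAR):
    """Average citations per year over an inclusive window of years."""
    return tc / (reference_year - year + 1)


# (publication year, TC, reported TC per year) for rows 2-10 of Table 5.
rows = [
    (2020, 155, 25.83), (2021, 143, 28.60), (2020, 142, 23.67),
    (2020, 137, 22.83), (2016, 135, 13.50), (2018, 108, 13.50),
    (2018, 103, 12.88), (2018, 97, 12.13), (2019, 92, 13.14),
]
for year, tc, reported in rows:
    print(year, tc, f"{tc_per_year(tc, year):.3f}", reported)

# First row of Table 5: the reported value corresponds to a nine-year window.
print(f"{tc_per_year(791, 2016, reference_year=2024):.3f}")
```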

Keywords analysis

Keyword analysis is an important method for identifying central themes in research areas. First, we merged synonyms for consistency, such as replacing “f-18-fdg pet”, “f-18-fdg pet/ct”, and “fdg-pet” with “fdg-pet/ct”. We extracted 1466 keywords from the 662 documents. With a minimum occurrence threshold of 5, a total of 142 keywords were obtained for cluster analysis. Figure 6A shows a visualization of keyword density, where color moving toward red indicates higher frequency of occurrence. The keywords with the highest frequency are “classification”, “cancer”, “lymphoma”, “diagnosis”, “survival”, and “expression”. We performed co-occurrence analysis of keywords to generate five clusters (Figure 6B). The node size is determined by the frequency of keyword occurrence. The red cluster focuses on the biological and clinical features of malignant tumors, involving a total of 45 keywords. Representative keywords include “gene-expression”, “survival”, “prognostic factors”, “primary-cns lymphoma”, and “B-cell lymphoma”. The green cluster focuses on imaging and tumor heterogeneity research and involves a total of 30 keywords, including “metabolic tumor volume”, “MRI”, “PET/CT”, “heterogeneity”, and “parameters”. The blue cluster focuses on methodology and AI tool development, consisting of 25 keywords, including “classification”, “prediction”, “neural networks”, “algorithm”, “validation”, and “segmentation”. The yellow and purple clusters focus mainly on the management of tumor diagnosis and treatment, each comprising 21 keywords.
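The preprocessing described above (merging synonyms, then keeping keywords that meet a minimum occurrence threshold) can be sketched as follows. The synonym map contains only the example given in the text, the toy corpus is hypothetical, and counting each keyword once per document is an assumption about how occurrences are tallied.

```python
from collections import Counter

# Synonym map taken from the example in the text; a real analysis would
# list more variants.
SYNONYMS = {
    "f-18-fdg pet": "fdg-pet/ct",
    "f-18-fdg pet/ct": "fdg-pet/ct",
    "fdg-pet": "fdg-pet/ct",
}


def normalize(keyword):
    """Lower-case, trim, and map known synonyms to one canonical form."""
    key = keyword.strip().lower()
    return SYNONYMS.get(key, key)


def frequent_keywords(documents, min_occurrences=5):
    """Count each keyword once per document; keep those at or above the threshold."""
    counts = Counter()
    for keywords in documents:
        counts.update({normalize(k) for k in keywords})
    return {k: n for k, n in counts.items() if n >= min_occurrences}


# Hypothetical toy corpus: each inner list holds one document's keywords.
docs = [["FDG-PET", "lymphoma"], ["F-18-FDG PET/CT", "lymphoma"],
        ["f-18-fdg pet", "survival"], ["lymphoma", "fdg-pet/ct"]]
print(frequent_keywords(docs, min_occurrences=3))
```

Normalizing before counting matters: without the synonym map, the four spellings of the PET/CT keyword would each fall below the threshold and drop out of the cluster analysis.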

Figure 6.

Keyword analysis of artificial intelligence in lymphoma research. (A) Keyword density visualization map: different colors represent the frequency of keyword occurrences, with red indicating the highest frequency. (B) Cluster map based on keyword analysis: the size of the circles indicates the frequency of keyword occurrences, and different colors represent the types of clusters. (C) Trend of different research topics over time: the frequency of each topic is indicated by horizontal lines and dots, with the size of the dots reflecting the frequency of the topic in specific years. (D) Distribution of research topics based on density and centrality: the higher the relevance degree, the closer the topics are positioned to the right side. The greater the development degree, the closer the topics are positioned towards the top.

We further used Biblioshiny in R to generate the Trend Topics Map and the Thematic Map (Figure 6C and D), which provided us with a comprehensive view of research hotspots. Figure 6C is a Trend Topics Map, illustrating the most important research topics for each year. The size of the circle represents the frequency of research keywords, and the blue line indicates the time period during which these keywords frequently appear. Important keywords for the past three years include: “classification”, “cancer”, “lymphoma”, “diagnosis”, “survival” in 2022; “images”, “therapy”, “chemotherapy”, “segmentation”, “response”, “assessment” in 2023; and “assessment”, “radiomics”, “hodgkin-lymphoma”, “dlbcl”, “artificial intelligence”, “recommendations” in 2024.

The Thematic Map is commonly used to analyze the development degree and relevance of different research themes within a research field, as shown in Figure 6D. The upper-right quadrant contains Motor Themes, which represent core research hotspots. These themes focus on survival prediction in B-cell lymphoma, gene/protein expression profiling, metabolic tumor volume, and response assessment in NHL. The upper-left quadrant includes Niche Themes, which are well-developed (high development degree) but weakly connected to other themes. These specialized themes focus mainly on oncogenes (e.g., MYC) and chemotherapy regimens (e.g., rituximab, cyclophosphamide, vincristine). This direction is highly specialized, primarily targeting the pathological mechanisms or treatment response of lymphoma. The lower-left quadrant represents Emerging or Declining Themes, encompassing detailed response criteria and imaging studies of central nervous system lymphoma. These topics currently have limited literature and weak connections to other areas, suggesting they are either emerging new fronts or traditional approaches being phased out, which require further investigation by specific experts. The lower-right quadrant comprises Basic Themes, primarily involving the use of AI to extract features from medical imaging (e.g., PET-CT, MRI) for lymphoma classification. Although current research density in this area is relatively low, its high relevance suggests its potential to serve as a foundation for future research. In summary, current studies of AI in lymphoma are primarily based on imaging analysis. They center on survival prediction and molecular expression as core research themes while integrating specialized domains such as key oncogenes and clinical treatments, thereby forming a systematic research framework.
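The quadrant logic of the Thematic Map described above can be expressed as a simple classification on two scores: centrality (relevance, horizontal axis) and density (development degree, vertical axis). In the sketch below, the cut points and the example scores are hypothetical; the sketch illustrates the quadrant rule only and is not the bibliometrix implementation.

```python
def classify_theme(centrality, density, centrality_cut, density_cut):
    """Assign a theme to a thematic-map quadrant.

    centrality: relevance of the theme (x-axis)
    density: development degree of the theme (y-axis)
    """
    high_relevance = centrality >= centrality_cut
    high_development = density >= density_cut
    if high_relevance and high_development:
        return "Motor Theme"                  # upper-right
    if high_development:
        return "Niche Theme"                  # upper-left
    if high_relevance:
        return "Basic Theme"                  # lower-right
    return "Emerging or Declining Theme"      # lower-left


# Hypothetical (centrality, density) scores; cut points assumed at 0.5.
themes = {"survival prediction": (0.8, 0.9), "MYC": (0.2, 0.7),
          "PET-CT classification": (0.9, 0.3), "response criteria": (0.1, 0.2)}
for name, (c, d) in themes.items():
    print(name, "->", classify_theme(c, d, centrality_cut=0.5, density_cut=0.5))
```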

To further explore the developmental relationships of keyword clusters over time, we performed a cluster timeline analysis by using CiteSpace. As illustrated in Figure 7A, on the left is the timeline evolution of keywords, where terms such as “B-cell lymphoma”, “gene expression”, “machine learning”, and “classification” have gradually evolved over time into “deep learning”. This shift reflects a transition from classification and gene expression studies of lymphoma to the clinical applications of AI in radiological and pathological images. On the right, each cluster number represents a research theme cluster, with smaller numbers indicating larger clusters. The research clusters range from “risk evaluation” and “using radiomics approach” at the top to “molecular diagnosis”, “Hodgkin lymphoma treatment”, and “identifying free-text feature” at the bottom. This indicates that the research hotspots in this field are centered around radiomics and DL, with increasing application in clinical NLP and molecular diagnostics studies.

Figure 7.

Timeline analysis of keyword clustering and the top 40 burst keywords. (A) Timeline distribution of keyword clusters. Each cluster is marked according to the year it first appeared, with different research topics indicated by lines of different colors. Keywords that appear frequently together are displayed on the timeline. (B) Top 40 keywords with the highest burstiness in this research field. The burstiness of each keyword is represented by the length and saturation of the color of the line. Longer and deeper lines indicate keywords with high burstiness, representing a sharp increase of interest in these keywords in this period.

Keyword burstiness refers to a phenomenon in bibliometric studies where the frequency of certain keywords significantly increases within a specific time period. This burst of keywords reflects a sharp rise in attention toward particular research topics. The core idea is to identify the “turning point” (i.e., the onset of the burst) and the “duration” of certain keywords through a burst-detection algorithm. As shown in Figure 7B, early burst keywords such as “artificial neural networks” (2010–2018), “support vector machine” (2011–2019), and “microarray data” (2012–2015) were centered on the establishment and citation of traditional AI methodologies. In contrast, recent burst keywords such as “positron emission tomography” (2021–2024), “prognostic value” (2021–2022), and “metabolic tumor volume” (2020–2024) focus on the concrete applications of AI in clinical lymphoma research. Therefore, research in the coming years is likely to continue to revolve around these keywords.

Reference analysis and journal overlay map

To analyze the references of all literature published in this research area, we utilized CiteSpace to identify the top 20 references with the highest citation bursts. These references were sorted by time periods, as shown in Figure 8A. The early citation bursts of Aerts et al.²² and Kickingereder et al.²³ indicate that these studies received considerable recognition among peers and laid the groundwork for the application of AI in radiomics and lymphoma research. Swerdlow SH (“The 2016 revision of the World Health Organization classification of lymphoid neoplasms,” 2016),²⁴ with the highest burst strength of 7.94, significantly impacted clinical diagnostics through its involvement in WHO classification criteria for lymphoma. More recent bursts, such as Zwanenburg et al.,²⁵ reflect current trends in AI-driven lymphoma research moving toward multimodal data integration and genomics.

Figure 8.
The burstiness analysis of cited references and the journal dual-map overlay analysis. (A) The top 20 references related to artificial intelligence in lymphoma research with the strongest citation bursts. The burstiness of each reference is indicated by the length and color saturation of its line. Longer, more saturated lines represent references with high burstiness, indicating a sharp increase in interest during that period. (B) The dual-map overlay visualizes the distribution of articles on artificial intelligence in lymphoma research, superimposing the relationships between citing and cited documents. The left part represents citing journals, while the right part represents cited journals, with curves indicating citation relationships.

The journal overlay map is often used to describe the distribution patterns of citing and cited journals and to show the direction of knowledge flow between them, as shown in Figure 8B. The z-value represents the normalized significance of a discipline, with a higher z-value indicating that the discipline is more likely to be a core research area within the field. The f-value indicates the frequency of occurrence of a term within the field, with a higher f-value suggesting that the term is discussed more frequently. There are three main directions of knowledge flow: “medicine, medical, clinical” to “molecular, biology, genetics” (z = 6.18; f = 1623); “medicine, medical, clinical” to “health, nursing, medicine” (z = 5.38; f = 1422); and “molecular, biology, immunology” to “molecular, biology, genetics” (z = 4.14; f = 1112). It can be concluded that, in this research field, genomics (i.e., molecular, genetics) and clinical research (i.e., medical, clinical) are merging and citing each other. Future research in this field should therefore build on cross-learning and mutual validation between these two core disciplines.
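
As a rough illustration of how a raw frequency (f) relates to a normalized score (z), the sketch below applies ordinary z-score standardization to a set of citation-link counts. This is an assumption for illustration only: CiteSpace's exact normalization may differ, the function name `z_scores` is hypothetical, and the counts are invented rather than taken from Figure 8B.

```python
from statistics import mean, pstdev


def z_scores(frequencies):
    """Standardize raw frequencies: z = (f - mean) / standard deviation."""
    mu = mean(frequencies)
    sigma = pstdev(frequencies)  # population standard deviation
    if sigma == 0:
        # All frequencies are equal, so no link stands out.
        return [0.0 for _ in frequencies]
    return [(f - mu) / sigma for f in frequencies]


# Hypothetical citation-link counts between discipline pairs.
link_counts = {
    "discipline A -> discipline B": 30,
    "discipline A -> discipline C": 20,
    "discipline D -> discipline B": 10,
}
for link, z in zip(link_counts, z_scores(list(link_counts.values()))):
    print(f"{link}: f = {link_counts[link]}, z = {z:.2f}")
```

Under this reading, a link with a high z-value carries many more citations than the typical link in the map, which is why the strongest trails are the ones highlighted.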

Discussion

General information

In this study, we conducted a bibliometric analysis of 662 studies related to the application of AI in lymphoma. The data were retrieved from the WoSCC database, covering the period from 2010 to 2024, and several bibliometric tools were used, including Microsoft Excel, VOSviewer, CiteSpace, and the bibliometrix package in R. Global literature in this field has been increasing over the past 15 years. With 166 articles published in 2024, the NP is expected to rise further. The analysis shows that the paper by Andrew Janowczyk, published in 2016, has received the highest number of citations on this topic, and its average annual citations also rank first. China tops the list with 196 TP, far exceeding the United States’ 115, followed by Germany and France. This result is largely consistent with the regional distribution of global AI research, highlighting the dominance of these two countries. In terms of TC, however, the United States (2415) is significantly ahead of China (1688), and there is also a large gap in the average citations per article. This suggests that China needs to shift its focus from quantity to quality and increase its output of high-quality articles. In terms of author contributions, Prof. Jiang Huiyan of Northeastern University leads in publications and H-index. Her work focuses on developing specialized neural network architectures for radiological images in lymphoma and is mainly published in engineering and technology journals, highlighting the interdisciplinary character of this research field. Prof. Michel Meignan from France leads in LCS; his research focuses on evaluating the clinical efficacy of AI tools. His research often involves multicenter trials, and his works are cited more frequently than purely technical or purely medical papers.
As for institutions, the intensity of collaboration between Chinese institutions (e.g., Chinese Academy of Sciences and Northeastern University, TLS = 81) far exceeds their cross-national collaboration, while U.S. institutions (e.g., MD Anderson Cancer Center and Stanford University, TLS = 45) are more inclined toward international collaboration. This difference may be attributed to China's abundant domestic lymphoma cases and linguistic differences, while the United States pays more attention to global academic work, probably because of its language advantage and academic tradition. The proportion of papers on AI in lymphoma research published in high-impact journals is still relatively low, which may be attributable to the interdisciplinary nature of the field and its relatively early stage of development.
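
The citation indicators discussed above can be made concrete with a small sketch. It assumes that average citations per article is simply TC divided by TP, using the country totals reported in this section; the article's own tables may count differently. The helper names and the per-paper citation list passed to `h_index` are hypothetical.

```python
def average_citations(total_citations, total_publications):
    """Average citations per article, assumed here to be TC / TP."""
    if total_publications <= 0:
        raise ValueError("total_publications must be positive")
    return total_citations / total_publications


def h_index(citations):
    """Largest h such that h papers have at least h citations each (Hirsch)."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Country totals (TP, TC) as reported in the text above.
reported = {"China": (196, 1688), "United States": (115, 2415)}
for country, (tp, tc) in reported.items():
    print(f"{country}: {average_citations(tc, tp):.1f} citations per article")

# Hypothetical citation counts for one author's papers.
print("H-index:", h_index([10, 8, 5, 4, 3]))
```

Under this assumption, the gap works out to roughly 21.0 citations per article for the United States versus about 8.6 for China.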

Artificial intelligence in the diagnosis, treatment, and prognosis of lymphoma

Artificial intelligence, especially image recognition, plays an important role in the radiological and pathologic diagnosis of lymphoma.²⁶ Researchers have used AI models to enhance tumor discrimination accuracy in radiological images (e.g., PET/CT and MRI). Researchers have also integrated AI with immunohistochemical images for lymphoma diagnosis and classification. These AI-driven studies enable automated detection of microlesions, which would reduce errors in interpreting these medical images. Sun et al. applied DL algorithms to analyze MRI and PET-CT images of diffuse large B-cell lymphoma. Their approach enabled the automatic detection of lymph node enlargement, tumor infiltration, and other radiological changes.²⁷ Deep learning algorithms have also been used to analyze MRI images of primary central nervous system lymphoma, focusing on automatic identification and segmentation of tumor regions.²⁸ The European Association of Nuclear Medicine has been working on MRI segmentation technology. They used a convolutional neural network to accurately locate tumors (error < 0.5 mm), which would assist in the planning of radiotherapy target areas.²⁹ A study led by Hiroaki Miyoshi investigated DL algorithms for the automated analysis of lymphoma histopathological slides, including diffuse large B-cell lymphoma, follicular lymphoma, and reactive lymphoid hyperplasia. Compared with the diagnoses of hematopathologists, the AI model achieved accuracies of 94.0%, 93.0%, and 92.0%, respectively.³⁰ In another study, an accurate DL platform was established to classify pathologic images for diffuse large B-cell lymphoma (DLBCL). This platform analyzed DLBCL and non-DLBCL pathologic images from three hospitals separately using AI models and achieved a diagnostic accuracy of nearly 100%.³¹

Artificial intelligence also holds tremendous potential in predicting treatment responses. By integrating radiological images, genomic features, and gut microbiome data, researchers have developed predictive models to assess the response to treatments such as chemotherapy and CAR-T therapy, which would guide personalized treatment. Furthermore, simulating drug metabolic pathways, analyzing drug combination effects, and applying NLP-based real-time analysis of electronic medical records make dynamic treatment optimization possible. A study by Tran et al. demonstrated that ML can predict the response of lymphoma patients to chemotherapy with 80% accuracy. Using the Support Vector Machine algorithm, they constructed a mathematical model that integrated two imaging methods (computed tomography and bioluminescence imaging). This model was designed to predict when targeted therapy would induce oncogene addiction following the inactivation of oncogenes such as K-Ras and MYC, leading to tumor regression in lung cancer and lymphoma.³² Carreras et al. analyzed a series of mantle cell lymphoma (MCL) cases by combining artificial neural networks, radial basis function networks, gene set enrichment analysis, and conventional statistics. Through dimensionality reduction, they identified prognostic genes that predicted overall survival and pointed to new biomarkers for improved treatment.³³ Some clinicians applied ML to analyze fragmentation variations of ctDNA in large B-cell lymphoma and CNS lymphoma, respectively. The ctDNA samples were collected during lymphoma diagnosis and treatment, and the results could guide noninvasive treatment decisions and risk stratification.³⁴,³⁵
In a large cohort study of B-cell lymphoma patients from five centers in Germany and the United States, machine learning was applied to microbiome data collected before CAR-T treatment, validating that the gut microbiome modulates the efficacy of lymphoma immunotherapy.³⁶ Some scientists discussed the application of NLP in medical record analysis, where it was used to effectively extract information on patient histories, symptoms, and treatment outcomes.³⁷ Such a clinical decision support system would assist in prevention and care by leveraging precision medicine. Moreover, NLP and transformer-based models have been used to fill in structured reports (SRs) automatically. The clinical and radiological data of lymphoma patients were collected to form CT-based SRs, providing real-time updates on patient treatment responses and allowing flexible adjustment of treatment plans.³⁸ Meanwhile, a multi-interaction system was constructed to evaluate the efficacy of drug combinations, revealing the effectiveness of the anaplastic lymphoma kinase inhibitor crizotinib combined with the proteasome inhibitor bortezomib in lymphoma.³⁹

Developing effective predictive models is a critical part of assessing lymphoma prognosis.⁴⁰ Researchers have developed AI models that integrate medical images, genetic data, and immune microenvironment characteristics, which significantly improved the accuracy of lymphoma prognosis predictions. Furthermore, researchers have employed AI-based dynamic risk stratification systems to identify high-risk patients, thereby enhancing the progression-free survival rate. Some scientists have investigated the prognostic value of radiomic features of glucose metabolism on 18F-FDG PET/CT in MCL. By integrating a multilayer perceptron neural network with clinical parameters, radiological characteristics, and molecular data, they developed a more accurate prognostic model.⁴¹ Artificial intelligence has often been used to integrate patients’ clinical information with gene expression data. Some biomarkers were identified as being associated with the prognosis of diffuse large B-cell lymphoma, thereby significantly enhancing the predictive accuracy for patient survival risk.⁴² After analyzing large volumes of data from oncology patients (such as clinical, imaging, and genomic data) at Hospital Universitario Puerta de Hierro-Majadahonda, scientists identified potential prognostic factors and developed a prognostic model that stratified different cancer patients based on their profiles.⁴³ With the help of AI, researchers found that specific gene mutations (such as TP53 and MYC) are closely associated with the prognosis of lymphoma. Additionally, the infiltration of immune cells within the tumor microenvironment also affects lymphoma progression and treatment response.⁴⁴,⁴⁵

Limitations and future direction

Although our study offers a comprehensive discussion of AI in lymphoma research, some limitations remain. First, given the complicated classification of lymphoma, our study did not discuss AI in different subtypes separately. Second, we included only English articles from the WoSCC database and did not include articles in other languages or from other databases, which may have led to the omission of some important technological advancements. Third, because of the limitations of bibliometric tools, we were unable to fully illustrate complex networks involving multiple variables, such as the dynamic changes in collaboration over time. Based on this bibliometric analysis, future directions and potential hotspots in this field may include the following:
− Integrating AI technology into lymphoma genomics and clinical research.

− Exploring AI in medical records and patient management of lymphoma.

− Assessing the generalizability of medical AI models through multicenter clinical trials involving lymphoma patients.

Conclusion

The current application of AI in lymphoma research has entered a new phase of clinical integration. Artificial intelligence is commonly used in the diagnosis, treatment, and prognosis of lymphoma. However, there is a gap between academic output and its clinical application. Further innovations driven by clinical needs will be the key to overcoming the current bottleneck in this field. The study comprehensively analyzes current hotspots and trends of AI research in lymphoma, which is developing from pure radiomics analysis toward molecular mechanisms and AI-optimized chemotherapy. In the future, it will be necessary to address practical clinical demands and promote the integration of AI.

Supplemental Material

Supplemental material, sj-docx-1-dhj-10.1177_20552076261416383 for Research hotspots and trends of artificial intelligence in lymphoma: A bibliometric analysis from 2010 to 2024 by Haixin Mao, Qin Zhang, Dan Wan, Yujie Lu and Yutao Zhang in DIGITAL HEALTH

Footnotes

ORCID iDs

Haixin Mao

Qin Zhang

Dan Wan

Yujie Lu

Yutao Zhang

Ethical approval

Not applicable. This is a bibliometric study. The Research Ethics Committee of the First People's Hospital of Zigong has confirmed that no ethical approval is required.

Contributorship

MHX and ZQ conceived and designed the research. MHX, ZQ, and LYJ searched and analyzed the data. MHX and ZQ wrote the manuscript. ZYT and WD reviewed the manuscript. All authors contributed to the study conception and design. All authors reviewed and commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Supplemental material

Supplemental material for this article is available online.

References

1. Xie Y, Pittaluga S, Jaffe ES. The histological classification of diffuse large B-cell lymphomas. Semin Hematol 2015; 52: 57–66.

2. Willemze R, Cerroni L, Kempf W, et al. The 2018 update of the WHO-EORTC classification for primary cutaneous lymphomas. Blood 2019; 133: 1703–1714.

3. Liu W, Liu J, Song Y, et al. Burden of lymphoma in China, 2006-2016: an analysis of the Global Burden of Disease Study 2016. J Hematol Oncol 2019; 12: 15.

4. Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2024; 74: 229–263.

5. Kim S-Y, Chung HW, So Y, et al. Recent updates of PET in lymphoma: FDG and beyond. Biomedicines 2024; 12: 2485.

6. Srinidhi CL, Ciga O, Martel AL. Deep neural network models for computational histopathology: a survey. Med Image Anal 2021; 67: 101813.

7. Hasanabadi S, Aghamiri SMR, Abin AA, et al. Enhancing lymphoma diagnosis, treatment, and follow-up using 18F-FDG PET/CT imaging: contribution of artificial intelligence and radiomics analysis. Cancers (Basel) 2024; 16: 3511.

8. Steen CB, Luca BA, Esfahani MS, et al. The landscape of tumor cell states and ecosystems in diffuse large B cell lymphoma. Cancer Cell 2021; 39: 1422–1437.e10.

9. Hubbard-Perez M, Luchian A, Milford C, et al. Use of deep learning for the classification of hyperplastic lymph node and common subtypes of canine lymphomas: a preliminary study. Front Vet Sci 2023; 10: 1309877.

10. Passamonti F, Corrao G, Castellani G, et al. The future of research in hematology: integration of conventional studies with real-world data and artificial intelligence. Blood Rev 2022; 54: 100914.

11. Khalil GM, Gotway Crawford CA. A bibliometric analysis of U.S.-based research on the Behavioral Risk Factor Surveillance System. Am J Prev Med 2015; 48: 50–57.

12. Ninkov A, Frank JR, Maggio LA. Bibliometrics: methods for studying academic publishing. Perspect Med Educ 2022; 11: 173–176.

13. Pan XL, Yan EJ, Cui M, et al. Examining the usage, citation, and diffusion patterns of bibliometric mapping software: a comparative study of three tools. J Informetr 2018; 12: 481–493.

14. Shamsi A, Silva RC, Wang T, et al. A grey zone for bibliometrics: publications indexed in Web of Science as anonymous. Scientometrics 2022; 127: 5989–6009.

15. Aria M, Cuccurullo C. Bibliometrix: an R-tool for comprehensive science mapping analysis. J Informetr 2017; 11: 959–975.

16. Arruda H, Silva ER, Lessa M, et al. VOSviewer and bibliometrix. J Med Libr Assoc 2022; 110: 392–395. doi: 10.5195/jmla.2022.1434

17. Chen C, Song M. Visualizing a field of research: a methodology of systematic scientometric reviews. PLoS One 2019; 14: e0223994.

18. van Eck NJ, Waltman L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 2010; 84: 523–538.

19. Hirsch JE. An index to quantify an individual's scientific research output. Proc Natl Acad Sci 2005; 102: 16569–16572. doi: 10.1073/pnas.0507655102

20. Luukkonen T, Tijssen RJW, Persson O, et al. The measurement of international scientific collaboration. Scientometrics 1993; 28: 15–36.

21. Mingers J, Yang L. Evaluating journal quality: a review of journal citation indicators and ranking in business and management. Eur J Oper Res 2017; 257: 323–337.

22. Aerts HJ, Velazquez ER, Leijenaar RT, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 2014; 5: 4006.

23. Kickingereder P, Wiestler B, Sahm F, et al. Primary central nervous system lymphoma and atypical glioblastoma: multiparametric differentiation by using diffusion-, perfusion-, and susceptibility-weighted MR imaging. Radiology 2014; 272: 843–850.

24. Swerdlow SH, Campo E, Pileri SA, et al. The 2016 revision of the World Health Organization classification of lymphoid neoplasms. Blood 2016; 127: 2375–2390.

25. Zwanenburg A, Vallières M, Abdalah MA, et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology 2020; 295: 328–338.

26. Zhang Y, Deng Y, Zou Q, et al. Artificial intelligence for diagnosis and prognosis prediction of natural killer/T cell lymphoma using magnetic resonance imaging. Cell Rep Med 2024; 5: 101551.

27. Sun Z, Yang T, Ding C, et al. Clinical scoring systems, molecular subtypes and baseline [(18)F]FDG PET/CT image analysis for prognosis of diffuse large B-cell lymphoma. Cancer Imaging 2024; 24: 68.

28. Ko C-C, Liu Y-L, Hung K-C, et al. MRI-based machine learning for prediction of clinical outcomes in primary central nervous system lymphoma. Life 2024; 14: 1290.

29. Nanni C, Kobe C, Baeßler B, et al. European Association of Nuclear Medicine (EANM) focus 4 consensus recommendations: molecular imaging and therapy in haematological tumours. Lancet Haematol 2023; 10: e367–e381.

30. Miyoshi H, Sato K, Kabeya Y, et al. Deep learning shows the capability of high-level computer-aided diagnosis in malignant lymphoma. Lab Invest 2020; 100: 1300–1310.

31. Li D, Bledsoe JR, Zeng Y, et al. A deep learning diagnostic platform for diffuse large B-cell lymphoma with high accuracy across multiple hospitals. Nat Commun 2020; 11: 6004.

32. Tran PT, Bendapudi PK, Lin HJ, et al. Survival and death signals can predict tumor response to therapy after oncogene inactivation. Sci Transl Med 2011; 3: 103ra99.

33. Carreras J, Nakamura N, Hamoudi R. Artificial intelligence analysis of gene expression predicted the overall survival of mantle cell lymphoma and a large pan-cancer series. Healthcare 2022; 10: 55.

34. Meriranta L, Alkodsi A, Pasanen A, et al. Molecular features encoded in the ctDNA reveal heterogeneity and predict outcome in high-risk aggressive B-cell lymphoma. Blood 2022; 139: 1863–1877.

35. Mutter JA, Alig SK, Esfahani MS, et al. Circulating tumor DNA profiling for detection, risk stratification, and classification of brain lymphomas. J Clin Oncol 2023; 41: 1684–1694.

36. Stein-Thoeringer CK, Saini NY, Zamir E, et al. A non-antibiotic-disrupted gut microbiome is associated with clinical responses to CD19-CAR-T cell cancer immunotherapy. Nat Med 2023; 29: 906–916.

37. Eloranta S, Boman M. Predictive models for clinical decision making: deep dives in practical machine learning. J Intern Med 2022; 292: 278–295.

38. Bergomi L, Buonocore TM, Antonazzo P, et al. Reshaping free-text radiology notes into structured reports with generative question answering transformers. Artif Intell Med 2024; 154: 102924.

39. Julkunen H, Cichonska A, Gautam P, et al. Leveraging multi-way interactions for systematic prediction of pre-clinical drug combination effects. Nat Commun 2020; 11: 6136.

40. Schmitz N, Trümper L, Ziepert M, et al. Treatment and prognosis of mature T-cell and NK-cell lymphoma: an analysis of patients with T-cell lymphoma treated in studies of the German High-Grade Non-Hodgkin Lymphoma Study Group. Blood 2010; 116: 3418–3425.

41. Mayerhoefer ME, Riedl CC, Kumar A, et al. Radiomic features of glucose metabolism enable prediction of outcome in mantle cell lymphoma. Eur J Nucl Med Mol Imaging 2019; 46: 2760–2769.

42. Merdan S, Subramanian K, Ayer T, et al. Gene expression profiling-based risk prediction and profiles of immune infiltration in diffuse large B-cell lymphoma. Blood Cancer J 2021; 11: 2.

43. Torrente M, Sousa PA, Hernández R, et al. An artificial intelligence-based tool for data analysis and prognosis in cancer patients: results from the Clarify study. Cancers (Basel) 2022; 14: 4041.

44. İsmail Mendi B, Şanlı H, Insel MA, et al. Predicting prognosis of early-stage mycosis fungoides with utilization of machine learning. Life 2024; 14: 1371.

45. Hua W, Liu J, Li Y, et al. Revealing the heterogeneity of treatment resistance in less-defined subtype diffuse large B cell lymphoma patients by integrating programmed cell death patterns and liquid biopsy. Clin Transl Med 2025; 15: e70150.
