Abstract
Information on the application of artificial intelligence (AI) in healthcare is needed to align healthcare transformation efforts. This bibliometric analysis aims to establish the patterns of publication activities on the application of AI in health. A total of 1083 scholarly papers published between 1993 and 2023 were retrieved from the Web of Science and Scopus databases. RStudio and VOSviewer were applied to quantify and illustrate publication patterns and citation rates. Publication output grew at an average rate of 13% per year, with each document cited an average of 12 times. The articles had a mean of five co-authors, with an international co-authorship rate of 10%. COVID-19, artificial intelligence, and machine learning dominated the publications. The US, China, UK, Canada, and India coordinated most of the collaborative research. AI-based health information research is growing steadily. International collaborations can be leveraged to ensure the spread and interoperability of AI-based healthcare innovations globally.
Keywords
Introduction
Internet health information sources are increasingly becoming integral in decision-making on physical and mental health. The presence of large amounts of inaccurate information online has necessitated the development of systems to evaluate the reliability of internet health information. 1 Informaticists are applying machine learning or knowledge-based techniques to distinguish between trustworthy information sources and implausible materials. 2 Artificial intelligence (AI) models are transforming evidence-based decision-making in healthcare organizations since they provide faster, verifiable, and richer insights. 3 Therefore, AI is integral in the transformation of the healthcare system.
Artificial intelligence presents a paradigm shift in the approach to diagnostics, risk assessment, health information management, lifestyle monitoring, and virtual health aid. 4 The growing market for AI technology service providers has accelerated the convergence of natural language processing, blockchain analysis, and other advanced technologies. 5 Thus, the healthcare industry is a hub for AI applications.
Integrating AI into the healthcare system leverages vast data from electronic health records and enhanced computational capabilities.5–7 AI-powered advanced digital health solutions enhance healthcare by facilitating precise diagnoses, automating prescriptions, and predicting diseases. 8 For example, the applications of AI and machine learning (ML) to echocardiography while capitalizing on health information databases can improve the classification and treatment of many cardiac conditions.9,10 Although several European nations have included data linking into their regular public health initiatives, AI usage remains low. 10 Evidence-based health policy requires both an efficient data governance framework and a strong national health information system. 11 AI offers an opportunity to leverage data to modernize and streamline healthcare.
Establishing the state of research activities on the application of AI in healthcare can provide insights into the aspects that need heightened activity. This bibliometric analysis compares Web of Science (WOS) and Scopus regarding the volume of research output, research growth trends, and research hotspots on the application of AI in health. It also explores key players, international collaboration networks, and prestigious journals at the center of AI-driven healthcare informatics based on the publications indexed in WOS and Scopus. It provides a nuanced perspective on the significant complexities, challenges, and obstacles associated with AI-driven solutions. Academics, policymakers, and healthcare professionals working on integrating AI in healthcare while protecting patient privacy and security can harness the insights in this study to inform their prioritization of activities.
Methodology
Study design
A bibliometric analysis methodology was applied in this study. Massive objective bibliometric data from two databases were quantitatively analyzed using bibliometric software to provide a nuanced summary on publication activities on artificial intelligence and health informatics.
Data collection
Inclusion and exclusion criteria
Articles accessible through Scopus or WOS, which are leading databases for academic research articles and offer rich citation data, were identified. Only articles published between 1993 and 2023 and indexed in English were included. The “Artificial Intelligence” and “Health Information” keywords had to appear in the title, abstract, or keywords for an article to be considered. Documents not available online were excluded.
Search strategy
A methodical search was developed for each of the Scopus and WOS databases using the “Health Information” AND “Artificial Intelligence” search terms while applying the inclusion and exclusion criteria. The search methodology was adjusted as needed to accommodate differences in syntax and indexing and to provide uniformity across the two databases. The Boolean operators “AND” and “OR” and the asterisk wildcard were used to make the search more comprehensive.
The Search Query for Scopus was TITLE-ABS-KEY (“Artificial Intelligence” AND “Health Information”) AND PUBYEAR >1993 AND PUBYEAR <2023 AND (EXCLUDE (SUBJAREA, “CENG”) OR EXCLUDE (SUBJAREA, “BUSI”) OR EXCLUDE (SUBJAREA,“ENER”) OR EXCLUDE (SUBJAREA, “ECON”) OR EXCLUDE (SUBJAREA, “ENVI”) OR EXCLUDE (“SUBJAREA”, “PHYS”) OR EXCLUDE (SUBJAREA, “Undefined”) OR EXCLUDE (SUBJAREA, “AGRI”)).
The Search Query for WOS was “Artificial Intelligence” AND “Health Information” (All Fields) and Construction Building Technology or Energy Fuels or Government Law or History Philosophy Of Science or Sociology or Linguistics or Art (Exclude – Research Areas) and Engineering Chemical or Environmental Studies or Green Sustainable Science Technology or Business or Engineering Industrial (Exclude – Web of Science Categories) and ENVIRONMENTAL RESEARCH or ENTROPY (Exclude – Publication Titles). Before the extraction of datasets from Web of Science, 43 articles that did not include the “Artificial Intelligence” and “Health Information” search terms in their title, abstract, and keywords were excluded.
Data extraction and preprocessing
For each article, the following bibliographic information was extracted: title, author names, affiliations of the authors, abstract, publication date, and keywords. The two authors independently and subjectively assessed the comprehensiveness and consistency of the metrics data extracted from the retrieved manuscripts to ascertain relevance and appropriateness. Duplicates were identified and eliminated. Other records were excluded for lacking the metrics needed in the analysis.
Statistical analysis
Bibliometric analysis was done using the VOSviewer Software (Version 1.6.20, Harzing’s Publish or Perish: 8.9.4538.8589 2023.07.07.1629) by building networks and visualizing bibliometric information to produce co-authorship networks, co-citation networks, and keyword co-occurrence networks. Research themes and clusters were identified and visualized. Clusters were labeled to reflect the prominent research topics and subdomains in the corpus.
RStudio (R version 4.3.3; RStudio 2023.12.0+369 “Ocean Storm” release) facilitated data pretreatment, the computation of bibliometric indicators, and the production of visual representations of the analyzed data. It computed collaboration indices, h-indices, and citation count metrics, which allowed for a comprehensive evaluation of the productivity and impact of research, authors, and journals.
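As an illustration of one of the indicators named above, the h-index of an author is the largest number h such that h of the author's papers have at least h citations each. The following is a minimal Python sketch of that definition, not the routine used in this study; the citation counts are hypothetical.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one author (not taken from the study data).
example_citations = [25, 8, 5, 3, 3, 1, 0]
print(h_index(example_citations))  # prints 3
```

Three papers have at least three citations each, but the fourth-ranked paper has only three citations, so the h-index of this hypothetical author is 3.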
A custom R script was used in RStudio to merge datasets from Scopus and WOS. The bibliometrix package in R was applied to craft a detailed tag cloud. By retrieving titles, abstracts, and keywords, a comprehensive dataset was generated, which was then cleaned to remove duplicates and standardize the text. Through the biblioAnalysis function, we conducted an in-depth bibliometric analysis. The wordcloud function allowed the visualization of the frequency of each term, with larger fonts denoting higher frequencies.
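The merging and duplicate-removal step can be pictured with a short sketch. The Python code below is not the custom R script used in this study; it assumes, for illustration only, that each record is a dictionary with hypothetical `title` and `doi` fields and that two records are duplicates when they share a DOI or a normalized title.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_records(scopus, wos):
    """Merge two record lists, keeping the first record seen for each DOI or title."""
    seen_keys = set()
    merged = []
    for record in scopus + wos:
        doi = (record.get("doi") or "").strip().lower()
        keys = {("title", normalize_title(record["title"]))}
        if doi:
            keys.add(("doi", doi))
        if keys & seen_keys:
            continue  # an equivalent record was already kept
        seen_keys |= keys
        merged.append(record)
    return merged


# Hypothetical records (field names and values are assumptions for this sketch).
scopus_records = [
    {"title": "AI in Health Information Systems", "doi": "10.1000/example.1"},
    {"title": "Machine Learning for Triage", "doi": ""},
]
wos_records = [
    {"title": "AI in health information systems.", "doi": "10.1000/EXAMPLE.1"},
    {"title": "Blockchain and Health Records", "doi": "10.1000/example.2"},
]
merged = merge_records(scopus_records, wos_records)
print(len(merged))  # prints 3
```

Matching on both DOI and normalized title is a design choice of this sketch: it catches records whose DOI is missing in one database, at the cost of occasionally merging distinct documents that share a title.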
To map international collaboration patterns, VOSviewer was used by uploading datasets extracted from the two databases separately. This software facilitated the creation of a network depicting the collaborative efforts of various countries based on authors’ affiliations. Co-authorship links were highlighted, with thicker lines indicating stronger connections. Each node represented a country, sized according to the number of publications. VOSviewer’s clustering algorithm differentiated collaboration clusters with distinct colors. The final network map, refined for clarity, effectively showcased global collaboration trends within the dataset.
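The idea behind the country collaboration map can be sketched as counting, for every pair of countries, the number of documents on which both appear among the author affiliations. The Python code below is a simplified illustration under that assumption and does not reproduce VOSviewer's own counting or clustering options; the country lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def country_links(documents):
    """Count, for each pair of countries, the documents they co-authored."""
    links = Counter()
    for countries in documents:
        # Each country counts once per document, regardless of author numbers.
        unique = sorted(set(countries))
        for pair in combinations(unique, 2):
            links[pair] += 1
    return links


# Hypothetical affiliation countries per document (not the study data).
docs = [
    ["USA", "China"],
    ["USA", "UK", "China"],
    ["India"],
    ["USA", "USA", "UK"],
]
links = country_links(docs)
for pair, strength in sorted(links.items()):
    print(pair, strength)
```

In a network map built from such counts, a larger count for a pair corresponds to a thicker line between the two country nodes, and single-country documents contribute no links.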
Results
A total of 476 and 826 publications were retrieved from WOS and Scopus databases, respectively. After eliminating 211 duplicates and eight records with incomplete metrics data, 1083 publications including books, journal articles, and other academic publications were selected for bibliometric analysis (Figure 1).
Flow chart showing the selection of the appropriate records from the search results.
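The record flow reported above can be checked with simple arithmetic. The short Python sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text.
wos_retrieved = 476
scopus_retrieved = 826
duplicates_removed = 211
incomplete_removed = 8

total_retrieved = wos_retrieved + scopus_retrieved
included = total_retrieved - duplicates_removed - incomplete_removed

print(total_retrieved)  # prints 1302
print(included)         # prints 1083
```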
Overview of the dataset
Summary of the dataset obtained from retrieved documents.
A total of 4063 authors participated in writing the documents, with 86 of them writing their documents single-handedly (Table 1). An average of five co-authors were involved in writing each document. The international co-authorship rate was 10%. Table 1 presents other summaries of the dataset.
Language of publication
Number of publications in different languages.
Publication trends over 30 years
Both Scopus and WOS had a very low number of publications on AI and health informatics until around 2009. Between 2010 and 2021, the number of relevant articles noticeably increased in both databases. Scopus slightly outpaced WOS in 2010 and 2011. From 2012 onwards, the number of publications in WOS consistently surpassed that in Scopus. Research output then increased exponentially from 2018, peaking around 2021 (Figure 2). Only one 2023 publication was retrieved from WOS in November 2023 when the search was conducted, but Scopus had 112.
Evolution of research publications in Scopus and Web of Science from 1993 to 2023.
Keyword analysis and Co-occurrence
The minimum number of occurrences for a keyword to be included was set at five. Artificial intelligence was the most frequent keyword (Figure 3). Human and decision support system were also relatively common keywords.
Tag cloud showing the frequency of keywords in the retrieved publications.
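The occurrence threshold can be illustrated with a short sketch that counts keywords across documents and keeps those appearing at least five times. The Python code below is an assumption-laden illustration rather than the procedure used by VOSviewer or bibliometrix: it counts each keyword once per document, compares keywords case-insensitively, and uses hypothetical keyword lists.

```python
from collections import Counter


def frequent_keywords(keyword_lists, min_occurrences=5):
    """Return keywords that occur in at least min_occurrences documents."""
    counts = Counter()
    for keywords in keyword_lists:
        # Count each keyword once per document, ignoring case and padding.
        counts.update({k.strip().lower() for k in keywords})
    return {k: n for k, n in counts.items() if n >= min_occurrences}


# Hypothetical keyword lists for six documents (not the study data).
documents = [
    ["Artificial Intelligence", "Telehealth"],
    ["artificial intelligence", "Machine Learning"],
    ["Artificial intelligence"],
    ["Artificial Intelligence", "Telehealth"],
    ["ARTIFICIAL INTELLIGENCE", "Machine Learning"],
    ["Decision Support System"],
]
print(frequent_keywords(documents))  # prints {'artificial intelligence': 5}
```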
Identities, links, appearances, and total link strengths of clusters of linked terms.
Author and institution productivity
Top cited author on AI and health informatics in Scopus and Web of Science.
The WOS data, in contrast, provides a unique collection of authors and their publications, underlining even more the variety of the academic success landscape. For instance, Stone and Zidar coauthored an article with 1208 citations (Table 4).
Prominent research organizations and their Impact on AI and Health informatics: Insights from Web of Science and Scopus.
On the other hand, the Institute of Biomedical and Health Engineering and the Key Laboratory for Health Informatics contributed the highest number of publications retrieved from Scopus (Table 5). The Laboratory of Computer Science, Massachusetts General Hospital also had a highly cited publication, with 254 citations.
Journal and citation analysis
Citation metrics in Scopus and Web of Science (1993-2022).
Overall, Scopus had higher citation metrics in all aspects except cites per paper and authors per paper (Table 6). For instance, documents retrieved from Scopus had 9652 citations, which are 3475 more than WOS’ 6177.
Authors of publications retrieved from Scopus had more publications on average than those in WOS. The papers per author in Scopus were 281, which was 140 more than the 141 papers per author in WOS. Cites per author in Scopus (3367.57) were more than double those in WOS (1655.44).
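The comparisons above follow directly from the reported metrics. The Python sketch below recomputes the differences and the ratio from the figures stated in the text; the variable names are illustrative.

```python
# Metrics as reported in the text (Table 6).
scopus_citations, wos_citations = 9652, 6177
scopus_papers_per_author, wos_papers_per_author = 281, 141
scopus_cites_per_author, wos_cites_per_author = 3367.57, 1655.44

citation_gap = scopus_citations - wos_citations
papers_gap = scopus_papers_per_author - wos_papers_per_author
cites_ratio = scopus_cites_per_author / wos_cites_per_author

print(citation_gap)           # prints 3475
print(papers_gap)             # prints 140
print(round(cites_ratio, 2))  # prints 2.03
```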
Countries production and collaboration networks
The USA and Australia had the highest number of publications in WOS (Figure 4). On the other hand, the USA and China had the highest number of publications in Scopus. Other notable sources of publications include Canada, India, and European countries.
Number of papers produced by countries in artificial intelligence in health information research.
The publications from the US had the most international collaborations (Figure 5). Authors in the US collaborated with authors across the world to produce the publications. Other countries with high frequencies of international collaborations based on the publications in WOS include Australia, England, Spain, and Italy. Publications in Scopus also showed the USA as the topmost coordinator of research collaborations (Figure 5). The publications from the United Kingdom, India, and China also showed high rates of international collaborations.
Cooperation between countries in artificial intelligence in health information research.
Emerged trends
Publications on “artificial intelligence,” “machine learning,” and “medical informatics” grew steadily between 1993 and 2023 (Figure 6). The terms “COVID-19” and “coronavirus disease 2019” emerged in 2021. Concurrently, “telehealth” and “telemedicine” gained prominence. Telehealth and blockchain gained popularity in 2023.
Emerging trends related to the topic “artificial intelligence and health informatics.”
Summary of publication trends in WOS
Selected research articles based on number of citations in Web of Science.
Articles with the highest numbers of citations were commonly published in journals focusing on informatics. Two of the articles featuring in the top 10 list were published in the Journal of the American Medical Informatics Association. The second most cited article was published in BMC Medical Informatics and Decision Making. The seventh article in the list was published in Health Information Science and Systems.
Selected research articles based on number of citations in Scopus.
In Scopus, six articles in the top ten list of the most cited papers were published in journals focusing on informatics. Two of the top 10 articles were published in the Journal of Biomedical Informatics. BMC Medical Informatics and Decision Making, which produced the second most cited article in WOS, also produced the second most cited article in Scopus (Table 8).
Discussion
The topic of artificial intelligence (AI) and health information (HI) integration is fast developing and has the potential to significantly impact patient outcomes, healthcare, and research. This bibliometric analysis has highlighted the changing landscape of health informatics by showing the trends, advancements, and multidisciplinary nature of AI in HI. The rapid growth in AI and HI publications accessible through WOS and Scopus is astounding, reflecting a continuous increase in the amount of available information. A bibliometric analysis of English healthcare-related AI publications between 1995 and 2019 reported a 17% annual growth rate in related publications. 12 Another bibliometric analysis on the growth of digital health, which encompasses the application of AI in healthcare, reported a 20% annual growth rate in publications. Wamba and Queiroz 13 also reported tremendous growth in the number of publications addressing the application of AI in digital health. The growth rates reported in these studies are broadly consistent with, though somewhat higher than, the 13% average annual growth rate reported in this study. Therefore, researchers are increasingly conducting research on the application of AI in healthcare.
The analysis of publication trends reveals exponential growth in research output between 2015 and 2022 in both WOS and Scopus. Publication activity was highest between 2018 and 2023, since most of the retrieved publications were no more than about five years old. Other studies also reported exponential growth in AI-in-Health publications after 2015.12,14 Thus, interest and investment in AI applications are increasing in the healthcare sector. The increase in scientific literature could be due to the growth of computing power and capacity to store data. 14
Computer scientists, data scientists, clinicians, and medical informaticians are progressively collaborating to apply advanced technologies including AI in healthcare. The documents retrieved in this bibliometric analysis had an average of about five co-authors, and 10% of them involved international co-authorship. It is predicted that AI technologies will be supporting 95% of human interactions by 2029. 15 Therefore, more growth of research output is expected in the future.
The AI-in-Health publications accessible through Scopus and WOS are highly regarded by the scientific community. The 12 citations per document reported in this bibliometric analysis indicate that scientists consider the publications valuable, given that the topic is relatively new. The citation rate of papers depends on their potential excellence, with papers of higher scientific value having more citations. 16 The publications have multiple references, which indicates the effort that the authors put in to obtain recognition. Articles with longer reference lists are more likely to be highly cited. 17
English is the dominant language of publication of articles on AI and health informatics indexed in English in Scopus and WOS. Almost all the retrieved articles were in English. English is the common language of science, and articles indexed in English are more likely to be authored in English than in other languages; hence, the abundance of articles in English is not surprising. The scarcity of articles in other languages indicates that non-English speakers may face barriers in indexing in English the AI-in-Health science that they author in other languages, which has been a burden for non-native English-speaking scientists in other topics as well. 18 WOS and Scopus should enhance the indexing of non-English articles in English to increase their accessibility to English readers, who are the majority.
The prominence of machine learning, artificial intelligence, big data, deep learning, medical information systems, and digital health keywords in the retrieved articles shows that advanced technologies are concurrently contributing to healthcare improvements. 19 The evidence in this bibliometric analysis points to an emphasis on the application of emerging technologies in healthcare through healthcare informatics, information dissemination, data security, information systems, and healthcare management. The infrastructure for both data generation and data analysis has improved as the volume of health data has grown massively. 20 Thus, advanced technologies have a place in revolutionizing healthcare in the current information age.
The high number of publications per author shows that the featured researchers are experienced and prominent. The high number of citations of articles by a few authors, namely Zheng, Zeng, Stone, and Zidar, shows that they are authorities in AI-in-Health. For instance, Zheng is an established researcher based at the University of Liverpool with 78 publications and 3838 citations. 21 His focus areas include wearable devices, health monitoring, and biosensors, which heavily apply AI. Stone and Zidar co-authored an article that has received 1208 citations, which indicates that they contribute high-calibre information on AI-in-Health.
The United States has consistently played a leading role in AI-in-Health research. China, the United Kingdom, Canada, India, and Australia have also made significant contributions to the field. Another study also reported that the USA, Canada, European countries, and China have the highest productivity in AI-in-Health research. 14 The US has notable research institutes with researchers focusing on AI-in-Health. For example, the Massachusetts General Hospital in the US, which emerged as the leading source of studies on AI-in-Health, hosts the Mass General Research Institute, which conducts both hospital-based and community-based research. The institute focuses on translating the new discoveries of research scientists in the lab into practice changes. 22 Since its research comprises innovative services, AI is bound to be a prominent feature in its research activities.
Both WOS and Scopus are keen on indexing publications on AI-in-Health. The larger number of publications retrieved from Scopus compared to WOS and the higher overall citation counts indicate that Scopus could be more aligned toward indexing AI-in-Health publications. The majority of the top 10 most cited articles in Scopus were published in informatics journals, which shows more alignment to the search terms. According to Singh et al., 23 Web of Science is a selective database, hence the fewer publications retrieved from it. However, the articles that were retrieved through Scopus were more specific to AI-in-Health; thus, its indexing may be more specific.
The observed international collaboration and diversity of perspectives show the growth of AI-in-Health globally. The United States and the United Kingdom are keen on collaborative research with networks across the world. International collaborations are vital in the development of interoperable AI-driven health interventions that are accessible across borders. 24 They form the basis for knowledge transfer, which can further improve the AI-in-Health field as innovators exchange ideas and thoughts.
The COVID-19 pandemic disrupted the publication dynamics as researchers focused on it between 2020 and 2022. The dominance of COVID-19 publications in the top 10 list of articles retrieved from WOS indicates researchers’ predilection for COVID-19 research during the pandemic. Apart from the application of AI to manage the COVID-19 pandemic, other aspects of AI-in-Health research stagnated during the pandemic. 25 Notably, the search in Scopus identified relevant articles while leaving out COVID-19-related publications, which shows the discriminative power of its search engine. Scopus was developed more recently than WOS; hence, it could be more advanced.
Limitations and challenges
The main limitation of this study is that it focuses on quantity metrics rather than the quality of the published content. Some research work addressing very crucial informational gaps in AI-in-Health may not have featured in the top ten list created by the bibliometric analysis because of low citation counts. Secondly, only two databases were searched, yet several other databases index articles on AI and health informatics. Therefore, publications not indexed in Scopus and WOS may have been missed. Although the challenge of missing data could have been mitigated by performing separate analyses for each database, doing so is difficult.
The databases searched to identify the articles are structured primarily to index research articles. Hence, authors who may have relied on communicating their AI-in-Health findings through book chapters and books may have been overlooked by this bibliometric analysis. Besides, this bibliometric analysis did not have a mechanism to address the inherent inconsistencies between databases, which may have introduced errors in the results.
While RStudio and other statistical software packages are powerful tools for bibliometric analysis, we experienced challenges in handling the large datasets in certain data formats. Repeating the analysis with different data formats made RStudio usable for the bibliometric analysis. Additionally, the bibliometrix package in R does not provide the full names of authors and the names of their affiliated institutions, which compelled searching for the full names of the authors manually. Besides, the network map extracted from VOSviewer could inadvertently miss some node names when the bibliometrix package is used, but repeated analysis revealed the missing node names.
Parameter loss was also a challenge during the merging process, whereby citation counts or keywords risked getting lost in case of failure to consistently format data across datasets. The challenge of parameter loss was addressed by carefully checking the data to ensure that all relevant parameters were retained and properly aligned.
Another challenge was duplicate entries when merging data from multiple sources. This occurred for articles indexed in both Scopus and WOS. Duplicate removal as outlined in the methods section helped in addressing this challenge.
Conclusion
Both Web of Science and Scopus have indexed, in English, publications on the application of AI in health informatics. The search in Scopus yielded articles that were more aligned to the search terms. Scopus also had generally better metrics than WOS, perhaps due to its relative newness. The fusion of AI with health information is a dynamic and evolving field with the potential to revolutionize healthcare, considering its rapid growth in the recent past as reflected in the high volume of related research output between 2018 and 2023. The COVID-19 pandemic caused publications on the application of AI to center on COVID-19. The research activities in AI-in-Health are expected to keep increasing. Interdisciplinary collaborations and knowledge-sharing are the cornerstones for ensuring interoperability and diffusion of AI-in-Health innovations across the world. AI-in-Health researchers in high-income countries should continue including researchers from developing countries in their study groups to give their research outputs global impact. Future researchers should focus on non-COVID-19-related topics to increase publication activity on the application of AI in other health aspects.
Footnotes
Author contributions
EA and DK conceptualized and designed the study; EA and DK conducted the search and screened the results; EA extracted and analyzed the metrics data; DK interpreted the results; EA and DK wrote the draft manuscript; DK edited the manuscript; EA and DK proofread the manuscript and agreed on its submission to the journal.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical statement
Data availability statement
The data analyzed in this study is available from the corresponding author upon request.
