Abstract
Background
The rapid integration of ICT into healthcare has elevated the critical role of digital health literacy (DHL). However, the conceptual relationship between DHL and electronic health (eHealth), along with the impact of transformative events such as the Fourth Industrial Revolution and the COVID-19 pandemic, remains inadequately investigated.
Objective
This study seeks to analyze research trends in DHL across four distinct historical periods to uncover key themes and their temporal evolution.
Methods
A comparative analysis of 2645 abstracts from Scopus publications (1977–2022) was conducted, segmented into four periods: (I) emerging era (1977–2006), (II) establishment era (2007–2016), (III) diffusion era (2017–2020), and (IV) post-pandemic era (2021–present). Text network analysis identified core keywords, and Latent Dirichlet Allocation (LDA) extracted dominant topics and their temporal evolution.
Results
Since 2006, DHL research has exhibited consistent growth, underpinned by transformative advancements during the Fourth Industrial Revolution and further amplified by a significant surge in scholarly engagement in the post-pandemic era. Importantly, during the diffusion era (Period III), a divergence in the trajectories of “digital health literacy” and “electronic health literacy” emerged.
Text network analysis revealed a progression toward greater uniformity in node sizes over time, coupled with an increase in the complexity and intricacy of connections between nodes. These findings indicate a growing diversity and nuanced understanding of concepts associated with DHL. Moreover, research in the post-pandemic era (Period IV) emphasized the critical role of DHL in addressing health disparities and advancing equitable access to healthcare.
Conclusion
The study reveals the dynamic progression of DHL research, catalyzed by technological advancements and global health crises. Strengthening DHL, particularly among vulnerable populations, is crucial for mitigating health disparities in a rapidly digitalizing world. Future research should prioritize the development of targeted interventions and examine DHL's impact across diverse sociocultural contexts.
Keywords
Introduction
The rapid advancement of digital technology and the Fourth Industrial Revolution have fundamentally reshaped how health information is accessed and utilized. 1 These transformations have introduced unprecedented opportunities and challenges in the health sector, particularly in how individuals engage with digital health resources.2,3 The COVID-19 pandemic further amplified these dynamics, 4 underscoring the critical role of digital health literacy (DHL) in equipping individuals to adapt to these changes. 5
The World Health Organization (WHO) defines digital health as the comprehensive integration of digital technologies into health systems, encompassing all stages from inception to operation. 6 This paradigm extends beyond the conventional scope of eHealth by incorporating advanced technologies, including the Internet of Things (IoT), big data analytics, and artificial intelligence (AI).
Although DHL and eHealth are closely related, they are conceptually distinct. 7 eHealth primarily refers to the application of information and communication technologies (ICTs) to healthcare services, encompassing electronic health records (EHRs), telemedicine platforms, and mobile health (mHealth) applications. 8 Its primary focus lies in technology-mediated healthcare delivery and communication, emphasizing the access, storage, and exchange of health information within digital ecosystems. 6
Conversely, DHL is a critical competency within this domain that transcends the functional use of digital health technologies. It encompasses advanced cognitive and social skills necessary for individuals to effectively seek, understand, evaluate, and apply health information from digital sources. Beyond merely distinguishing accurate from inaccurate information, DHL requires competencies such as navigating complex digital tools, promoting evidence-based decision-making, and addressing health inequities.9–11 Recently recognized as a “super social determinant of health,”12–14 DHL significantly influences health behaviors and outcomes by empowering individuals to make informed health decisions and engage in self-management of health and well-being.15,16 However, insufficient DHL not only hinders effective health promotion and disease management but also exacerbates the digital divide, thereby perpetuating health disparities and systemic inequities. 9
Since its initial conceptualization by Norman and Skinner in 2006 as part of eHealth literacy (eHL), 17 the concept of DHL has evolved significantly in response to global shifts and digital development.7,18 However, the relationship between eHL and DHL, and the impact of societal events such as the Fourth Industrial Revolution and the COVID-19 pandemic on DHL research, remain underexplored.19,20 Additionally, previous literature reviews2,20 have reported that DHL-related studies began emerging in 2006 and saw a dramatic increase after 2014. However, these studies do not sufficiently account for the impact of the COVID-19 pandemic. While prior studies have largely relied on bibliometric reviews and literature analyses to provide an overarching view of DHL research,2,19,21,22 they often fail to delve into the specific trends and shifts associated with distinct historical events.
This study addresses these gaps by employing text network analysis, a robust methodology capable of identifying patterns and relationships among key terms within academic texts.23,24 Unlike traditional methods, text network analysis allows for a nuanced exploration of how research priorities emerge and evolve over time. 24 In addition, topic modeling is employed to consolidate and reduce complex textual data into coherent and interpretable topics. 25 Analyzing research trends is essential for understanding the progression of scholarly inquiry and informing strategic future directions. By focusing on temporal patterns rather than specific research questions, trend analysis reveals areas of contemporary academic interest.23,24 Together, these analytical techniques enable the identification of prevailing research themes and the projection of future trajectories. 23 By categorizing research trends into four distinct periods—defined by the conceptualization of DHL (2006), 17 the official proclamation of the Fourth Industrial Revolution (2016), 26 and the declaration of COVID-19 as a global pandemic (2020) 27 —this study provides a granular view of DHL's development and highlights pivotal shifts in research focus.
The study aims to analyze DHL-related research trends across three historical events using text network analysis and topic modeling. Specifically, its objectives are to:
(1) identify research trends related to DHL; (2) perform a comparative analysis of core keyword networks across different time periods; and (3) examine topic model themes over distinct historical periods.
The significance of this study lies in its multifaceted contributions. Through the application of advanced methodologies, including text network and topic modeling analysis, it systematically delineates key research themes shaped by evolving societal dynamics, offering critical insights into future scholarly trajectories. Beyond its methodological rigor, the study provides profound academic and practical implications by elucidating the developmental trajectory of DHL. These findings serve as a foundational resource for policymakers, educators, and healthcare professionals striving to navigate the challenges of an increasingly digitalized and health-oriented global landscape.
Methods
Data collection
This study was designed as a trend analysis incorporating text network analysis and topic modeling to examine research trends in DHL. The study period spanned from 1977 to 2022, with data extracted from the Scopus database, focusing on articles within the domain of public health. Scopus is one of the most comprehensive academic databases, encompassing over 25,000 scholarly journals, more than 7000 publishers, and approximately 90 million records. 28 The year 1977 was selected as the starting point of the analysis, as it marks the earliest appearance of DHL-related studies in the Scopus database.
To systematically analyze the evolution of DHL research, the timeline was divided into four distinct periods, delineated by historical milestones. Core keyword trends were examined across these periods: Period I (emerging era), 1977–2006; Period II (establishment era), 2007–2016; Period III (diffusion era), 2017–2020; and Period IV (post-pandemic era), 2021 onward. This methodological framework provided a comprehensive view of the progression of DHL research, highlighting shifts in focus and emerging trends across different historical contexts.
We conducted a comprehensive search of the Scopus database to identify articles in the field of public health using the terms “digital health literacy,” “e-health literacy,” and “electronic health literacy” in titles, keywords, and abstracts. This search yielded a total of 4054 articles containing at least one of the specified terms. After screening for relevance to public health and eliminating duplicates, 2645 articles were selected for further analysis.
The distribution of analyzed articles across the defined periods was as follows: Period I comprised 89 articles, Period II included 599 articles, Period III comprised 884 articles, and Period IV included 1073 articles (Figure 1).
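The period segmentation described above can be sketched as a simple lookup on publication year. This is a minimal illustration, not the authors' procedure: the function names and the sample years are hypothetical, and the upper bound of Period IV is assumed to be 2022 because the stated study period ends in that year.

```python
from collections import Counter

# Period boundaries as stated in the text (inclusive years).
# The 2022 upper bound for Period IV is an assumption based on the study period.
PERIODS = [
    ("I", 1977, 2006),    # emerging era
    ("II", 2007, 2016),   # establishment era
    ("III", 2017, 2020),  # diffusion era
    ("IV", 2021, 2022),   # post-pandemic era
]

# Article counts per period as reported in the text.
REPORTED_COUNTS = {"I": 89, "II": 599, "III": 884, "IV": 1073}


def assign_period(year):
    """Return the period label for a publication year, or None if out of range."""
    for label, start, end in PERIODS:
        if start <= year <= end:
            return label
    return None


def count_by_period(years):
    """Count publications per period, ignoring years outside the study window."""
    labels = (assign_period(year) for year in years)
    return Counter(label for label in labels if label is not None)


# Hypothetical publication years, for illustration only.
sample_years = [1998, 2006, 2007, 2016, 2017, 2020, 2021, 2022]
period_counts = count_by_period(sample_years)

# The reported per-period counts add up to the 2645 articles analyzed.
total_reported = sum(REPORTED_COUNTS.values())
```

Keeping the boundaries in one table makes it easy to check that the per-period counts reported in the text sum to the 2645 selected articles.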

Flow diagram for the data collection (literature search and article selection) and data analysis process.
To enhance the systematicity and transparency of reporting, the checklist of Bibliometric Reviews of Biomedical Literature (BIBLIO) was employed. 29 This checklist comprises 20 essential items that outline the minimum requirements for conducting and reporting bibliometric reviews. Its application ensures methodological rigor and consistency, facilitating reproducibility and the comprehensive assessment of bibliometric analyses (Supplementary material 2).
As this study analyzed publicly accessible literature, it did not involve human participants and was therefore exempt from ethical approval requirements.
Analysis methodologies
Text network analysis
Text network analysis examines relationships among keywords to uncover latent meanings, and its results can be communicated effectively through visualization. Keywords were extracted for each major period from the abstracts using morpheme analysis. Data analysis was performed using NetMiner 4.0 software (Cyram, Seongnam-si, South Korea).
Stopwords were identified and excluded through an iterative review process conducted by three researchers. Words that appeared consistently across all papers were excluded from the research trend analysis, as they lacked specificity as keywords. For example, commonly used terms in DHL-related papers, such as “digital,” “health,” and “literature,” were omitted. Similarly, generic terms like “discussion” and “methods” were excluded due to their lack of relevance to the research question (Supplementary material 1).
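As a rough illustration of how stopword removal feeds a keyword co-occurrence network, the sketch below filters tokens and counts keyword pairs that appear in the same abstract. It is not the NetMiner workflow used in the study: a plain regular-expression tokenizer stands in for morpheme analysis, the stopword list contains only the examples named in the text plus a few function words added for illustration, and the sample abstracts are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Stopwords named in the text, plus a few function words added for illustration.
# The study's full list is in its supplementary material and is not reproduced here.
STOPWORDS = {
    "digital", "health", "literature", "discussion", "methods",
    "the", "of", "and", "in", "to", "a", "for",
}


def extract_keywords(abstract):
    """Lowercase the text, split it into alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", abstract.lower())
    return [token for token in tokens if token not in STOPWORDS]


def cooccurrence_counts(abstracts):
    """Count how many abstracts each unordered keyword pair appears in together."""
    pair_counts = Counter()
    for abstract in abstracts:
        unique_terms = sorted(set(extract_keywords(abstract)))
        pair_counts.update(combinations(unique_terms, 2))
    return pair_counts


# Hypothetical abstracts, for illustration only.
sample_abstracts = [
    "Digital health education for student care",
    "Internet training and student education",
]
pairs = cooccurrence_counts(sample_abstracts)
```

Each pair with a nonzero count becomes a link between two keyword nodes, which is the input that a centrality measure then operates on.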
To select the core keywords, rankings were assessed based on the centrality of keywords (nodes) in the text network. Among the various centrality indices, this study adopted degree centrality, as it was considered more appropriate for keyword analysis. The degree centrality index measures how many links (connections) a node has in the network, identifying keywords that co-occur with many other keywords. 23 Degree centrality is expressed as a value between 0 and 1, with values closer to 1 indicating higher centrality. This study analyzed the frequency and degree centrality of the top 30 keywords for each of the four defined periods, providing a detailed exploration of keyword trends and their evolution over time.
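Degree centrality on a 0 to 1 scale is commonly obtained by dividing a node's number of distinct neighbors by the maximum possible number, n − 1, where n is the number of nodes. The sketch below follows that common normalization; whether NetMiner applies exactly this formula is an assumption, and the edge list is hypothetical.

```python
def degree_centrality(edges):
    """Normalized degree centrality: distinct neighbors divided by (n - 1)."""
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    node_count = len(neighbors)
    if node_count < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adjacent) / (node_count - 1)
            for node, adjacent in neighbors.items()}


# Hypothetical keyword co-occurrence links, for illustration only.
sample_edges = [
    ("care", "student"),
    ("care", "internet"),
    ("care", "education"),
    ("student", "education"),
]
centrality = degree_centrality(sample_edges)
ranked_keywords = sorted(centrality, key=centrality.get, reverse=True)
```

In this toy network, "care" is linked to all three other keywords and so reaches the maximum value of 1, while "internet" has a single link and scores 1/3.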
Topic model analysis
Latent Dirichlet Allocation (LDA) is a probabilistic generative model widely utilized for identifying latent topics within a corpus of text. By analyzing the co-occurrence patterns of words, LDA extracts underlying topics and represents each document as a probabilistic distribution over these topics, while each topic is modeled as a distribution over words. This dual representation enables LDA to effectively uncover the thematic structure of large text datasets.30–32 Additionally, the model simulates the generative process of creating documents, offering insights into the relationships between words, topics, and documents.
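The generative process that LDA assumes can be written compactly as follows. The notation is the conventional one rather than anything defined in this article: K is the number of topics, α and β are Dirichlet hyperparameters, θ_d is the topic distribution of document d, φ_k is the word distribution of topic k, and z_{d,n} and w_{d,n} are the topic assignment and observed word at position n of document d.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1, \dots, K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1, \dots, M \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), & n &= 1, \dots, N_d \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Categorical}(\phi_{z_{d,n}})
\end{align*}
```

Fitting the model reverses this process: given only the observed words, it infers the per-document topic distributions and the per-topic word distributions.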
In this study, the LDA algorithm was implemented with 100 iterations, employing the Markov Chain Monte Carlo (MCMC) method for model training. These parameters align with the default configuration of the applied software. The number of topics, which can be adjusted based on the analytical objectives, was set to three to balance interpretability with the granularity of topic differentiation. This choice allowed for a focused and concise analysis of thematic shifts over time.
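To make the MCMC training step concrete, the following is a minimal collapsed Gibbs sampler for LDA with three topics and 100 iterations. It is a sketch under stated assumptions rather than the software's implementation: collapsed Gibbs sampling is one common MCMC scheme for LDA and may differ from the one NetMiner uses, the hyperparameters alpha and beta and the random seed are illustrative choices, and the tokenized documents are hypothetical.

```python
import random


def lda_gibbs(docs, num_topics=3, iterations=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    alpha, beta, and seed are illustrative assumptions, not the study's settings.
    Returns (document-topic counts, topic-word counts, vocabulary list).
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, word in enumerate(doc):
                w = word_id[word]
                k = assignments[d][n]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Hypothetical tokenized abstracts, for illustration only.
sample_docs = [
    ["hospital", "doctor", "record", "hospital"],
    ["system", "computer", "physician", "record"],
    ["provider", "barrier", "implementation", "barrier"],
    ["student", "education", "internet", "training"],
]
doc_topic, topic_word, vocab = lda_gibbs(sample_docs)
topics = top_words(topic_word, vocab)
```

The sampler only ever updates count tables, so a quick sanity check is that each document's topic counts still sum to its length after training. Topic labels are arbitrary and results vary with the seed, which is one reason fixed topic numbers are usually accompanied by a qualitative review of the top words.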
Results
DHL-related research trends
Table 1 presents an analysis of the top 30 countries and journals contributing to the field of DHL. The United States dominated DHL research in terms of publication (
Leading 30 countries and journals by number of publications (
Figure 2 illustrates the distribution of academic fields related to DHL research. The majority of studies were in the field of Medicine (56%), followed by Social Sciences (16%) and Nursing (9%).

Academic fields related to digital health literacy.
Figure 3 delineates the annual publication trends in academic papers pertaining to DHL. Prior to 2006, the volume of publications remained relatively static, indicating minimal research activity in this domain. A modest yet steady upward trajectory becomes discernible from 2006, transitioning into a pronounced acceleration beginning in 2016, a period conceptualized as the “diffusion era” (Period III). This growth trajectory was further catalyzed during the “post-pandemic era” (Period IV), reflecting the increasing scholarly attention to DHL as a critical framework for addressing global challenges and leveraging technological innovation.

Annual publication trends in digital health literacy. Note: Solid lines represent publications used in this analysis; dotted lines represent e-health literacy; dashed lines represent digital health literacy; and dashed-dotted lines represent electronic health literacy.
When examining trends by specific terms (“digital health literacy,” “e-health literacy,” and “electronic health literacy”), the trajectory of digital health literacy closely mirrors the overall pattern, with steeper growth than the other terms. In contrast, e-health literacy and electronic health literacy show less pronounced increases, with e-health literacy consistently recording the lowest number of publications among the three categories. Notably, during Period III (diffusion era), the upward trajectories of “digital health literacy” and “electronic health literacy” reverse, reflecting a significant shift in scholarly focus and a pronounced preference for the term “digital health literacy” within academic discourse.
Table 2 presents a comparative analysis of the 30 most frequently appearing keywords in DHL-related papers across different periods. Commonly recurring keywords across all periods included “care,” “student,” and “education,” with consistent frequency trends. Notably, the keyword “barrier” exhibited a remarkable increase in prominence, advancing from the 30th position in Period I to the 8th position in Period IV.
Frequency of keywords according to the four periods.
Comparative analysis of core keyword networks based on periods
Table 3 presents the 30 major keywords with their degree centrality for each period. Words that appeared in all periods included “care,” “computer,” “education,” “internet,” “service,” “training,” and “student,” which could be considered to form the identity of DHL-related research.
Comparative analysis of core keyword networks according to the four periods
Figures 4–7 present the network maps of the top 50 keywords in DHL-related research. Words with high degree centrality are considered the most frequently occurring within the network, representing core concepts of the subject. 33 In the figures, node size corresponds to word frequency, with larger nodes indicating higher prominence.

Network map of the top 50 keywords in Period I.

Network map of the top 50 keywords in Period II.

Network map of the top 50 keywords in Period III.

Network map of the top 50 keywords in Period IV.
During Period I (Emerging Era), seven large nodes—such as “care,” “student,” and “internet”—stand out as dominant. Over time, node sizes become more uniform, and the connections between nodes grow increasingly intricate, reflecting the expanding diversity and complexity of concepts related to DHL.
Certain terms emerged exclusively in specific periods, such as “copyright” in Period II, “medicine” in Period III, and “pandemic,” “license,” “platform,” and “device” in Period IV. Notably, “gap” and “disparity” appeared for the first time during the post-pandemic era (Period IV), emphasizing their relevance to recent research trends.
Comparison of topic model topics by period
Table 4 presents the findings of the topic modeling analysis, providing a comprehensive overview of the evolution of research themes across four distinct periods. Although the overarching emphasis on DHL remained consistent, thematic shifts aligned with changing priorities and societal needs were discernible. During Period I (1977–2006), research primarily centered on hospital-related concepts, with predominant clusters including terms such as “electronic medical records (EMR),” “hospital,” and “doctor.” In Period II, clusters evolved to include terms such as “system,” “computer,” “physician,” and “EHR,” reflecting a transition toward technological advancements and digitalization in healthcare. By Period IV, thematic clusters incorporated terms such as “healthcare,” “provider,” “barrier,” and “implementation,” signaling an expanded focus on public health, community involvement, and the broader societal implications of DHL.
Comparison of topics according to the four periods.
Discussion
Milestones in the evolution of DHL research
Norman and Skinner's foundational conceptualization (2006)
The analysis of research trends before and after 2006, defined in this study as the year of the first conceptualization of DHL, reveals distinct patterns. Prior to 2006, the volume of publications remained relatively static, indicating minimal research activity in this domain. However, from 2006 onward, a modest yet steady increase in scholarly output becomes evident, reflecting the growing recognition of DHL as an important research focus. This observation is further supported by the text network analysis results presented in Figure 4. During Period I (Emerging Era), a small number of dominant nodes—such as “care,” “student,” and “internet”—emerged as central terms. Over time, the network structure evolves, with nodes becoming more evenly distributed and interconnections becoming increasingly intricate, symbolizing the diversification and conceptual maturation of DHL-related research.
Furthermore, the topic modeling analysis underscores a temporal shift in research focus. During Period I (1977–2006), the primary emphasis was on hospital-centric concepts, with clusters highlighting terms such as “electronic medical records (EMR),” “hospital,” and “doctor.” In Period II (2007–2016), research themes shifted towards technological advancements and digitalization, as evidenced by clusters featuring terms like “system,” “computer,” “physician,” and “EHR.” This progression reflects an expanding scope of inquiry, emphasizing the integration of emerging technologies into healthcare systems and underscoring DHL's role in addressing evolving societal and healthcare demands.
The fourth industrial revolution and its transformative impact (2016)
The analysis of the trajectories of DHL-related terms—digital health literacy, e-health literacy, and electronic health literacy—offers critical insights into the conceptual evolution of DHL. Notably, during Period III (the diffusion era), a divergence emerges in the upward trajectories of “digital health literacy” and “electronic health literacy,” reflecting a significant shift in scholarly focus.
The Fourth Industrial Revolution, defined in this study as the onset of the diffusion era, represented a pivotal moment in the integration of advanced technologies—such as artificial intelligence (AI), machine learning, big data analytics, and the Internet of Things (IoT)—into healthcare systems, fundamentally transforming healthcare delivery and operations. 34 While “e-health” primarily focuses on internet-based technologies and digital communication tools, “digital health” extends to a broader and more inclusive framework, encompassing a diverse range of advanced technologies that capture the increasingly complex and evolving digital health landscape. 35
The rapid evolution and integration of these technologies have profoundly influenced the conceptualization and scholarly understanding of DHL. This study reaffirms the pivotal role of technological advancements in shaping the trajectory of DHL research, highlighting how the adoption of emerging technologies continues to drive academic discourse and redefine the field's scope and application.
The COVID-19 pandemic as a global catalyst (2020)
This study highlights significant accelerations in DHL research during pivotal periods such as the Fourth Industrial Revolution and the COVID-19 pandemic. These observations align with prior literature, such as Yang et al. (2022), 20 which delineated distinct phases in DHL research, including an incubation period (1998–2005), a slow growth period (2006–2013), and a rapid growth period (after 2014). The current study advances this framework by introducing the “post-pandemic era” as a discrete and pivotal phase, emphasizing the significant impact of macro-environmental disruptions on shaping research trajectories.
A comparative analysis of these periods indicates that the COVID-19 pandemic exerted a more immediate and transformative influence on DHL research compared to the gradual impact of technological advancements observed during the diffusion era (2016 onward), as illustrated in Figure 3. Furthermore, the text network analysis results from this study identify the emergence of terms such as “pandemic,” “license,” “platform,” and “device” exclusively during Period IV. For instance, individuals were compelled to discern accurate information about COVID-19 amidst an overwhelming influx of misinformation to maintain their daily lives. 2 Simultaneously, children and adolescents adapted to utilizing platforms and devices for health education purposes. 36 These circumstances significantly heightened research interest in DHL, as reflected in the increased scholarly output on DHL-related topics (e.g., platform, device) during the pandemic. Aligning with findings from prior studies, 37 this analysis underscores the catalytic role of global crises in accelerating research priorities and directing scholarly focus toward addressing critical gaps in DHL.
Finally, during this period, novel terms such as “gap” and “disparity” emerged, though they did not rank among the top 30 keywords. Furthermore, the topic modeling analysis identified significant clusters, including “barrier” and “implementation,” reflecting an increasing focus on addressing systemic challenges and enhancing health outcomes within community and population health contexts. This thematic evolution underscores the growing recognition of DHL as a critical determinant of health equity and a fundamental mechanism for mitigating the digital divide and reducing health disparities in an increasingly interconnected global landscape.
Recent scholarly discourse has highlighted DHL's potential to become a prominent social determinant of health, contributing to future health inequalities and exacerbating the digital divide.12,13,38 Given that reducing health disparities remains a central objective of the global health agenda, 39 research on the inequities associated with DHL is anticipated to grow rapidly, aligning with efforts to promote equitable access to digital health resources worldwide.
Geographical trends and future directions in DHL research
This study identified the United States as the leading contributor to research on DHL, with the United Kingdom and Australia ranking as the second and third most prolific contributors, respectively. These findings are consistent with prior studies that have highlighted the United States, Australia, the United Kingdom, and Canada as dominant contributors to advancements in DHL research. 20 This underscores the significant role of Western nations in shaping the field and driving innovation.
However, the development and characteristics of digital health systems within healthcare sectors exhibit substantial variability across countries. 40 This highlights the necessity of expanding research efforts to include diverse geographical and cultural contexts. Future research should prioritize examining DHL among populations in Asia and developing regions, focusing on identifying context-specific influencing factors and addressing unique challenges. Such efforts are essential to expanding the scope and inclusivity of DHL studies.
Limitations of this study
First, limitations related to text network and topic model analysis must be considered. While these methods effectively identify research trends by analyzing a large volume of studies in a short period,23–25 they rely on key terms and their connection strengths to infer results. This approach may fail to capture deeper conceptual meanings, potentially leading to lower concreteness in the findings. 41
Second, the exclusion of certain keywords as stopwords presents a methodological limitation. To enhance the precision of keyword analysis and prevent overrepresentation, commonly used terms such as “digital” and “health” were removed. However, this decision may have unintentionally excluded critical terms that contribute to nuanced interpretations within specific subfields. As a result, the omission of these terms may have influenced the formation of thematic clusters and the structure of keyword networks by altering the co-occurrence patterns of domain-specific terminology.
Third, in this study, the number of topics was set to three based on the researchers’ judgment, considering the interpretability of the extracted topics and the research context. Future studies should consider applying quantitative evaluation metrics, such as perplexity or coherence scores, to enhance the objectivity and robustness of topic modeling results.
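As an example of such a metric, perplexity on a held-out set of documents is conventionally defined as shown below. The notation is an assumption introduced here for illustration: M is the number of held-out documents, w_d denotes the words of document d, N_d is its length, and p(w_d) is the likelihood of the document under the fitted model.

```latex
\[
\mathrm{perplexity}(D_{\text{test}})
  = \exp\!\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right)
\]
```

Lower values indicate that the model predicts unseen text better, so candidate topic numbers can be compared by fitting a model for each and inspecting how held-out perplexity changes, ideally alongside a coherence measure and a qualitative reading of the topics.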
Fourth, limitations related to inclusion criteria and study scope should be noted. Since this study analyzed only English-language publications, publication bias may have occurred, leading to an overrepresentation of research conducted in English-speaking countries and limiting the generalizability of the findings to non-English contexts. Additionally, a significant portion of the analyzed literature originates from Western countries, which may further restrict the global representativeness of DHL research trends.
Finally, limitations regarding publication types must be acknowledged. This study focused exclusively on journal articles, excluding gray literature and conference proceedings, which often contain important research findings and emerging trends. The exclusion of such sources may have influenced the comprehensiveness of the study's findings. Therefore, recognizing these limitations and clearly defining the study's scope is essential for a more holistic understanding of the field.
Conclusions
This study systematically analyzed digital health literacy (DHL)-related literature using text network and topic modeling approaches. The findings revealed the evolution of the DHL concept in alignment with three significant historical events. Notably, the diversity of DHL-related research has increased over time compared to its initial emergence (Period I), underscoring its dynamic and evolving nature shaped by historical contexts. Future research is anticipated to expand DHL-related studies, focusing on specific patient populations and diverse groups, including low-income and vulnerable populations, to address their unique needs and challenges.
Supplemental Material
Supplemental material, sj-docx-1-dhj-10.1177_20552076251334537 for Concept of digital health literacy revisited: Using text network and topic model analysis by Jiyoung Park, Seohyun Won, Mingee Choi, Chul Hee Kang and Han Shi Jocelyn Chew in DIGITAL HEALTH
Supplemental material, sj-docx-2-dhj-10.1177_20552076251334537 for Concept of digital health literacy revisited: Using text network and topic model analysis by Jiyoung Park, Seohyun Won, Mingee Choi, Chul Hee Kang and Han Shi Jocelyn Chew in DIGITAL HEALTH
Footnotes
Author Contributions/CRediT
Jiyoung Park: conceptualization, funding acquisition, investigation, project administration, supervision, validation, writing—original draft, writing—review and editing; Seohyun Won: writing—original draft, writing—review and editing; Mingee Choi: software, data curation, methodology, validation; Chul Hee Kang: supervision, conceptualization, writing—review and editing; Han Shi Jocelyn Chew: supervision, writing—original draft, writing—review and editing.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Research Foundation of Korea (grant number 2020R1A2C4096046, 2022R1C1C1009609).
Conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability
Derived data supporting the findings of this study are available from the corresponding author Mingee Choi on request.
Supplemental material
Supplemental material for this article is available online.
References
