Abstract
Introduction
Mental health disorders affect millions of people worldwide. Chatbots are a new technology that can help users with mental health issues by providing innovative features. This article aimed to conduct a systematic review of reviews on chatbots in mental health services and to synthesize the evidence on the factors influencing patient engagement with chatbots.
Methods
This study reviewed the literature from 2000 to 2024 using qualitative analysis. The authors conducted a systematic search of several databases, such as PubMed, Scopus, ProQuest, and the Cochrane Database of Systematic Reviews, to identify relevant studies on the topic. The quality of the selected studies was assessed using the Critical Appraisal Skills Programme appraisal checklist, and the data obtained from the systematic review were subjected to a thematic analysis using Boyatzis's code development approach.
Results
The database search yielded 1494 papers, of which 10 were included in the study after the screening process. The quality assessment rated the included studies at a moderate level. The thematic analysis revealed four main themes: chatbot design, chatbot outcomes, user perceptions, and user characteristics.
Conclusion
The research proposed some ways to use color and music in chatbot design. It also provided a systematic and multidimensional analysis of the factors, offered some insights for chatbot developers and researchers, and highlighted the potential of chatbots to improve patient-centered and person-centered care in mental health services.
Introduction
The prevalence of mental health disorders is a significant challenge for society, as they affect a large number of individuals worldwide. For example, according to the World Health Organization, depression alone affects more than 264 million people of all ages, and anxiety disorders are estimated to affect 3.76% of the global population. 1 The World Health Organization also reports that mental disorders rank among the most common causes of disability worldwide. 2
According to the World Health Organization, mental health refers to “a condition of well-being that enables the individual to recognize and utilize his or her own potential, to manage the normal challenges of life, to work effectively and productively, and to contribute positively to his or her community.” 3
The availability and quality of mental health services and treatment vary widely across countries and cultures. According to the World Health Organization, major depression is the most common cause of disability in terms of years lived with disability and the fourth most common cause of disability-adjusted life years globally. 4 Moreover, the Health Canada Editorial Board on Mental Illnesses in Canada reports that more than one-fifth of Canadians will experience a mental illness in their lifetime, and the global economic impact of mental health in 2010 was estimated at 2.5 trillion U.S. dollars. 5 Meanwhile, the prevalence of suicide is rising in many countries, such as the United States, indicating the urgency of developing new solutions and innovations in mental health. 5
A major challenge in mental health services is that the current clinical workforce is inadequate to address these needs. The number of psychiatrists per population is very low, with an average of 9 per 100,000 people in developed countries and as few as 0.1 per 1,000,000 people in some lower-income regions. Meanwhile, the current and projected demand for care exceeds the available capacity, which necessitates the exploration of digital technology as a potential solution.4,6
Chatbots, also known as conversational agents or multipurpose virtual assistants, are an emerging technology that may facilitate the engagement and adherence of users with mental health issues. 7 They have been deployed across various domains for a multitude of purposes. In mental health specifically, they have been used for therapeutic interventions, training programs, educational initiatives, counseling sessions, and screening procedures.8–10 Furthermore, they may reduce the stigma associated with seeking mental health-related advice and enhance the user experience of mobile mental health apps. 11 They have also been investigated for their effectiveness in promoting self-disclosure and expressive writing.7,12 In addition, chatbots have provided various types of social support, such as appraisal, informational, emotional, and instrumental support, to young people with mental health issues, and they have been designed to educate underprivileged communities on mental health and stigmatized topics.13,14
There is emerging evidence of user engagement with chatbots for supporting various mental health issues and preliminary indications of positive health outcomes in the physical and mental health domains. 15 The engagement of patients with chatbots in mental health services is a crucial factor for the successful implementation and adoption of this emerging technology. However, there is a lack of evidence-based guidance on how to design and evaluate chatbots for mental health purposes. 16 Therefore, this article aims to conduct a systematic review of the existing reviews on the topic and identify the factors influencing the engagement of patients with chatbots in mental health services to synthesize the existing evidence and present it in a single document. The conclusions drawn from this research offer significant implications for manufacturers, healthcare professionals, and policymakers who are strategizing to enhance the applicability of chatbots in the field of mental health services.
Methods
This study was a systematic review of reviews that employed qualitative analysis of literature published between 2000 and 2024. The research question was: “What are the factors influencing patient engagement with mental health chatbots?” The study followed the Joanna Briggs Institute approach to conducting a qualitative systematic review, which encompasses several stages. 17 Initially, the authors identified and selected relevant studies by conducting a comprehensive search across multiple databases and implementing a rigorous screening process. Subsequently, the identified studies were critically evaluated using the Critical Appraisal Skills Programme (CASP) quality appraisal checklist. The authors then extracted pertinent data from the studies using a preestablished checklist. Following this, a thematic analysis was conducted on the data to synthesize the findings. Finally, the results were reported and analyzed.
Data collection and search method
The authors conducted a systematic search of several databases, such as PubMed, Scopus, ProQuest, and the Cochrane Database of Systematic Reviews, to identify relevant studies on the topic. To identify studies that met the essential criteria, we categorized the search terms into three themes, namely Mental health, Engagement, and Chatbot. We began with broad terms to enhance sensitivity and combined synonyms within each theme using the “OR” operator. To ensure specificity and minimize irrelevant studies, we combined the themes using the “AND” operator. Furthermore, Medical Subject Headings terms such as “mental health” were used to improve the search strategy. Our search strategy is presented in Table 1. The search was conducted on the 26th of January 2024 and was registered on GitHub at the following link: https://github.com/mohsenkhosravi3913/Factors-Influencing-Patient-Engagement-in-Mental-Health-Chatbots.git.
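The OR/AND combination described above can be illustrated with a minimal Python sketch. The synonym lists are hypothetical placeholders chosen for illustration, not the actual terms of Table 1, and `build_query` is an illustrative helper rather than part of the authors' workflow.

```python
# Hypothetical synonym lists for the three search themes.
# The actual terms are those reported in Table 1, not these.
SEARCH_THEMES = {
    "mental health": ["mental health", "mental disorder"],
    "engagement": ["engagement", "adherence"],
    "chatbot": ["chatbot", "conversational agent"],
}


def build_query(themes):
    """Join synonyms with OR inside each theme, then join themes with AND."""
    groups = []
    for synonyms in themes.values():
        quoted = [f'"{term}"' for term in synonyms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


query = build_query(SEARCH_THEMES)
print(query)
```

Using OR within a theme widens the net (sensitivity), while AND across themes requires every record to touch all three concepts (specificity).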
Table 1. The search methodology utilized to perform the systematic review.
Inclusion and exclusion standards
The inclusion criteria for this study were articles published in English between 2000 and 2024 that specifically addressed the factors influencing patient engagement in mental health chatbots. Articles were excluded if they: (A) did not cover a chatbot corresponding to mental health services, (B) lacked text describing the factors of patient engagement with a chatbot intervention in healthcare, or (C) were not review papers in terms of methodology.
Screening and data retrieval
The systematic review was conducted in several stages. Firstly, the authors independently assessed the titles of all articles obtained from the databases multiple times. In this procedure, we selected papers that appeared to discuss the factors influencing patient engagement with mental health chatbots. Papers that seemed irrelevant were removed from further stages of the screening process. Secondly, the abstracts of the selected articles were screened in the same manner, and then the full texts of the chosen articles were thoroughly assessed. Furthermore, to ensure that no relevant studies were missed, the authors also reviewed the references cited in the articles. Finally, articles that met the validity criteria were selected for inclusion in the study.
The authors used a form to extract necessary information from selected articles, which was then summarized and combined using MAXQDA 12 software. 18 MAXQDA is a software for qualitative data analysis. It provides tools for managing, transcribing, and analyzing data through coding. It also allows note-taking, data summarization, and visualization of patterns. 19 The form utilized by the authors comprised several sections, including the year of the study's publication, its settings (the place in which the study was conducted), the associated journal, its indexed databases, the study type in terms of methodological approach, the type of chatbot(s) studied, corresponding disorders, and the number of final studies included within the study. During this process, the segments of the manuscript text that pertained to each section of the form were identified, coded, and subsequently incorporated into the final form. Furthermore, the authors verified the results at all stages to ensure reliability and minimize bias.
Quality appraisal of final articles using the CASP checklist
The quality of selected studies was evaluated using the CASP appraisal checklist, which covers multiple study designs. The checklist helps assess the validity, relevance, bias, and applicability of research studies. 20
The CASP checklist comprises ten questions that evaluate articles based on various factors, including result validity, study quality, and result applicability. 20 We used a scoring system where each question was assigned 2 points for yes, 1 point for cannot tell, and 0 points for no. The maximum score was 16, corresponding to three quality levels: low, medium, and high. Only articles with an average score of at least nine were included. 21 The total score for each study was computed, and the score for each study type was presented in the results section. After screening and evaluating the quality of articles using the CASP checklist, only the final articles were included in our review.
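The scoring rule described above can be sketched in a few lines of Python. The answers below are hypothetical, and the number of scored items is an assumption: a stated maximum of 16 at 2 points per item would correspond to eight scored items, which is an inference and not something the text states.

```python
# Points per answer, as described in the text.
POINTS = {"yes": 2, "cannot tell": 1, "no": 0}

# Inclusion threshold stated in the text.
INCLUSION_THRESHOLD = 9

# Assumption: eight scored items, inferred from the stated maximum of 16.
N_SCORED_ITEMS = 8


def casp_score(answers):
    """Sum the points for one article's checklist answers."""
    return sum(POINTS[answer] for answer in answers)


def max_score(n_items):
    """Highest attainable score when every item is answered 'yes'."""
    return n_items * POINTS["yes"]


# Hypothetical answers for a single review article.
example_answers = [
    "yes", "yes", "cannot tell", "no",
    "yes", "cannot tell", "no", "yes",
]

total = casp_score(example_answers)
included = total >= INCLUSION_THRESHOLD
print(total, max_score(N_SCORED_ITEMS), included)
```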
Data analysis
The aim of this phase was to identify the primary themes of factors that affect patient engagement with mental health chatbots. The authors conducted a thematic analysis of the items extracted from the contents of the final articles obtained through the systematic review.
The authors followed Boyatzis's code development approach to perform the analysis. 22 In this methodology, the authors first read the text of the papers to familiarize themselves with the data. They then generated initial codes based on the objective of the analysis and searched for themes by categorizing these codes into groups. Next, they reviewed the themes and defined and named them. The final step was to produce the report. In this study, the authors extracted the factors affecting patient engagement with mental health chatbots separately from the final articles and compiled them into a single table using Microsoft Excel. The analysis resulted in mega-themes, themes, and subthemes, based on the framework of the research question. Furthermore, to improve the validity and reliability of the outcome and reduce the risk of error or bias, the authors repeated the same steps of the thematic analysis. They also consulted with each other to resolve any disagreements during the process.
Results
The following sections present the results from the systematic review of reviews, which consisted of three main steps: searching the databases, assessing the quality of the final articles, and conducting a thematic analysis of the data extracted from the final articles.
Systematic review
The systematic review of reviews was conducted in adherence to the PRISMA guidelines, which are specifically designed for reporting systematic reviews. 23 Figure 1 illustrates the results of the database search. The search strategy yielded 1494 references from the databases. After removing 12 duplicates, the remaining 1482 references were screened by their titles, abstracts, and full texts. Based on the inclusion and exclusion criteria, 1472 references were excluded from the study. Consequently, 10 papers were selected as the final articles for the study.
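The counts reported above can be checked with simple arithmetic. The following Python sketch only restates the numbers given in the text; the function name `prisma_flow` is an illustrative choice, and the sketch does not break the exclusions down by screening stage because the text does not report that breakdown.

```python
def prisma_flow(identified, duplicates, excluded):
    """Return the screened and included counts for a simple PRISMA flow."""
    screened = identified - duplicates
    included = screened - excluded
    return {"screened": screened, "included": included}


# Counts as reported in the text.
flow = prisma_flow(identified=1494, duplicates=12, excluded=1472)
print(flow)
```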

Figure 1. Results of the search within the databases.
The characteristics of the final 10 studies are presented in Appendix 1 (Bibliography of Final Articles). The studies employed different review types, such as scoping reviews and systematic reviews, and were published between 2017 and 2023. Rule-based chatbots and machine learning-based chatbots were the most frequently used types of chatbots in these studies. Moreover, the studies addressed a broad range of disorders and topics, including depression, autism, anxiety, schizophrenia, posttraumatic stress disorder, substance use disorder, stress, mindfulness, and multiple other disorders. Additionally, the studies explored the potential of chatbots for promoting mental well-being, treating co-occurring depression and anxiety, and managing multiple health behaviors. Furthermore, the studies were conducted in diverse settings, including the United States, Japan, Australia, China, Germany, the United Kingdom, and multiple other countries. The number of articles reviewed by each of the included studies varied between 10 and 54, indicating heterogeneity in the scope of the studies (Figure 2).

Figure 2. Number of included studies within the final articles of this study.
Quality assessment of final articles
The quality assessment was performed using the CASP Systematic Review Checklist, which consists of 10 questions that evaluate the validity, results, and applicability of a review. 20 Each question was scored as yes (2 points), can't tell (1 point), or no (0 points), and the final score was calculated as the sum of the scores for each question. The data revealed that the majority of the 10 articles had the same final score of 9/16, which indicated a moderate level of validity, but also some limitations and gaps. The main areas where the articles scored poorly were the inclusion of all important and relevant studies, the assessment of the quality of the included studies, the consideration of all important outcomes, and the benefits, harms, and costs of the intervention. Moreover, the level of bias in the included studies was found to be reasonable. Further information regarding the quality assessment of the included papers can be found in Appendix 2 (quality assessment of final articles).
Thematic analysis
Table 2 presents the results of the thematic analysis of the data extracted from the final articles included in the study. The thematic analysis grouped the factors that influence user engagement with chatbots, conversational agents, or virtual health assistants for mental health or clinical psychology into two broad mega-themes: “chatbot factors” and “user factors.” The analysis identified four main themes: chatbot design, chatbot outcomes, user perceptions, and user characteristics. Each theme had several subthemes that synthesized the findings from the literature.
Table 2. Thematic analysis of the data extracted from the final articles.
Chatbot design
This theme comprised subthemes such as purpose, platform, response generation, dialog initiative, input and output modalities, embodiment, targeted disorders, personalization, interactivity, verbal and nonverbal behavior, anonymity, and intensity and duration. These subthemes described the characteristics and features of the chatbot that can affect user engagement. For instance, the purpose of the chatbot, whether it is for therapy, training, or screening, can influence engagement. The platform on which the chatbot is implemented, such as stand-alone software or a mobile app, can affect the engagement with the chatbots. Furthermore, the method used by the chatbot to generate responses, such as rule-based or machine learning-based, can influence the quality of the engagement.
Chatbot outcomes
This theme included subthemes such as effectiveness, measurement, support, security, privacy, and responsiveness. The effectiveness of the chatbot in improving mental health outcomes can influence engagement. Moreover, the use of appropriate outcome measurement instruments to evaluate the effectiveness of the chatbot can affect user engagement. Meanwhile, the availability of support for the chatbot app, including customer service and technical assistance, can affect user engagement.
User perceptions
This theme consisted of subthemes such as usefulness, ease of use, comparison, content, enjoyability, trustworthiness, attractiveness, acceptability, understandability, empathy, and affordability. The perceived usefulness of the chatbot in providing mental health support can influence engagement. Meanwhile, the ease of use of the chatbot can affect engagement. Furthermore, comparison with other forms of mental health support can affect users' overall engagement.
User characteristics
This theme comprised subthemes such as age, gender, and mental health condition. These subthemes described the demographic, personal, and health characteristics of the users, which can influence their engagement with the services delivered via chatbots.
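The hierarchy of mega-themes, themes, and subthemes reported in this section can be represented as a nested mapping, as in the Python sketch below. The subthemes are those named in the text, which introduces them with "such as", so the lists may not be exhaustive. The assignment of each theme to a mega-theme is inferred from the theme names and is an assumption, since Table 2 is not reproduced here. `find_theme` is an illustrative helper.

```python
# Mega-theme -> theme -> subthemes, using the subthemes named in the text.
# Assumption: the theme-to-mega-theme assignment is inferred from the names.
THEME_TREE = {
    "chatbot factors": {
        "chatbot design": [
            "purpose", "platform", "response generation", "dialog initiative",
            "input and output modalities", "embodiment", "targeted disorders",
            "personalization", "interactivity",
            "verbal and nonverbal behavior", "anonymity",
            "intensity and duration",
        ],
        "chatbot outcomes": [
            "effectiveness", "measurement", "support", "security",
            "privacy", "responsiveness",
        ],
    },
    "user factors": {
        "user perceptions": [
            "usefulness", "ease of use", "comparison", "content",
            "enjoyability", "trustworthiness", "attractiveness",
            "acceptability", "understandability", "empathy", "affordability",
        ],
        "user characteristics": ["age", "gender", "mental health condition"],
    },
}


def find_theme(subtheme):
    """Return (mega_theme, theme) for a subtheme, or None if it is not listed."""
    for mega_theme, themes in THEME_TREE.items():
        for theme, subthemes in themes.items():
            if subtheme in subthemes:
                return mega_theme, theme
    return None


# Count the subthemes listed under each theme.
subtheme_counts = {
    theme: len(subthemes)
    for themes in THEME_TREE.values()
    for theme, subthemes in themes.items()
}
print(find_theme("privacy"))
print(subtheme_counts)
```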
Discussion
The results indicated that user engagement with the mental health services delivered by chatbots is influenced by several categories of factors, namely chatbot design, chatbot outcomes, user perceptions, and user characteristics. This section discusses these categories and their implications for chatbot development and evaluation.
Chatbot design
The study identified several chatbot design factors, such as purpose, platform, response generation, dialog initiative, input and output modalities, embodiment, targeted disorders, personalization, interactivity, verbal and nonverbal behavior, anonymity, and intensity and duration. The findings imply that these factors influence how patients perceive and interact with chatbots in mental health services.
The literature reveals the impact of color on mood, anxiety, aggression, and depression, which may affect engagement with mental health chatbots. For instance, the color blue has been associated with higher levels of trustworthiness and mental alertness in relation to digital platforms. 34 Hence, investigating the optimal and effective colors for the design of mental health chatbots can be a crucial step toward enhancing not only the engagement with the chatbots in the design category, but also the attitude of the users toward the chatbots in the perception category.
The literature also indicates that incorporating music in mental health chatbots may be a strategy to enhance both the engagement of patients with the chatbots (the design category) and the efficient and effective treatment of some mental health disorders (the outcome category).35,36 In this regard, implementing music therapy can be a strategic choice for chatbot designers. Music therapy is described as a systematic intervention process in which clients are assisted to enhance their health, using musical experiences and the relationships that emerge from them as dynamic agents of change. 37
Chatbot outcomes
The study identified several chatbot outcome factors, such as effectiveness, measurement, support, security, privacy, and responsiveness, as influencing patient engagement with chatbots in mental health services.
Concerning the effectiveness of chatbots, it is said that chatbots should follow the principles of other Behavioral Intervention Technologies (BITs) by adopting an efficiency model of support, i.e. a conceptual framework that captures the interaction between information and intervention in an optimal way. This model suggests that decisions should be made based on the analysis of why people may not benefit from BITs. 38 Moreover, chatbots that provide rapid diagnoses may undermine diagnostic practice, which requires practical wisdom and collaboration among different specialists as well as close communication with patients. 39
Concerning the security and safety of chatbots, the interactions between patients and chatbots should be rigorously monitored, as inaccurate chatbot responses may lead to unintended harm. Particularly, as chatbots increase their conversational flexibility, there may be more potential errors associated with natural language understanding or response generation. Hence, using unbounded chatbots should be accompanied by careful supervision of patient and chatbot interactions, and of safety functions. 40
In summary, the evidence regarding the factors influencing patient engagement with chatbots in the outcome category seems insufficient for any definitive conclusion. Further research on this category is therefore needed to obtain a comprehensive view of these factors.
User perceptions
The study identified factors such as usefulness, ease of use, comparison, content, enjoyability, trustworthiness, attractiveness, acceptability, understandability, empathy, and affordability as influencing patients' engagement with mental health chatbots.
Attending to users' preferences and demands to enhance their perception of chatbot usage has the potential to realize the long-standing concept of patient-centeredness, and particularly person-centeredness and person-centered therapy, in clinical services. Person-centered therapy, developed by Carl Rogers, is a nondirective approach that emphasizes the client's potential for self-actualization and self-healing. The therapist's attitude, relationship, and empathy are the key factors in the therapeutic process. 39
It is noteworthy that various international organizations, such as the G-20 summit, a prominent international forum, have recognized the need for patient-centered or person-centered care. They have emphasized the importance of this approach as a strategic policy for developing and implementing healthcare plans. 41
In this context, healthcare chatbots can facilitate the implementation of person-centered or patient-centered care within healthcare service delivery networks, which has been shown to enhance not only the clinical criteria of healthcare services but also criteria related to user perception of healthcare services, such as user engagement.42–44 Therefore, healthcare chatbots can become the primary source of referral when users need healthcare services. This indicates the significance of taking into account the factors affecting patient engagement in the process of delivering healthcare services via chatbots.
User characteristics
The findings showed several demographic and health status factors such as age, gender, and mental health condition influencing patient engagement with mental health chatbots.
The World Health Organization and other organizations have advocated for a feminist global health agenda that aims to address gender inequalities and barriers in health care and to achieve gender equity and empowerment for women and girls in the provision of healthcare services. 45 For instance, considering that the female population in certain regions of the world, especially Muslims in the Middle East, may have distinct values and preferences, the development of chatbots that respect gender preferences could be a significant advancement of such a feminist agenda in health care systems. 46 This is because the gender of healthcare providers and medical staff and the discomfort and insecurity of being in mixed-gender settings are important aspects of person-centered care from the perspective of Muslim women, who, as adherents of Islam, generally avoid direct contact with the opposite gender due to their beliefs. 47
Furthermore, according to the evidence, chatbots are deemed an unsuitable intervention for health conditions with high perceived stigma and severity. As a result, healthcare customers may not use this technology as an independent intervention and should not rely on it as a substitute for trustworthy health information from a health professional. However, chatbots could be a useful adjunct for enhancing doctor–patient communication for conditions with lower perceived stigma and severity. 48 Future studies should determine the scope of these conditions and explore how chatbots could encourage more disclosure of sensitive information to health professionals, who could then provide more relevant healthcare services.
Limitations and implications of the study
The study had some limitations, such as the lack of consideration of the tradeoffs or conflicts between different categories of the factors influencing patient engagement with chatbots. The study also did not account for the variability or diversity of user perceptions across different groups, contexts, or cultures, and did not include other potential user characteristics that may affect user engagement with chatbots but were not found in the evidence retrieved by the search.
The study also had some implications, such as the need for chatbot developers to consider user preferences and needs when designing chatbots, to adopt a holistic and multidimensional approach to measuring and optimizing chatbot outcomes, to conduct user research and testing to understand and address user perceptions, and to provide chatbots that are accessible and inclusive for different users. Furthermore, the study reviewed the literature on the impact of color and music on mental health and suggested some strategies for incorporating these factors in chatbot design. Finally, based on the findings of the study, further original research is required to explore and describe unobserved domains of patient engagement with mental health chatbots.
Conclusions
This research investigated the factors affecting patient engagement with mental health chatbots, and found four categories of factors: chatbot design, chatbot outcomes, user perceptions, and user characteristics. It also examined the literature on the role of color and music in mental health and suggested some ways to integrate these factors in chatbot design. Furthermore, the research offered some practical and theoretical insights for chatbot developers and researchers. Moreover, it highlighted the potential of chatbots to facilitate person-centered or patient-centered care in mental health services and to improve the clinical and perceptual criteria of healthcare outcomes. Finally, further research is needed to obtain a comprehensive view of the factors influencing patient engagement with chatbots in mental health services, and there is a need for more collaboration and communication between chatbot developers, researchers, health professionals, and users.
Supplemental Material
Supplemental material, sj-docx-1-dhj-10.1177_20552076241247983 for Factors influencing patient engagement in mental health chatbots: A thematic analysis of findings from a systematic review of reviews by Mohsen Khosravi and Ghazaleh Azar in DIGITAL HEALTH
Supplemental Material
Supplemental material, sj-docx-2-dhj-10.1177_20552076241247983 for Factors influencing patient engagement in mental health chatbots: A thematic analysis of findings from a systematic review of reviews by Mohsen Khosravi and Ghazaleh Azar in DIGITAL HEALTH
Footnotes
Acknowledgments
We would like to acknowledge the Bing AI chatbot for its contribution to rewriting the manuscript in terms of English grammar and wording of the text.
Contributorship
Ghazaleh theorized the project and cooperated in writing the introduction and discussion sections of the manuscript. Mohsen performed the search within the databases, conducted the thematic analysis of the findings and wrote the text of the manuscript.
Consent statement
Not applicable, since the study is a review in methodology and no individual patients were included in the research.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical approval
Not applicable to this methodology of research.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Guarantor
The guarantor for the whole process of the manuscript is the corresponding author.
Supplemental material
Supplementary material for this article is available online.
References
