Abstract
Background
Individuals with prehypertension are at risk of developing hypertension, which affects many adults globally. Sustained physical activity (PA) can lower blood pressure, but maintaining long-term behavior change remains difficult. While PA habit formation interventions are promising, they face issues with scalability and accessibility. At the same time, behavior change chatbots have appeared, but their development often lacks systematic methods. Additionally, optimizing large language models (LLMs) to improve chatbot efficiency and reduce costs still needs more research.
Objective
This study introduces HabitBot, an LLM-integrated chatbot designed to foster PA habits in prehypertensive adults. HabitBot leverages LLMs for seamless interactions and integrates multidisciplinary insights, theoretical frameworks, and evidence to enhance the behavior change process.
Methods
HabitBot was developed through a systematic five-phase process: Phase 1, needs assessment via multidisciplinary discussions; Phase 2, literature review to identify relevant behavior change theories; Phase 3, selection of effective behavior change techniques (BCTs); Phase 4, intervention mapping for prototype design; and Phase 5, usability testing and focus group interviews for refinement.
Results
The process led to eight identified user needs and synthesized the Health Action Process Approach with Habit Formation Theory. Twelve effective BCTs were selected. The prototype was developed and refined across six dimensions based on user feedback. Evaluations indicated high usability, with a mean chatbot usability score of 3.84 (SD 0.82).
Conclusion
HabitBot integrates behavior change strategies with advanced LLM technology, representing a novel approach in chronic disease prevention. Future research should assess its long-term impact and generalizability.
Introduction
Hypertension remains one of the leading global risk factors for premature death and disability, contributing to an estimated 10.8 million preventable deaths and 235 million disability-adjusted life years annually. 1 Given that hypertension often requires lifelong medication once diagnosed, 2 there is a pressing need to implement early interventions targeting blood pressure regulation. Prehypertension, defined as blood pressure between 120–139/80–89 mmHg, affects approximately 25–50% of adults worldwide. 3 According to the 2024 Chinese Guidelines for the Prevention and Treatment of Hypertension, over 41.3% of Chinese adults exhibit prehypertensive levels, with 8–20% progressing to hypertension annually. 4 These data underscore the importance of early lifestyle interventions in the preclinical stage of hypertension. 5
Among lifestyle strategies, increasing physical activity (PA) has emerged as one of the most effective non-pharmacological approaches to lowering blood pressure. Meta-analyses show that regular PA can lower systolic blood pressure by about 4.4 mmHg and diastolic by 4.2 mmHg on average, 6 effects nearly comparable to certain antihypertensive medications. 7 However, long-term behavior change is notoriously difficult to achieve, as individuals frequently struggle to initiate PA and maintain it consistently over time.6,8 Thus, effective PA interventions must address not only behavior initiation but also maintenance over time.
Recent advances in behavioral science emphasize the importance of building automatic, context-driven routines, rather than relying solely on conscious motivation, to sustain long-term behavior change. 9 The dual-process model proposes that while reflective processes (e.g. planning, intention formation) are crucial for behavior initiation, automatic processes (e.g. cue–behavior associations) are essential for habit formation. 10 Theoretical frameworks such as the Health Action Process Approach (HAPA) and Habit Formation Theory offer structured explanations of how individuals transition from intention to sustained behavior.11–13 These models provide actionable guidance for designing interventions that foster PA habits in real-world contexts.
With the rise of mobile technologies and artificial intelligence (AI), scalable and adaptive digital health interventions have become increasingly feasible.14,15 In China, smartphone penetration now exceeds 73%, offering a strong foundation for digital solutions. 16 However, adults with prehypertension still lack consistent behavioral support after health screenings. 17 Most existing health apps prioritize data tracking over sustained lifestyle coaching, leading to a persistent disconnect between early risk detection and actual health behavior change. 18 Conversational agents and chatbots show promise in offering real-time guidance and contextual prompts to support health-related behavior change.19,20 However, few existing systems integrate evidence-based behavior change models and theories, especially for Chinese users and chronic disease prevention.
Recently, large language models (LLMs) have introduced new possibilities for enhancing the responsiveness and depth of behavior change chatbots. 21 While several LLM-based health bots have emerged, 22 most rely on single-turn interactions, limiting their ability to manage complex, multi-stage behavior change processes.23–25 Notably, GPTCoach—a chatbot designed to promote PA—demonstrates the potential of LLMs in this field. 26 However, it lacks integration with validated behavior change frameworks and focuses primarily on motivation rather than long-term habit formation. To address these gaps, the present study introduces “HabitBot,” an LLM-driven chatbot specifically designed to foster PA habits among adults with prehypertension. HabitBot is built on a user-centered, theory-driven, and evidence-informed design process, aiming to bridge the gap between conversational fluency and structured guidance for behavior change.
Methods
Overview
This study was conducted in Beijing, China, where adults with prehypertension are predominantly middle-aged working individuals with sedentary occupations and low levels of PA. 27 The region has high smartphone penetration and widespread use of mobile health applications, providing a favorable environment for deploying an AI-enabled intervention to promote PA. 28 A preliminary version of this work was presented as an abstract at Nursing Informatics 2024. 29 This study adhered to applicable ethical principles and was approved by Peking Union Medical College (Ethics Approval Number: PUMCSON-2024–23). All participants provided written informed consent before taking part. Additionally, the resulting product of this study, HabitBot, is currently undergoing a randomized controlled trial to assess its effectiveness in behavior change. This trial has been registered with the Chinese Clinical Trials Registry (Registration Number: ChiCTR2400085073). The development framework was adapted and revised from Song's research. 30 The chatbot was systematically developed through five phases (Figure 1). In Phase 1, the needs of prehypertensive individuals were identified through a multidisciplinary group discussion. In Phase 2, behavior change theories fitting the selection criteria, which were derived from the dual-process model, were identified through a literature review. In Phase 3, a systematic review and meta-regression determined which components are most effective in current PA habit interventions. In Phase 4, the content of the chatbot-based intervention was designed using intervention mapping (IM), and the prototype was developed by integrating findings from the previous three phases. In Phase 5, the chatbot intervention was iteratively refined based on user feedback gathered through usability testing and focus group interviews.

Overview of the chatbot systematic development process.
Phase 1: Needs assessment
A multidisciplinary team of eight key stakeholders was assembled, including two members of the target population (individuals with prehypertension) and six experts in psychology, informatics, cardiovascular disease, and public health at Peking Union Medical College. The panel discussion explored the lifestyle habits, challenges, and motivations unique to individuals with prehypertension. Participants (mean age 36.3 ± 4.2 years; 83% female) were recruited through the research team's professional network. This discussion was audio-recorded and transcribed verbatim by HY. We then performed a thematic analysis of the transcripts to identify key themes. 31 Two researchers (HY and HM) independently coded the transcripts, compared emerging codes, and refined the themes through iterative discussion. Any disagreements were resolved with the help of a third researcher (MP) to ensure the trustworthiness of the analysis.
Phase 2: Identifying applicable underlying theories
In Phase 2, our objective was to identify intervention theories that would most effectively underpin our efforts to promote and maintain PA using the dual-process model as a selection lens. Specifically, guided by the dual-process distinction between reflective and automatic determinants of behavior, we prioritized theories (or combinations of theories) that collectively address both PA initiation/maintenance through reflective self-regulation (e.g. intention formation, planning, self-efficacy) and long-term maintenance through automaticity and cue-dependent repetition (i.e. habit formation). To achieve a comprehensive understanding of the behavioral change theory landscape, we undertook an extensive literature review. Four databases (PubMed, Embase, PsycINFO, and CINAHL) were searched, with an emphasis on those relevant to PA maintenance or habit formation. The procedural details of our literature search and screening criteria for theories are documented in Supplementary Material 1.
Phase 3: Evidence identification
To consolidate evidence on PA interventions, we conducted a systematic review and meta-analysis of existing studies on PA habit formation interventions. 32 To apply the most effective behavior change techniques (BCTs; standardized, observable intervention components, such as goal setting, self-monitoring, and feedback) to the design of the intervention content while avoiding those with adverse effects, we extracted BCTs that were positively associated with intervention effects from our meta-regression. Subsequently, we applied the APEASE criteria (acceptability, practicability, effectiveness, affordability, side effects, and equity) to ensure that the selected BCTs were not only effective but also feasible, acceptable, and safe for delivery via a WeChat-based chatbot. 33 We also scrutinized BCTs that showed a statistically significant negative correlation with intervention outcomes. The key findings and the BCTs most relevant to promoting PA habits were extracted and used in the subsequent intervention design.
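The two-step selection described above (a positive association in the meta-regression, followed by an APEASE appraisal) can be sketched as a simple filter. This is an illustrative sketch only: the `Bct` type, the coefficient values, and the APEASE judgments are hypothetical placeholders, not results from the published meta-regression.

```python
from dataclasses import dataclass

# The six APEASE criteria, used here as dictionary keys.
APEASE = ("acceptability", "practicability", "effectiveness",
          "affordability", "side_effects", "equity")


@dataclass
class Bct:
    name: str
    coefficient: float   # hypothetical meta-regression coefficient
    significant: bool    # hypothetical significance flag
    apease: dict         # criterion -> bool (hypothetical team judgment)


def passes_apease(bct: Bct) -> bool:
    """A BCT passes only if every APEASE criterion is judged satisfactory."""
    return all(bct.apease.get(criterion, False) for criterion in APEASE)


def select_bcts(candidates):
    """Return (BCTs to include, BCTs to treat with caution)."""
    include = [b.name for b in candidates
               if b.coefficient > 0 and passes_apease(b)]
    caution = [b.name for b in candidates
               if b.coefficient < 0 and b.significant]
    return include, caution


# Placeholder data for illustration only.
all_ok = {criterion: True for criterion in APEASE}
candidates = [
    Bct("Action planning", 0.30, True, all_ok),
    Bct("Social reward", -0.25, True, all_ok),
    Bct("Hypothetical technique X", 0.10, False,
        {**all_ok, "practicability": False}),
]
include, caution = select_bcts(candidates)
```

In this sketch, a technique with a positive coefficient is still excluded if it fails any single APEASE criterion, which mirrors the intent of selecting BCTs that are both effective and deliverable through a chatbot.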
Phase 4: Prototype design and development
We applied IM, a structured framework for developing theory- and evidence-based health interventions, to integrate identified needs, theories, and evidence into the chatbot prototype. 34 Our initial step involved a thorough mapping of the specific needs of prehypertensive individuals, identifying key barriers to and facilitators of regular PA. For each identified need, we developed corresponding intervention strategies. This included choosing the delivery platform (e.g. a mobile app or social media), designing the chatbot's dialogue flow to ensure user engagement and continuity, and creating tailored dialogue content to provide personalized support and encouragement. To meet the user need for “accessibility and convenience” identified in the needs assessment (detailed in the Results section), we chose the WeChat Mini Program as the primary platform for the intervention. The WeChat Mini Program has a broad user base and convenient social features that make it easier to increase user engagement and retention. In addition, leveraging WeChat's existing infrastructure can significantly reduce development and maintenance costs.
Second, we extracted influential pathways for PA initiation and habit formation from the selected theories (e.g. self-efficacy and action planning from HAPA 13 ; cue–behavior association from habit formation theory 35 ). Based on these theoretical pathways, we developed the intervention model and its contents and strategies.
Third, guided by findings from our systematic review and meta-analysis, 32 we prioritized 12 BCTs with demonstrated efficacy in facilitating PA habit formation. We developed a BCT–script matrix to systematically map each selected BCT to its corresponding chatbot dialogue structure (see Supplementary Material 2).
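A BCT–script matrix of this kind can be represented as a lookup table from each BCT to a theoretical construct and a dialogue template. The entries below are hypothetical examples written for illustration; the actual matrix is the one provided in Supplementary Material 2, and the `render_prompt` helper is an assumed name.

```python
# Hypothetical excerpt of a BCT-script matrix (not the study's actual scripts).
BCT_SCRIPT_MATRIX = {
    "Action planning": {
        "construct": "action planning",
        "template": "When, where, and for how long will you {activity} this week?",
    },
    "Problem solving": {
        "construct": "coping planning",
        "template": "What might get in the way of your plan to {activity}, "
                    "and how could you work around it?",
    },
    "Self-monitoring of behavior": {
        "construct": "behavior repetition",
        "template": "How many minutes did you {activity} today?",
    },
}


def render_prompt(bct: str, **slots: str) -> str:
    """Fill the dialogue template mapped to a BCT with user-specific slots."""
    entry = BCT_SCRIPT_MATRIX[bct]
    return entry["template"].format(**slots)


example = render_prompt("Action planning", activity="walk")
```

Keeping the mapping in one table makes it straightforward to audit which BCT each chatbot message is meant to deliver.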
Fourth, we integrated all identified functionalities and content into a unified system architecture. This integration involved the construction of functional modules, interactive interfaces, and databases, ultimately leading to implementation within the HabitBot WeChat service account and WeChat Mini Program.
To support this chatbot-based digital intervention, we utilized the HUAWEI WATCH as our wearable device. This smartwatch was selected not only for its China Food and Drug Administration-certified cuff-based blood pressure measurement accuracy 36 but also for its vital role in supporting ongoing self-monitoring and personalized feedback within the intervention. Linking BP data with HabitBot enabled users to monitor short-term BP changes, increase risk awareness, and receive personalized behavioral messages. The wearable also provided vibration-based exercise prompts, implementing key BCTs such as self-monitoring, feedback, and contextual cues. Therefore, the wearable was a crucial part of the pilot system.
The development of the chatbot's conversational flow and operational logic was facilitated using the Botpress Cloud platform (https://botpress.com/). This platform was chosen for its intuitive development environment, minimal coding requirements, and its support for advanced language processing capabilities. Notably, Botpress Cloud incorporates the GPT-4o language model, 37 which equips our chatbot with the capability to understand and generate responses to user inputs in natural language, tailoring dialogue based on predefined prompts, thereby enhancing the chatbot's conversational quality and user interaction experience.
To protect privacy and confidentiality, HabitBot was designed with basic data-protection safeguards. Only data required for intervention delivery and evaluation (e.g. PA metrics, BP readings, and task completion logs) were collected. Identifiers were minimized in the research dataset, and access to the study database was restricted to authorized research personnel. The intervention was reported following the TIDieR checklist, with a completed checklist provided in the Supplementary Materials 3. In addition, we reported the AI-enabled development and early evaluation following the DECIDE-AI reporting guideline; a completed DECIDE-AI checklist is provided in the Supplementary Materials 4.
Phase 5: Refining the prototype and evaluation
Four online focus groups were conducted via Tencent Meeting (similar to Zoom) from October to December 2023 to gather user feedback and enhance the prototype. Participants were recruited through Xiaohongshu (a popular social media platform in China) using convenience sampling. An open invitation post was used, and those interested were screened for eligibility. Inclusion criteria included: (1) meeting prehypertension diagnostic criteria, (2) aged 18–60 years, (3) low baseline PA level, (4) owning a smartphone with WeChat installed, and (5) normal cognitive function. Participants who completed the focus groups were subsequently invited to participate voluntarily in the pilot usability testing. Those who agreed were enrolled in individual testing sessions. A total of 36 eligible participants joined the focus groups and usability testing sessions.
Three days before each formal interview, we released the chatbot and wearable device to users for preliminary testing. During this period, users were provided with a list of test tasks covering the main features of HabitBot and were encouraged to explore as many of these features as possible within the three days. Designed and moderated by HM, the testing and review sessions aimed to facilitate interaction and ensure a structured yet open dialogue among participants. These interviews followed a set of predefined questions and prompts aimed at eliciting detailed feedback on various aspects of the chatbot. The procedures and interview questions used in the focus group interviews are detailed in Supplementary Material 5.
At the conclusion of the interviews, all participants were asked to complete the Chatbot Usability Scale (BUS), which had been translated and adapted by our research team to assess the usability of HabitBot. 38 The BUS is a reliable and valid usability scale that provides a comprehensive view of a chatbot product's subjective usability at the end of a study. It comprises 11 items measuring five dimensions of chatbot usability: perceived accessibility to chatbot functions, perceived quality of chatbot functions, perceived quality of conversation and information provided, perceived privacy and security, and response time. The overall score is calculated by averaging all item scores (each rated on a 5-point Likert scale), with higher scores indicating higher usability. Each participant received a voucher worth RMB 50 upon completing the interviews and questionnaires.
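The scoring rule described above can be sketched as follows. The overall score is the mean of the 11 item ratings, as stated in the text; the item-to-dimension allocation in `DIMENSIONS` is an assumption made for illustration and should be checked against the scale's documentation, and the example ratings are invented.

```python
from statistics import mean

# Assumed allocation of the 11 items (1-based) to the five dimensions.
DIMENSIONS = {
    "accessibility": [1, 2],
    "quality_of_functions": [3, 4, 5],
    "quality_of_conversation": [6, 7, 8, 9],
    "privacy_security": [10],
    "response_time": [11],
}


def score_bus(responses):
    """Return (overall mean, per-dimension means) for one participant."""
    if len(responses) != 11 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("expected 11 ratings on a 1-5 Likert scale")
    overall = mean(responses)
    by_dimension = {
        name: mean(responses[i - 1] for i in items)
        for name, items in DIMENSIONS.items()
    }
    return overall, by_dimension


# Invented ratings for a single hypothetical participant.
ratings = [4, 5, 4, 4, 3, 4, 4, 5, 3, 3, 4]
overall, by_dimension = score_bus(ratings)
```

Sample-level means and standard deviations would then be computed across participants from these per-participant scores.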
The focus group participants were all in the prehypertension stage, with a mean systolic blood pressure of 127.44 mmHg (SD 6.04) and a mean diastolic blood pressure of 81.42 mmHg (SD 6.26). Among them, 58.33% (21/36) were male, and 41.67% (15/36) were female. Participants aged 18–25, 26–30, 31–40, 41–50, and 51–60 years accounted for 16.67% (6/36), 30.56% (11/36), 27.78% (10/36), 19.44% (7/36), and 5.56% (2/36), respectively. Additionally, 66.67% (24/36) of the participants had prior experience interacting with other chatbots.
The interviews were audio-recorded, transcribed (by HM), and verified (by HY). The first coder (HY) analyzed the transcripts using NVivo software and initially extracted themes with indexed codes. Subsequently, HY discussed the preliminary coding and thematic results with the second coder (HM). Then, HM independently coded 50% of the transcripts and provided feedback to refine the initial themes. The themes were further refined and revised through an iterative process to ensure they accurately reflected the core content of the discussions. Any disagreements were resolved through discussion with a third researcher (MP) until consensus was reached. Finally, the team reviewed each suggestion and implemented iterative refinements for those deemed feasible, appropriate, and aligned with the intervention goals.
Results
Phase 1: Needs assessment
Eight potential needs for people with prehypertension were proposed through a panel discussion, including lack of awareness and knowledge, motivational gaps, difficulty in habit formation, personalization needs, accessibility and convenience, accountability and monitoring, social support, and safety concerns. Table 1 includes these themes, further description, and direct quotes from the panel discussion.
PA intervention needs of individuals with prehypertension identified in the panel discussion.
PA: physical activity.
Phase 2: Theory identification
Following the Phase 2 literature review and screening (Supplementary Material 1), we selected HAPA and habit formation theory because, together, they cover reflective and automatic pathways highlighted by dual-process perspectives: HAPA effectively addresses the reflective, motivational and volitional stages of behavior change (e.g. risk perception, outcome expectancies, planning), whereas habit formation theory focuses on the development of automatic, context-triggered behaviors (e.g. repetition, cue–behavior association). This combined theoretical model covers both the initiation of PA and its maintenance through habit formation. A detailed comparison of these two theories, along with a summary of the screening of 11 candidate theories, is available in Supplementary Material 6. Figure 2 illustrates our integrated theoretical model, highlighting eight key constructs (self-efficacy, risk perception, outcome expectations, action planning, coping planning, behavior repetition, cue–behavior association, and affect) that underpin the HabitBot intervention.

The theoretical model of intervention.
Phase 3: Evidence synthesis
To consolidate evidence on PA habit formation interventions, we leveraged our previously published systematic review and meta-analysis, 32 and conducted an APEASE-based feasibility appraisal to translate the evidence into chatbot-deliverable components. Based on this evidence, we identified 12 BCTs as instrumental in facilitating PA habit formation according to the APEASE criteria (BCT evaluations are shown in Supplementary Material 7): Problem Solving, Action Planning, Habit Formation, Feedback on Behavior, Information about antecedents, Goal setting (behavior), Goal setting (outcome), Social support (unspecified), Instruction on how to perform behavior, Self-reward, Self-monitoring of behavior, and Behavioral practice / Rehearsal.
Notably, in our analysis, “social reward” showed a negative association with PA habit formation, suggesting that emphasizing external praise might undermine intrinsic motivation. 32 This finding offers valuable guidance for the design of interventions.
Phase 4: Integrating need, theory and evidence into the prototype design and development
Figure 3 presents the logic model of the HabitBot intervention, illustrating how user needs, theoretical constructs, and evidence-based BCTs were mapped onto the chatbot's design. By aligning every feature of HabitBot with identified user needs and proven principles of behavior change, the intervention was grounded in a comprehensive framework. We anticipate that this integrative design may improve users’ PA levels (short-term outcome) and, in turn, could contribute to better blood pressure control and quality of life (long-term outcomes). These potential benefits remain to be confirmed in future evaluations.

Logic model of the HabitBot.
In summary, Phase 1 identified eight user needs related to PA habit formation. Phase 2 yielded an integrated theoretical foundation combining HAPA and habit formation theory to address both conscious decision-making and automatic habit processes. Phase 3 provided empirical guidance, identifying the key BCTs (10 with positive influence and one with negative influence on habit formation) to include in the intervention. These elements were systematically translated into HabitBot's design (Table 2 details the mapping of user needs to theoretical constructs and BCT-based features).
Chatbot platform functionalities based on user needs, theoretical pathways, and evidence.
Behavior Change Techniques are divided according to the Behavior Change Techniques (BCTs) Taxonomy. (+): indicates that the BCT is positively correlated with the intervention effect. (−): indicates that the BCT is negatively correlated with the intervention effect.
PA: physical activity.
Specifically, HabitBot's implementation comprises four main components: (1) a WeChat-based user interface (delivering accessibility and convenience), (2) a chatbot backend powered by both rule-based flows and an LLM for personalized guidance, (3) a cloud server aggregating user data for monitoring and feedback, and (4) a BP-monitoring smartwatch providing safety and accountability through physiological tracking and reminders (Figure 4 illustrates the system architecture).

System architecture of HabitBot.
By leveraging WeChat's ubiquity, HabitBot minimized barriers to use (no separate app installation required) and turned the routine act of messaging into an intervention delivery mechanism (WeChat notifications served as stable cues for habit formation). Figure 5 shows a screenshot of the “My Coach” chat interface, and Figure 6 outlines the five main dialogue pathways users can engage with. Figure 7 provides an example chat sequence, and Figure 8 demonstrates how the rule-based and LLM-driven components interact during a personalized exercise planning dialogue. All the prompts we used are presented in Supplementary Material 8.

Screenshot from “My coach” of the HabitBot chatbot intervention.

Overall configuration of the “My coach” module.

Screenshot of example dialog with “My Coach.”

Example explanations for integration of rule-based and LLM components in HabitBot. LLM: large language model.
Phase 5: Prototype refinement and evaluation
A total of 36 users participated in the focus group evaluations of HabitBot. Qualitative analysis of the focus group discussions revealed six major themes regarding user experience and suggestions: (1) User Interface & Interaction (usability and visual appeal of the chatbot interface), (2) Content & Resource Accessibility (clarity and usefulness of information provided, links to resources), (3) Chatbot Features (desired functionalities, such as memory of past conversations), (4) Individualization & Adaptability (personalization of advice and coach persona options), (5) Privacy (comfort with data sharing and security), and (6) Support & Community (the need for social support or group features). We used this feedback to implement several key improvements in the prototype. For example, we simplified medical jargon in the chatbot's messages to improve clarity (Theme 2), added the option for users to choose between different coach “personalities” to increase engagement (Theme 4), and introduced more social support features (like encouraging messages and an option to share achievements with a friend) to address Theme 6. We also emphasized privacy protections in the user onboarding (e.g. informing users that their data is stored securely and not shared without consent) in response to Theme 5. A detailed list of user feedback and corresponding changes is provided in Supplementary Material 9.
The usability evaluation results (Table 3) show an overall HabitBot usability score of 3.84 (SD = 0.82) on the 5-point scale, suggesting that users generally found HabitBot effective and easy to use. Mean scores for all BUS items were above 3.0. Among the five dimensions assessed, the scores ranked from highest to lowest as follows: Perceived Accessibility to Chatbot Functions (Mean = 4.07, SD = 0.78), Perceived Quality of Chatbot Functions (Mean = 3.88, SD = 0.78), Perceived Quality of Conversation and Information Provided (Mean = 3.84, SD = 0.80), Response Time (Mean = 3.72, SD = 0.91), and Perceived Privacy and Security (Mean = 3.36, SD = 0.90).
Chatbot usability scale scores of each item (n = 36).
Discussion
Principal findings
This study introduces HabitBot, a novel LLM-integrated chatbot developed to promote PA habit formation among prehypertensive individuals through a systematic design process. To our knowledge, it is the first study to develop an AI chatbot-based PA behavior change intervention for prehypertensive individuals that integrates needs, theories, and evidence. Notably, it is also the first to embed LLM technology within a comprehensive theoretical and evidence-based framework, thereby addressing highly personalized user needs while maintaining a focus on behavior change principles.
Through the development process, we identified eight user needs, two behavior change theories, and 12 BCTs that support the formation of PA habits. These findings can be used to help prehypertensive individuals develop long-term PA behavior. Moreover, the systematic development process may offer a roadmap for other behavior change interventions seeking to leverage both theory and advanced AI-driven tools.
Interpretation of key findings
First, in alignment with the previous findings, the results from phase 1 highlighted several barriers and facilitators within long-term PA interventions. At the individual level, boosting motivation, raising awareness, self-monitoring, and habit formation are all essential. At the implementation level, it is essential to consider the personalization, accessibility, social support, and safety of the interventions. Additionally, our inclusion of multidisciplinary stakeholders in Phase 1 diverges from traditional user-centric assessments by incorporating the observational insights of professionals from diverse fields, thus providing a more nuanced understanding of user needs.30,39
Second, our theoretical intervention model, which integrates HAPA and Habit Formation Theory, emphasizes the need to address both reflective and reflexive pathways. In line with this, recent behavioral interventions have begun adopting dual-process strategies,40,41 recognizing that maintaining health behaviors requires both initial motivation and habit formation. 42 Moreover, many scholars now consider “habit discontinuity” moments—such as life-stage transitions or new diagnoses—as pivotal opportunities to disrupt established unhealthy routines and foster healthier alternatives. 43 By combining HAPA with Habit Formation Theory, HabitBot seeks to capitalize on the contextual upheaval brought about by a prehypertension diagnosis, thereby increasing the likelihood of instilling new, sustainable PA habits.
Third, akin to prior findings, 44 our research builds upon the foundational work of our team to identify 12 BCTs highly relevant to the formation of PA habits, such as problem solving, action planning, and feedback on behavior. Notably, in the same analysis, we observed a potential negative correlation between social reward and PA habit formation, which supports the results of a previous study by Cherubini et al. 45 Because this evidence suggested a potential risk of undermining intrinsic motivation, we revised the reward messaging delivered after task completion. Instead of generic praise for good performance, the messages now commend the effort itself and guide participants to recognize the intrinsic positive feelings that follow exercise. This comprehensive consideration of BCTs aims to maximize the utility of empirical evidence in promoting effective habit formation.
Fourth, we systematically translated needs, theories, and evidence into the intervention framework and interaction content, representing a more comprehensive and systematic approach to chatbot development than previously observed. 20 To meet the demands for convenience and accessibility, the intervention was developed on a WeChat Mini Program, in line with considerations similar to those of Chen et al. 14 This delivery mode spares users the burden of downloading an app, registering, and logging in. Moreover, as a daily chat application, WeChat could function as a stable cue, allowing users to retain habit-dependent cues even in changing environments and thus better avoid habit discontinuity. 46
Fifth, in Phase 5, we combined direct user testing of the product with focus group interviews, which yielded insights beyond what we had anticipated. Although the qualitative feedback did not reach data saturation, it offered indispensable perspectives for the chatbot's continuous refinement. Participants highlighted critical aspects such as the user experience, the enjoyment derived from interactions, and the intrinsic need for social connectivity. These aspects resonate with Cheng et al.'s observations, which emphasize the importance of user experience in technology adoption. 47 Further, our findings align with Zhang et al.'s emphasis on building relational capacity in chatbots as a crucial factor for sustaining user engagement. 48 This underscores the evolving role of digital tools in enhancing user engagement and satisfaction, particularly in therapeutic interventions. Notably, the notion that health benefits may predominantly influence the initial adoption of a behavior, while the maintenance of healthy behavior hinges on the individual's enjoyment and engagement, adds a nuanced layer to our understanding of user interaction with AI chatbots. 49 Recent advancements in AI and natural language processing have significantly lowered the barriers to developing chatbot interventions. However, as highlighted by previous works,50,51 the challenge remains in creating chatbots that are genuinely user-centered and theoretically informed.
Contributions of the large language model-integrated hybrid approach
Lastly, HabitBot adopts a hybrid approach that combines the strengths of rule-based and LLM-based chatbots. 52 Many existing applications rely solely on one type of system, either rule-based or generative.50,53 HabitBot's strategy both safeguards the effectiveness of the intervention and enhances the flexibility of interactions. This dual focus aligns with the findings of Maeng et al., who emphasize the importance of integrating both structured and dynamic elements in digital interventions to achieve optimal outcomes. 54 By integrating rule-based strategies, the application ensures that both the content and the process of the intervention are grounded in a robust theoretical and empirical habit formation framework. 52 For instance, the chatbot helps users locate nearby resources and offers personalized exercise advice before assisting them in creating an exercise plan.
Additionally, its intent-based features, powered by advancements in LLMs, enable the chatbot to understand user intents through natural language processing techniques. For example, when a user says, “I have to work overtime tonight and cannot complete the task,” HabitBot can recognize not only that the user has not completed today's PA task but also the implicit intent behind the statement, and it then asks whether the user needs help with problem solving. Meanwhile, the text generation capabilities of LLMs overcome the limitations of template-based chatbots by providing targeted, empathetic, and high-quality responses, significantly enhancing user experience and contributing to the progression of digital health interventions. 55 This is especially important because the difficulties each user faces in daily life are highly individualized (e.g. a busy schedule at work, family and life demands, or physical condition may limit opportunities for exercise). Furthermore, using LLMs, we can build a knowledge base with integrated health management guidelines at low cost, giving users access to scientific and personalized professional responses through natural language queries. 56 Through these enhancements, HabitBot demonstrates the potential of blending rule-based and AI-driven approaches to enrich digital health solutions.
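The hybrid pattern described above can be illustrated with a minimal Python sketch. It is not HabitBot's implementation: the intent labels, step names, and the `HybridRouter` class are hypothetical, and the two stub functions merely stand in for the LLM-backed intent recognition and text generation, whose actual models and prompts are not specified here. The sketch only shows the division of labor, in which a fixed rule table decides the next intervention step and a generative component words the reply.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical intent labels; HabitBot's real intent set is not given in the text.
TASK_DONE = "task_done"
TASK_MISSED = "task_missed"
OTHER = "other"


def keyword_intent_stub(message: str) -> str:
    """Stand-in for an LLM intent classifier (an assumption for illustration)."""
    text = message.lower()
    # Check "missed" phrases first so "cannot complete" is not read as completion.
    if any(p in text for p in ("cannot complete", "can't complete", "did not", "didn't")):
        return TASK_MISSED
    if any(p in text for p in ("finished", "completed", "done")):
        return TASK_DONE
    return OTHER


def template_generate_stub(step: str, message: str) -> str:
    """Stand-in for LLM text generation; a real system would prompt a model here."""
    templates = {
        "give_feedback": "Well done on completing today's activity.",
        "offer_problem_solving": (
            "It sounds like today's activity did not happen. "
            "Would you like to work through the problem together?"
        ),
        "free_chat": "Thanks for sharing. How can I help with your activity plan?",
    }
    return templates[step]


@dataclass
class HybridRouter:
    classify: Callable[[str], str]       # LLM-backed in practice, stubbed here
    generate: Callable[[str, str], str]  # (next step, user message) -> reply

    def respond(self, message: str) -> Tuple[str, str]:
        """Return (next_step, reply) for one user message."""
        intent = self.classify(message)
        # Rule-based layer: a fixed mapping keeps the intervention flow predictable.
        next_step = {
            TASK_DONE: "give_feedback",
            TASK_MISSED: "offer_problem_solving",
        }.get(intent, "free_chat")
        # Generative layer: only the wording of the reply is delegated.
        return next_step, self.generate(next_step, message)


router = HybridRouter(classify=keyword_intent_stub, generate=template_generate_stub)
step, reply = router.respond(
    "I have to work overtime tonight and cannot complete the task"
)
print(step)  # prints: offer_problem_solving
```

Keeping the step selection in a plain lookup table, rather than in the generated text, is one way to ensure that the theory-driven sequence of the intervention cannot be altered by the generative component.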
Limitations and future work
This study has several limitations. First, the intervention design process involved some subjective judgment. For example, the selection of theories in Phase 2 was influenced by the research team's expertise and interpretation of the literature, potentially biasing the framework by omitting other relevant theories. In future research, methods such as the Delphi consensus approach could be used to incorporate a broader range of expert opinions and reduce individual bias. Second, the Phase 1 needs assessment was based on a small sample of eight stakeholders. This limited sample size may not have captured all potential user needs, and the insights might not be fully comprehensive. Conducting additional focus groups or interviews with more participants, including end-users, would likely provide a deeper understanding of prehypertensive individuals’ needs and help achieve data saturation. Third, the study participants (both the experts involved in development and the users in evaluation) were all from central urban areas of Beijing, had relatively high educational levels, and were young-to-middle-aged working professionals. This homogeneity limits the generalizability of our findings, especially to rural populations, older adults, or those with lower digital literacy. Future research should test HabitBot in more diverse groups, especially in remote or underserved areas where accessible digital interventions could be particularly beneficial. Fourth, although basic privacy and security measures were integrated into HabitBot (such as secure data storage and limited access to personal information), participants’ ratings for perceived privacy/security were the lowest among the usability aspects. This shows that there is room to improve both the implementation of privacy features and how we communicate these protections to users. Enhancing data-protection measures (such as two-factor authentication and clearer privacy notices) and openly addressing privacy concerns will be essential as we refine the design.
Finally, a large-scale evaluation of HabitBot's effectiveness is essential. We have already started a randomized controlled trial (as mentioned earlier) to assess the chatbot's impact on PA behavior and clinical outcomes such as blood pressure, waist circumference, and body mass index. The results from this trial will offer more conclusive evidence of HabitBot's effectiveness in improving health metrics and will help identify any necessary adjustments before broader implementation. In the future, we also plan to explore long-term user engagement with HabitBot and whether the PA habits formed are sustained over time without ongoing chatbot support.
Conclusion
Ensuring that behavior change chatbots are developed rigorously and on a scientific basis, while integrating LLM technology to make conversations more fluid within a structured framework, is an important challenge for global public health. This study describes the rigorous development of HabitBot, a chatbot-based intervention designed to help prehypertensive users form PA habits. By combining a systematic needs assessment, theory selection, and empirical evidence, the development process provides a useful reference for future healthcare chatbot development. As AI and digital health technologies advance, solutions like HabitBot have the potential to become important components of public health strategies, providing scalable, personalized, and effective approaches for preventing and managing chronic diseases.
Supplemental Material
Supplemental material for Development and evaluation of an large language model-integrated chatbot intervention for physical activity habit formation in adults with prehypertension by Haoming Ma, Hongzhen Cui, Hongyu Yu, Runyuan Pei, Sijia Li, Aoqi Wang, Xingyi Tang, Guangnan Liu and Meihua Piao in DIGITAL HEALTH:
sj-docx-1-dhj-10.1177_20552076261421367
sj-docx-2-dhj-10.1177_20552076261421367
sj-pdf-3-dhj-10.1177_20552076261421367
sj-pdf-4-dhj-10.1177_20552076261421367
sj-doc-5-dhj-10.1177_20552076261421367
sj-docx-6-dhj-10.1177_20552076261421367
sj-xlsx-7-dhj-10.1177_20552076261421367
sj-docx-8-dhj-10.1177_20552076261421367
sj-docx-9-dhj-10.1177_20552076261421367
Footnotes
Acknowledgements
We thank the multidisciplinary team of experts who participated in this study, including Professor Xin Zhang from Peking Union Medical College, Associate Professor Zhenling Ma, Dr Xinghe Huang, Professor Yu Sheng, Manager Guannan Zhu from Taikang Life Insurance Company, and Engineer Yuanhang Zhang from Shijiazhuang Cloud Technology Co., LTD. We also thank every patient who participated in this study.
Abbreviations
Ethics approval and consent to participate
This study was approved by the Ethics Committee of the School of Nursing, Peking Union Medical College (ethical approval number: PUMCSON-2024-23). All participants provided informed consent.
Consent for publication
The authors consent to publication.
Authors’ contributions
All authors have read and approved the final work. Every author contributed significantly to the study. HM contributed to conceptualization, data curation, formal analysis, investigation, methodology, project administration, resources, software, supervision, writing of the original draft, and review and editing. HC contributed to software development and provided supervision throughout the study. HY contributed to conceptualization, investigation, and methodology. RP contributed to data curation and validation. SL contributed to validation and visualization. AW contributed to data curation and formal analysis. XT contributed to investigation and formal analysis. GL contributed to validation and methodology. MP led the project and was responsible for funding acquisition, supervision, and project administration.
Declaration of conflicting interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: This manuscript has not been published or presented elsewhere in full, but a preliminary abstract of this work was presented at Nursing Informatics 2024.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Peking Union Medical College 2023 Medical Education Scholar Program and the Non-Profit Central Research Institute Fund of Chinese Academy of Medical Sciences (grant numbers 2023zlgc0711, 2023-RC320-01).
Availability of data and material
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Supplemental material
Supplemental material for this article is available online.
References
