
Abstract

The integration of artificial intelligence (AI) in flipped classrooms is reshaping foreign language writing instruction, yet its impact on student engagement, self-efficacy, writing performance, and technology affinity remains underexplored. This study investigates the effectiveness of AI chatbots in enhancing flipped learning by examining their influence on engagement, self-efficacy, writing performance, and Affinity for Technology Interaction (ATI), while controlling for gender. A quasi-experimental design involving 252 Indonesian English majors was employed, with participants divided into an experimental group (flipped learning + AI chatbot) and a control group (flipped learning only). Structural Equation Modeling (SEM) and Analysis of Covariance (ANCOVA) revealed that AI-enhanced flipped learning significantly improved self-efficacy, writing performance and engagement, though ATI declined over time. Gender differences emerged, with males exhibiting stronger cognitive engagement and self-efficacy, while females benefited more in emotional engagement and writing performance. These findings underscore the need for balanced AI-human interaction to optimize learning outcomes. This study contributes to emerging pedagogical models by highlighting the nuanced role of AI in language education and emphasizing inclusive, adaptive instructional design. Future research should explore diverse learner populations, long-term effects, and AI instructor proficiency to maximize the efficacy of AI-integrated flipped learning.

Plain Language Summary

AI Chatbots in Flipped Classrooms: How They Affect Learning and Gender

This study looks at how using AI chatbots in flipped classrooms affects students learning English as a foreign language. Flipped classrooms involve students reviewing materials before class and doing activities in class. The study focused on how chatbots impact students’ confidence in their abilities (self-efficacy), how engaged they are, their writing skills, and how comfortable they are with technology. The study found that using AI chatbots in flipped classrooms improved students’ self-efficacy, engagement, and writing performance. However, it also found that students’ comfort with technology decreased over time. Interestingly, the study also found that gender plays a role in how students interact with these tools. Male students showed stronger confidence and engagement, while female students showed greater improvement in their writing skills. Overall, the study suggests that AI chatbots can be helpful in flipped classrooms, but it’s important to consider how they affect different students and to balance AI with human interaction. Future research should look at different groups of students and the long-term effects of using AI in education.

Keywords

gender, affinity for technology interaction, self-efficacy, engagement, AI chatbot, flipped learning, foreign language writing

Introduction

Integrating educational technology in higher education has transformed traditional language learning, offering innovative ways to engage learners and enhance their skills. However, language education, which relies on direct interaction and immersive methods, faces unique challenges in adapting to digital tools (Soegoto et al., 2025). One promising approach is flipped learning, in which students review writing-related materials before class and engage in hands-on activities such as drafting, editing, and peer review during class. This approach has been shown to promote deeper learning and student agency in the writing process (Bergmann & Aaron, 2012; Hung, 2015).

While flipped learning has proven beneficial in language education, research highlights that technology-based approaches further enhance language acquisition and efficiency (Ortikov & Ugli, 2024). The use of artificial intelligence (AI) in education is increasingly gaining global interest among researchers in the field of foreign language learning (Babanoğlu et al., 2025). AI chatbots, in particular, offer significant potential in flipped writing instruction by addressing common challenges such as the lack of guidance and support during pre-class learning, which affects student engagement (Diwanji et al., 2018; Jeon & Lee, 2024; Lo & Hew, 2023). By offering real-time assistance and personalized feedback, chatbots improve student preparedness and engagement in interactive in-class activities (Lo & Hew, 2022, 2023; Wiyaka et al., 2024). Additionally, they replicate aspects of face-to-face interaction by providing instant writing feedback on grammar, style, and organization, fostering learner autonomy and reducing writing anxiety (Belda-Medina & Calvo-Ferrer, 2022; Fryer & Bovee, 2016). Empirical findings further support this, as Duong and Chen (2025) found that AI chatbots significantly enhanced writing performance in terms of content, organization, vocabulary, language use, and mechanics.

Writing is a crucial aspect of learners’ development of English as a Foreign Language (EFL), serving as a fundamental productive skill (Duong & Chen, 2025). However, it is also the most challenging skill to master. Due to limited class time and large class sizes, many teachers find it difficult to offer individualized feedback, further complicating the learning process (Zhang, 2025). Moreover, effective writing development requires continuous practice, detailed feedback, and sustained motivation. Flipped classrooms offer more in-class support for writing development (Silitonga et al., 2024). However, while research on flipped learning in language education is well-established (Hung, 2015; Shi et al., 2020), studies on the combined impact of flipped instruction and AI chatbots on writing remain limited. Most focus either on flipped learning (Lee & Wallace, 2018) or AI-driven tools (Fryer & Bovee, 2016) in isolation. The integration of AI chatbots into flipped classrooms is an emerging and promising approach that warrants further exploration (Lo & Hew, 2023; Wollny et al., 2021).

While AI chatbots enhance learning, their design and implementation raise important equity concerns, particularly regarding gender bias. Previous studies indicate that many chatbots are designed with female identities, reinforcing gender stereotypes through avatars, language, and imagery, thereby perpetuating biases (Bastiansen et al., 2022). Research indicates that male and female learners interact with digital tools differently (Scherer & Siddiq, 2019; Venkatesh & Morris, 2000), making it essential to examine how gender influences technology affinity, engagement, self-efficacy, and writing outcomes in AI chatbot-supported flipped classrooms. Learner diversity, especially regarding gender and technology affinity, shapes how students experience and benefit from these innovations. In flipped language classrooms, content is delivered online, allowing in-class time to focus on interactive tasks such as peer review and targeted feedback (Bergmann & Aaron, 2012; Silitonga et al., 2024). However, learners’ self-efficacy, a core concept in Bandura's (1977) social cognitive theory, significantly influences students’ willingness to adopt new tools and persist through challenges. In writing instruction, high self-efficacy correlates with improved writing performance (Woodrow, 2011). Furthermore, engagement, spanning behavioral, emotional, and cognitive dimensions, is a key driver of academic success (Reeve & Tseng, 2011). Research suggests that flipped learning, especially when supported by AI-driven scaffolding, enhances engagement by promoting active content interaction (Abeysekera & Dawson, 2015).

However, students’ Affinity for Technology Interaction (ATI) shapes their responses to AI tools, with high-ATI learners adapting more easily, while low-ATI learners may struggle (Franke et al., 2019; Scherer & Siddiq, 2019). Gender further moderates these effects, influencing engagement, self-efficacy, and writing proficiency in AI-supported flipped classrooms. Given the gendered nature of many AI chatbots (Bastiansen et al., 2022), it is crucial to examine how these biases affect student experiences and learning outcomes. Ensuring equitable and inclusive instructional design is essential for maximizing the benefits of AI-integrated flipped learning while minimizing unintended disparities in student engagement and performance.

Despite extensive research on flipped classrooms and AI-driven feedback tools, few studies have examined their combined impact on foreign language writing instruction. Additionally, the moderating roles of gender and Affinity for Technology Interaction (ATI) in this learning model remain largely unexplored. This study addresses these gaps by investigating whether integrating AI chatbots into flipped instruction enhances writing performance, self-efficacy, engagement, and technology affinity in foreign language classrooms. By considering gender as a control variable, this research provides new insights into how learner characteristics interact with emerging pedagogical models, contributing to more adaptive and inclusive instructional designs.

To achieve these objectives, the study explores the following research questions (RQs):

RQ1: Does an AI chatbot-based flipped learning approach have a greater impact on students’ self-efficacy, engagement, ATI, and writing performance in foreign language education than a conventional flipped classroom?

RQ2: How does Affinity for Technology Interaction (ATI) influence students’ self-efficacy, emotional engagement, cognitive engagement, behavioral engagement, and writing performance in foreign language education?

RQ3: Does Affinity for Technology Interaction (ATI) predict self-efficacy, emotional engagement, cognitive engagement, behavioral engagement, and writing performance, with robust findings when controlling for gender?

Literature Review

AI Chatbot-Based Flipped Approach in Writing Classroom

Flipped classrooms move lectures and resources outside of class and focus on collaborative work in class (Bergmann & Aaron, 2012). This technique is especially beneficial for language acquisition since face-to-face interaction is essential for communicative skills (Hung, 2015). By engaging more actively and communicatively in class, students often improve their motivation, critical thinking, and performance (Shi et al., 2020). Successful implementation requires appropriate technical resources and student accountability; students must prepare before class for maximum in-class involvement (Lo & Hew, 2017). Flipped classrooms boost drafting, peer review, and instructor feedback in writing-focused instruction, improving writing quality over time (Silitonga et al., 2024).

An AI-powered chatbot is a software application that mimics human conversation using natural language processing, allowing it to comprehend and respond to user inquiries in a way that resembles human interaction (Lo & Hew, 2023). AI chatbots provide real-time grammar, vocabulary, and content organization feedback (Belda-Medina & Calvo-Ferrer, 2022). Their immediacy supports timely feedback, which is essential to writing development. Chatbots simulate authentic communication, lowering learners’ concern over mistakes and promoting autonomy and participation (X. Jin et al., 2024). Chatbots can reinforce lessons and provide support outside of class when used with a flipped model (Fryer & Bovee, 2016). However, chatbot accuracy and pedagogical relevance should be considered because poorly built or malfunctioning systems may hinder learning (Zhai, 2023). Ethical concerns around privacy and data ownership emphasize the necessity for institutional rules.

This study utilizes ChatGPT, an AI-powered chatbot by OpenAI, which generates authentic text based on user prompts (Boudouaia et al., 2024). Furthermore, the study incorporates ChatGPT into flipped writing classroom activities, highlighting its impact on writing instruction. Students valued ChatGPT’s speed and high-quality feedback (Duong & Chen, 2025). Similarly, Su et al. (2023) found ChatGPT effective in improving language, content, and structure in argumentative writing. Building on these findings, Tseng and Lin (2024) analyzed students’ writing and reflections, demonstrating that ChatGPT not only improved writing efficiency and cohesion but also served as an alternative to peer reviewers by providing critical, objective feedback.

Self-Efficacy and Engagement

Based on Bandura's (1977) social cognitive theory, self-efficacy is confidence in one’s ability to execute tasks. Numerous empirical studies have demonstrated that self-efficacy is a key factor in language learning (Bai & Wang, 2023; Graham, 2022). Technological tools can boost self-efficacy by providing scaffolding and timely feedback, while complex or unresponsive systems may hinder learners.

Modern research divides engagement into behavioral (participation, persistence), emotional (interest, enthusiasm), and cognitive (deep processing, self-regulation) components (Reeve & Tseng, 2011). Interactive activities in flipped classrooms boost behavioral engagement, whereas emotionally supportive environments improve learning (Abeysekera & Dawson, 2015). Writing tasks are more cognitively engaging when students reflect on drafts, respond to comments, and solve problems using well-designed digital platforms.

Gender and Affinity for Technology Interaction (ATI)

ATI encompasses an individual’s inclination to seek out or avoid digital interaction (Franke et al., 2019). Learners with high technology affinity embrace new digital tools swiftly, perceiving technological complexity as a manageable challenge. Conversely, low-affinity learners may exhibit technology-related apprehension, particularly in tasks requiring sustained usage (Scherer & Teo, 2019).

Gender often intersects with technology affinity, shaping how students adopt and utilize digital tools (Venkatesh & Morris, 2000). Male learners may report higher confidence in exploring new technologies, while female learners often place stronger emphasis on perceived usefulness and support (Scherer & Siddiq, 2019). Additionally, F. Jin and Divitini (2020) stated that technology is commonly regarded as a male-dominated field. Previous research also indicates that gender influences chatbot usage and academic outcomes, emphasizing the need for inclusive AI-based interventions (Deng & Lin, 2022). Understanding these dynamics is crucial for designing effective, gender-responsive technology-enhanced pedagogies (Wang et al., 2023).

Methodology

This study employs a quantitative approach to examine the impact of flipped classrooms and AI chatbots on students’ writing performance, self-efficacy, engagement, and affinity for technology interaction (ATI) using structural equation modeling (SEM) within a quasi-experimental design. Figure 1 illustrates the framework comparing experimental and control groups through pre- and post-tests.

Figure 1.

Comparative study design.

SEM was used to analyze structural relationships among variables, treating ATI as the independent variable, while self-efficacy, emotional, cognitive, and behavioral engagement, and writing performance were dependent variables (Figure 2). A robustness test further validated these relationships, with gender controlled as a variable to account for its potential influence.
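
Read as a set of regressions, the structural part of this model can be sketched as below. The notation is illustrative only: the symbols $\beta_k$, $\gamma_k$, and $\zeta_k$ are assumptions introduced here, and the measurement model and any residual covariances are not specified in this section.

```latex
% Hypothetical notation: one structural equation per dependent variable k,
% where k ranges over self-efficacy (SE), emotional engagement (EE),
% cognitive engagement (CE), behavioral engagement (BE), and writing performance (WP).
\begin{align*}
  \text{Main model:}      \quad & \eta_k = \beta_k\,\mathrm{ATI} + \zeta_k \\
  \text{Robustness test:} \quad & \eta_k = \beta_k\,\mathrm{ATI} + \gamma_k\,\mathrm{Gender} + \zeta_k
\end{align*}
% \beta_k : path coefficient from ATI to outcome k
% \gamma_k: effect of the gender control on outcome k
% \zeta_k : residual (disturbance) term for outcome k
```

Under this reading, the ATI paths would count as robust if each $\beta_k$ keeps its sign and significance once the gender term is added.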

Figure 2.

Design of the structural equation model of this study.

Participants

This study took place at Indonesian universities with 252 second- and third-year English majors (19–21 years old) and was approved by the Ethics Committee of the university. Before data collection, all participants were informed about the study’s purpose, procedures, potential risks, and their right to withdraw at any time, and gave informed verbal consent. They were then randomly assigned to an experimental group (n = 126; 54 males, 72 females) or a control group (n = 126; 62 males, 64 females). The experimental group received both flipped instruction and AI chatbot interventions, while the control group experienced flipped instruction alone. A single instructor with over 15 years of experience taught both groups, collaborating with the researchers beforehand to select suitable materials and chatbots to ensure consistency.
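
The article does not describe how the random assignment was carried out. As a minimal sketch, an equal split of 252 participants into two groups of 126 could look like the following; the participant IDs, the seed, and the function name `assign_groups` are hypothetical and chosen only for illustration.

```python
import random


def assign_groups(participant_ids, seed=None):
    """Randomly split participants into two equal-sized groups.

    Illustrative only: the study does not report its randomization procedure.
    """
    ids = list(participant_ids)
    if len(ids) % 2 != 0:
        raise ValueError("an even number of participants is needed for equal groups")
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"experimental": ids[:half], "control": ids[half:]}


# Hypothetical IDs 1..252 standing in for the 252 English majors.
groups = assign_groups(range(1, 253), seed=42)
print(len(groups["experimental"]), len(groups["control"]))  # prints: 126 126
```

A simple shuffle like this balances group sizes but not gender, which is consistent with the slightly different male/female counts reported for the two groups.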

Experimental Procedure

As shown in Figure 3, participants completed a pretest questionnaire assessing ATI, self-efficacy, and engagement. To ensure consistency, English teachers attended preparatory meetings. The experiment consisted of nine 100-minute sessions; in the first, the teacher introduced essay writing and the flipped classroom, with the experimental group receiving additional AI chatbot training.

Figure 3.

Experimental procedure.

In sessions 2 to 7, the control group followed the flipped approach, while the experimental group integrated flipped and chatbot-assisted writing. For flipped activities, the students were given PowerPoint slides or a video on English grammar and writing lessons before class. In session 8, all students wrote a 350-word essay, followed by peer feedback using a rubric (Table 1) on key writing elements. The rubric’s content was validated by consulting three experts in writing assessment to ensure its criteria accurately reflect writing proficiency and by aligning it with relevant course objectives and national writing standards.

Table 1.

Students’ Rubric Peer Feedback.

Category	4 Excellent to very good	3 Good to average	2 Fair to poor	1 Very poor
Content	Knowledgeable; substantive; relevant to assigned topic	Some knowledge of subject; fairly substantive; mostly relevant to the topic, but lacks detail	Limited knowledge of subject; little substance; moderately relevant to topic	Does not show knowledge of subject; non-substantive; less relevant
Organization	Well-organized; logical sequencing	Organized; logical but incomplete sequencing	Ideas confused; lacks logical sequencing and development	No organization; not enough to evaluate
Vocabulary	Uses more than 20 academic words	Uses 11–20 academic words	Uses 1–10 academic words	Uses 0 academic words
Language use	Few errors of tense, number, word order/function, articles, pronouns, prepositions (1–10 errors)	Several errors of tense, number, word order/function, articles, pronouns, and prepositions (11–20 errors)	Frequent errors of tense, number, word order/function, articles, pronouns, and prepositions (21–30 errors)	Dominated by errors of tense; does not communicate; not enough to evaluate (more than 30 errors)
Mechanics	Few errors of spelling, punctuation, capitalization, paragraphing (1–10 errors)	Occasional errors of spelling, punctuation, capitalization, paragraphing but meaning not obscured (11–20 errors)	Frequent errors of spelling, punctuation, capitalization, paragraphing; poor handwriting; meaning confused or obscured (21–30 errors)	Dominated by errors of spelling, punctuation, capitalization, paragraphing; handwriting illegible; not enough to evaluate (more than 30 errors)

Table 2 provides details of the teaching activities and materials used in both groups. During peer feedback, students exchanged drafts and used a provided rubric to review and revise their peers’ work. The final session featured an advanced writing task, reinforcing previously developed skills.
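
The count-based rows of the Table 1 rubric (Vocabulary, Language use, Mechanics) amount to simple threshold rules, which can be sketched as below. This is an illustration, not the authors' scoring procedure: the function names are hypothetical, an essay with zero errors is assumed to fall in the top band (the rubric starts at one error), and exactly 20 errors is assumed to belong to the 11–20 band.

```python
def band_from_error_count(errors: int) -> int:
    """Map an error count to a 1-4 band (Language use and Mechanics rows)."""
    if errors < 0:
        raise ValueError("error count cannot be negative")
    if errors <= 10:   # 1-10 errors; 0 errors is assumed to belong here too
        return 4
    if errors <= 20:   # 11-20 errors
        return 3
    if errors <= 30:   # 21-30 errors
        return 2
    return 1           # more than 30 errors


def band_from_academic_words(words: int) -> int:
    """Map the number of academic words used to a 1-4 band (Vocabulary row)."""
    if words < 0:
        raise ValueError("word count cannot be negative")
    if words > 20:     # more than 20 academic words
        return 4
    if words >= 11:    # 11-20 academic words
        return 3
    if words >= 1:     # 1-10 academic words
        return 2
    return 1           # no academic words


# Hypothetical essay: 14 academic words, 8 language-use errors, 23 mechanics errors.
print(band_from_academic_words(14), band_from_error_count(8), band_from_error_count(23))
# prints: 3 4 2
```

The Content and Organization rows depend on reader judgment rather than counts, so they are left out of this sketch.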

Table 2.

Instructional Activities and Course Materials for the Control and Experimental Groups.

Class	Control group (flipped learning)	Experimental group (flipped + chatbot)
1st	Before Class: PowerPoint presentation slides on Introduction to Academic Writing. Provides an overview of academic writing fundamentals. Sets expectations for the course.	Before Class: PowerPoint presentation slides on Introduction to Academic Writing with chatbot-based pre-class quiz. The chatbot-based quiz helps students self-assess their prior knowledge and identify areas for improvement before the class.
2nd – Writing coursework and practice	Before Class: PowerPoint slides on Academic Word List and Peer Review Rubric • Teacher reviews and corrects students’ assignments from previous lessons. • Peer review: Students form groups of five, exchange, and critique each other’s work. • Students revise drafts based on peer feedback. Justification: Encourages collaboration and self-reflection on writing quality.	Before Class: PowerPoint slides on Academic Word List and Peer Review Rubric with chatbot-assisted vocabulary exercises • Teacher reviews and corrects students’ assignments from previous lessons. • Peer review: Students form groups of five, exchange, and critique each other’s work, but first receive chatbot-generated feedback on vocabulary use. • Students revise drafts incorporating chatbot feedback before peer review. Justification: Chatbots provide preliminary feedback on word choice and clarity, allowing students to refine their work before peer interaction.
3rd – Writing coursework and practice	Before Class: PowerPoint slides on English Tenses • Teacher reviews and corrects students’ assignments. • Peer review in groups of five. • Revision based on feedback. Justification: Mastery of tenses is crucial for coherence and accuracy in academic writing.	Before Class: PowerPoint slides on English Tenses with chatbot-based grammar exercises • Teacher reviews and corrects students’ assignments. • Peer review in groups of five, preceded by chatbot grammar correction. • Students revise drafts based on chatbot and peer feedback. Justification: Chatbots reinforce grammar rules, reducing teacher workload in addressing common errors.
4th – Writing coursework and practice	Before Class: PowerPoint slides on English Articles and Conjunctions • Teacher reviews and corrects students’ assignments. • Peer review in groups of five. • Revision based on feedback. Justification: Articles and conjunctions contribute to fluency and logical connections in writing.	Before Class: PowerPoint slides on English Articles and Conjunctions with chatbot-based sentence completion tasks • Teacher reviews and corrects students’ assignments. • Peer review in groups of five, preceded by chatbot grammar correction. • Students revise drafts based on chatbot and peer feedback. Justification: Chatbots highlight common mistakes in article usage and conjunctions before students engage in peer review.
5th – Writing coursework and practice	Before Class: PowerPoint slides on Sequential Words • Teacher reviews and corrects students’ assignments. • Peer review in groups of five. • Revision based on feedback. Justification: Sequential words improve logical flow and essay coherence.	Before Class: PowerPoint slides on Sequential Words with chatbot-guided sentence-ordering exercises • Teacher reviews and corrects students’ assignments. • Peer review in groups of five, preceded by chatbot-generated coherence analysis. • Students revise drafts, incorporating chatbot and peer feedback. Justification: Chatbots analyze logical sequencing and provide automated feedback before peer review.
6th – Writing coursework and practice	Before Class: PowerPoint slides on Writing Paragraphs • Teacher reviews and corrects students’ assignments. • Peer review in groups of five. • Revision based on feedback. Justification: Developing well-structured paragraphs is essential for essay writing.	Before Class: PowerPoint slides on Writing Paragraphs with chatbot-assisted paragraph structuring tasks • Teacher reviews and corrects students’ assignments. • Peer review in groups of five, preceded by chatbot structure suggestions. • Students revise drafts, incorporating chatbot and peer feedback. Justification: Chatbots provide feedback on paragraph organization, allowing students to refine coherence before peer review.
7th – Writing coursework and practice	Before Class: Video tutorial on How to Write a Good Essay • Teacher reviews and corrects students’ assignments. • Peer review in groups of five. • Revision based on feedback. Justification: The final lesson focuses on synthesizing all previously learned skills into a full essay.	Before Class: Video tutorial on How to Write a Good Essay with chatbot-based pre-class discussion • Teacher reviews and corrects students’ assignments. • Peer review in groups of five, preceded by chatbot analysis of argument strength. • Students revise drafts, incorporating chatbot and peer feedback. Justification: Chatbots help refine argument structure and coherence before peer review, making revisions more effective.
8th – Writing test	Writing Task: The Influence of Generation Z on the Sustainability of the Environment • Students complete an in-class writing task based on previous lessons. Justification: Evaluates students’ ability to apply writing skills in an academic context.	Writing Task: The Influence of Generation Z on the Sustainability of the Environment with chatbot-supported prewriting phase • Students complete an in-class writing task but first engage with chatbot-based brainstorming and outlining tools. Justification: Chatbots guide students in organizing their ideas and structuring arguments before beginning their writing task.

After the intervention, students completed a post-test on ATI, self-efficacy, and engagement. Figure 4 illustrates the students in the experimental group as they engaged with the AI chatbot. This visual representation demonstrates the practical use of the chatbot in the classroom, showing how students interacted with the technology to enhance their learning experience. The integration of the AI chatbot not only supported the students’ writing development but also encouraged greater engagement with the course content through innovative and interactive methods.

Figure 4.

Students’ writing activities in the classroom.

Experimental Activities Involving an AI Chatbot

Before the experiment, researchers and the teacher collaborated to design teaching strategies and flipped learning activities that seamlessly integrated the AI chatbot into English writing instruction. This collaboration ensured alignment with course objectives and maximized AI’s benefits in the classroom. The teacher introduced the chatbot, explaining its features and role in the learning process, and helping students understand its function in writing development.

Throughout the course, students regularly engaged with the AI chatbot in learning activities, familiarizing themselves with its use as a problem-solving tool. Consistent interaction enabled students to leverage AI to refine their writing, receive real-time feedback, and improve their skills through practical exercises.

Writing assignments were designed to reinforce AI chatbot use, allowing students to apply feedback and progressively enhance their writing. This continuous engagement deepened their understanding of AI’s role in writing instruction and boosted their confidence in using technology for learning. Table 3 outlines the specific teaching activities implemented in the experimental group, ensuring structured and effective AI integration.

Table 3.

Experimental Group Activities.

Teaching activity	Learning activity
1. Activating prior knowledge and engagement: • The teacher captures students’ attention and assesses their prior knowledge of the topic through interactive questioning. • The teacher introduces AI chatbots and explains their role in brainstorming, idea refinement, and structuring writing. • The teacher demonstrates a chatbot-based brainstorming session to show how it can be used to develop essay topics. Justification: This stage activates prior knowledge while introducing AI as a collaborative brainstorming tool to help students organize ideas effectively.	1. Exploration and organization: • Students brainstorm their understanding of the writing topic and categorize key ideas. • Using AI chatbots, students generate a list of main ideas and supporting details relevant to the assigned topic. • Students interact with AI chatbots to explore the subject matter, refining their ideas based on the chatbot’s feedback.
2. Exploration and conceptual understanding: • The teacher demonstrates how to use AI chatbots to refine writing ideas and structure content logically. • The teacher encourages students to explore relevant academic vocabulary, grammar structures, and essay components using chatbots. • The teacher provides prompts and asks students to refine their responses with AI chatbot suggestions. Justification: AI chatbots act as real-time writing assistants, providing varied sentence structures and content organization techniques.	2. Analysis and application: • Students analyze AI-generated responses to identify patterns, trends, and consistencies in writing styles. • Students compare AI chatbot-generated outlines with their own ideas to enhance logical sequencing. • Students test different sentence structures and word choices suggested by chatbots and evaluate their effectiveness.
3. Application and interactive learning: • Students interact with AI chatbots to generate alternative phrasing for their sentences, improving clarity and coherence. • The teacher facilitates discussions on how AI-generated sentences differ in tone, clarity, and structure. • The teacher encourages students to experiment with chatbot feedback, assessing which modifications enhance their writing quality. Justification: Chatbots serve as adaptive writing assistants, helping students refine sentence structure while preserving their unique writing style.	3. Writing and enhancement: • After drafting an initial version of their essay, students review and paraphrase paragraphs using AI chatbot suggestions. • Students integrate AI-generated improvements while maintaining originality and coherence in their writing. • Students demonstrate creativity by incorporating additional sentences that blend their original thoughts with chatbot suggestions.
4. Critical thinking and evaluation: • The teacher guides students in critically analyzing AI-generated responses for accuracy, coherence, and creativity. • The teacher prompts students to assess AI chatbot suggestions, identifying errors or areas for improvement. • The teacher emphasizes the importance of human judgment when using AI tools, guiding students in identifying over-reliance on automated suggestions. Justification: AI chatbots help students evaluate and refine their writing, but peer review ensures they apply human creativity and critical thinking to their final drafts.	4. Peer review and revision: • Using a teacher-provided rubric, students evaluate their peers’ drafts for content, organization, vocabulary, language use, and mechanics. • Students compare chatbot-generated feedback with their own observations, ensuring a balanced approach to revision. • Students discuss strengths and weaknesses in chatbot-generated revisions, reinforcing critical thinking and self-editing skills.
5. Feedback and reflection: • The teacher assists students in using a rubric to evaluate and provide constructive feedback on their own and peers’ work. • The teacher assesses students’ ability to apply problem-solving strategies in refining their writing with AI chatbot support. • The teacher prompts students to compare their first draft and final version, discussing improvements made using AI chatbot suggestions. Justification: Encourages students to take ownership of their writing improvements while balancing AI assistance with human creativity and critical thinking.	5. Self-assessment and final revision: • Students reflect on AI chatbot feedback, peer suggestions, and teacher guidance to finalize their essays. • Students document their writing improvements, noting how AI chatbots helped them refine sentence structure, coherence, and vocabulary. • Students identify which AI-generated changes were helpful and which required human revision, developing better judgment in AI-assisted writing.

Teaching activity

Learning activity

1. Activating prior knowledge and engagement
• The teacher captures students’ attention and assesses their prior knowledge of the topic through interactive questioning.
• The teacher introduces AI chatbots and explains their role in brainstorming, idea refinement, and structuring writing.
• The teacher demonstrates a chatbot-based brainstorming session to show how it can be used to develop essay topics.
Justification: This stage activates prior knowledge while introducing AI as a collaborative brainstorming tool to help students organize ideas effectively.

2. Exploration and conceptual understanding
• The teacher demonstrates how to use AI chatbots to refine writing ideas and structure content logically.
• The teacher encourages students to explore relevant academic vocabulary, grammar structures, and essay components using chatbots.
• The teacher provides prompts and asks students to refine their responses with AI chatbot suggestions.
Justification: AI chatbots act as real-time writing assistants, providing varied sentence structures and content organization techniques.

3. Application and interactive learning
• Students interact with AI chatbots to generate alternative phrasing for their sentences, improving clarity and coherence.
• The teacher facilitates discussions on how AI-generated sentences differ in tone, clarity, and structure.
• The teacher encourages students to experiment with chatbot feedback, assessing which modifications enhance their writing quality.
Justification: Chatbots serve as adaptive writing assistants, helping students refine sentence structure while preserving their unique writing style.

4. Critical thinking and evaluation
• The teacher guides students in critically analyzing AI-generated responses for accuracy, coherence, and creativity.
• The teacher prompts students to assess AI chatbot suggestions, identifying errors or areas for improvement.
• The teacher emphasizes the importance of human judgment when using AI tools, guiding students in identifying over-reliance on automated suggestions.
Justification: AI chatbots help students evaluate and refine their writing, but peer review ensures they apply human creativity and critical thinking to their final drafts.

5. Feedback and reflection
• The teacher assists students in using a rubric to evaluate and provide constructive feedback on their own and peers’ work.
• The teacher assesses students’ ability to apply problem-solving strategies in refining their writing with AI chatbot support.
• The teacher prompts students to compare their first draft and final version, discussing improvements made using AI chatbot suggestions.
Justification: This stage encourages students to take ownership of their writing improvements while balancing AI assistance with human creativity and critical thinking.

1. Exploration and organization
• Students brainstorm their understanding of the writing topic and categorize key ideas.
• Using AI chatbots, students generate a list of main ideas and supporting details relevant to the assigned topic.
• Students interact with AI chatbots to explore the subject matter, refining their ideas based on the chatbot’s feedback.

2. Analysis and application
• Students analyze AI-generated responses to identify patterns, trends, and consistencies in writing styles.
• Students compare AI chatbot-generated outlines with their own ideas to enhance logical sequencing.
• Students test different sentence structures and word choices suggested by chatbots and evaluate their effectiveness.

3. Writing and enhancement
• After drafting an initial version of their essay, students review and paraphrase paragraphs using AI chatbot suggestions.
• Students integrate AI-generated improvements while maintaining originality and coherence in their writing.
• Students demonstrate creativity by incorporating additional sentences that blend their original thoughts with chatbot suggestions.

4. Peer review and revision
• Using a teacher-provided rubric, students evaluate their peers’ drafts for content, organization, vocabulary, language use, and mechanics.
• Students compare chatbot-generated feedback with their own observations, ensuring a balanced approach to revision.
• Students discuss strengths and weaknesses in chatbot-generated revisions, reinforcing critical thinking and self-editing skills.

5. Self-assessment and final revision
• Students reflect on AI chatbot feedback, peer suggestions, and teacher guidance to finalize their essays.
• Students document their writing improvements, noting how AI chatbots helped them refine sentence structure, coherence, and vocabulary.
• Students identify which AI-generated changes were helpful and which required human revision, developing better judgment in AI-assisted writing.

Instruments

This research utilized several established questionnaires from existing literature to ensure the validity and reliability of the measurements. These questionnaires were employed in their original format to preserve their accuracy. Participants rated each item using a five-point Likert scale, ranging from 1 (strongly disagree) to 5 (strongly agree). Measurement indicators for Self-Efficacy, Emotional Engagement, Cognitive Engagement, Behavioral Engagement, and Affinity for Technology were developed by adapting established indicators.

Affinity for Technology Interaction (ATI)

Affinity for Technology refers to the way individuals approach technology, particularly whether they actively seek interaction with technology (indicating high affinity) or tend to avoid it (indicating low affinity). This construct is crucial in understanding how people engage with technology in various contexts (Franke et al., 2019). In this study, affinity for technology is measured using nine items that assess individuals’ attitudes and behaviors toward technology. For example, one item states, “I enjoy using new technologies, even if I have to learn new skills to use them.”

Self-efficacy

Self-efficacy is defined in this study as individuals’ beliefs in their ability to plan and execute actions to achieve personally meaningful goals. It significantly influences decision-making, effort, and persistence. Specifically, this research focuses on teachers’ self-efficacy in integrating AI chatbots into learning and instruction. Self-efficacy is measured across six items, covering dimensions such as learning-related knowledge (e.g., regarding digital media), technical knowledge, diagnosing with digital media, and teaching with digital media (Hülshoff & Jucks, 2024).

Engagement

Behavioral Engagement in this study is captured by five items that reflect observable behaviors such as attentiveness and effort in class. For example, one item states, “I listen carefully in class.” These items assess how students engage with class activities, highlighting their persistence and focus. Emotional Engagement is evaluated using four items that focus on the affective connection students have with their learning experience, including feelings of enjoyment and curiosity. One example item is, “I enjoy learning new things in class.” Cognitive Engagement consists of eight items that assess how deeply students process the material they are learning, including their efforts to connect new information with prior knowledge and to think critically during learning. For instance, one item begins, “When doing schoolwork, ...” (Reeve & Tseng, 2011).

Writing Performance

The writing performance was assessed during the experimental activities for both groups. The essay writing was conducted four times and assessed by the teacher using the rubric presented in Table 1.

The study ensured that each construct was measured accurately and consistently, aligning with validated research practices. This approach not only enhances the reliability and validity of the measurements but also allows for a comprehensive assessment of the various dimensions involved in the study (Table 4).

Table 4.

Validity and Reliability of the Constructs Used in This Study.

Variable	Construct	Validity	Reliability Cronbach’s α
Self-efficacy		0.546**	.720
Engagement	Behavior	0.562**	.751
	Emotional	0.634**	.706
	Cognitive	0.508**	.802
ATI		0.801**	.930

**p < 0.05.
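The reliability column in Table 4 reports Cronbach’s α. As a minimal sketch of how such a coefficient is computed, assuming a small made-up response matrix (rows are respondents, columns are items on the five-point scale), the standard formula can be written as follows. The function name and the data are illustrative and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows (one score per item)."""
    k = len(responses[0])  # number of items in the scale
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from five students to a three-item scale.
hypothetical_responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

if __name__ == "__main__":
    print(f"alpha = {cronbach_alpha(hypothetical_responses):.3f}")
```

Values closer to 1 indicate that the items vary together, which is the sense in which the coefficients in Table 4 are read.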

Data Collection and Analysis

In this study, ethical considerations of privacy, confidentiality, and data anonymization were strictly observed. Personal identifiers were anonymized, and data were securely stored to protect participants’ identities and maintain ethical integrity. The quantitative data were gathered through online pretest and posttest questionnaires and were first evaluated for normality and homogeneity of variance. The Shapiro–Wilk test was used to assess normality, and Levene’s test was used to assess homogeneity (Table 5).

Table 5.

Assumption Checks: Residual Normality, Homogeneity of Variance, and Parallel Slopes.

Check	Statistic	p-value	Decision (α = .05)
Residual normality (Shapiro–Wilk)	.881	<.001	Not met (non-normal)
Levene’s test (group variances)	13.571	<.001	Not met (unequal variances)
Parallel slopes (all group × covariate = 0)	.494	.81	Met

We assessed assumptions for the adjusted post-test analysis. Residual normality was evaluated with the Shapiro–Wilk test and inspected via Q–Q plots. Homogeneity of variances across groups was tested using Levene’s test. Because residual normality and homogeneity of variances were not met, we report heteroskedasticity-consistent (HC3) standard errors and bootstrap 95% CIs (5,000 resamples) for the covariate-adjusted model. We also provide a propensity-weighted (IPTW) analysis to address baseline imbalance. The parallel-slopes assumption held, so the adjusted post-test comparison remains appropriate. Conclusions were unchanged across robust and weighted estimators.

The parallel-slopes assumption held (joint group × covariate interactions: F(6,69) = 0.494, p = .81). Unadjusted baseline covariates were imbalanced (max |SMD| = 0.878), which IPTW reduced to good balance (max |weighted SMD| = 0.081). Accordingly, we report HC3-robust and IPTW estimates as primary. As a visual complement to the Shapiro–Wilk test, we also inspected a Q–Q plot of the residuals (Figure 5).

Figure 5.

Q-Q plot normality test.

The Q–Q plot shows points lying close to the 45° reference line, suggesting that departures from normality are modest even though the Shapiro–Wilk test in Table 5 was significant (p < .001). Given this visual evidence and the robust estimators described above (HC3, bootstrap, and IPTW), Analysis of Covariance (ANCOVA) was retained for the group comparisons.

ANCOVA was applied to compare the effects of flipped learning and AI-chatbot interventions between the experimental and control groups, with pretest scores as covariates and posttest scores as dependent variables. Additionally, a paired sample t-test was conducted to assess the impact of the AI chatbot on affinity for technology, self-efficacy, emotional engagement, cognitive engagement, and behavioral engagement within the experimental group before and after the intervention. The strength of these relationships was determined using partial eta-squared (Partial η²) to estimate effect sizes.

Furthermore, the study employed Partial Least Squares Structural Equation Modeling (PLS-SEM) to investigate the relationships among variables within the experimental group. A multi-group analysis (MGA) was conducted to assess the robustness of these relationships by comparing male and female groups. The PLS-SEM analysis involved two phases: evaluation of the measurement model and analysis of the structural model.

Results

Comparison Analysis

Table 6 shows that the experimental group (AI chatbot) recorded larger pre-to-post gains in self-efficacy, behavioral engagement, emotional engagement, and cognitive engagement than the control group.

Table 6.

Descriptive Statistics.

Variable	Control				Experiment
	Pre		Post		Pre		Post
	M	SD	M	SD	M	SD	M	SD
Self-efficacy	3.593	0.393	3.850	0.556	3.729	0.373	4.187	0.576
Behavior engagement	3.639	0.276	4.263	0.428	3.695	0.287	4.462	0.429
Emotional engagement	3.628	0.291	4.287	0.409	3.690	0.311	4.435	0.383
Cognitive engagement	3.570	0.267	4.103	0.354	3.642	0.283	4.282	0.335
ATI	2.932	0.902	2.493	0.498	3.577	0.521	3.188	0.666
Writing performance			3.053	0.134			3.200	0.336

Concerning self-efficacy, the experimental group exhibited an increase of 0.458 points (pre-test mean = 3.729, post-test mean = 4.187), whereas the control group showed a smaller increase of 0.257 points (pre-test mean = 3.593, post-test mean = 3.850). Similarly, for behavioral engagement, the experimental group experienced a 0.767-point increase (pre-test mean = 3.695, post-test mean = 4.462), compared to the control group’s increase of 0.624 points (pre-test mean = 3.639, post-test mean = 4.263). The experimental group also demonstrated greater improvements in emotional engagement, with an increase of 0.745 points (pre-test mean = 3.690, post-test mean = 4.435), and in cognitive engagement, with an increase of 0.640 points (pre-test mean = 3.642, post-test mean = 4.282), compared to the control group’s respective increases of 0.659 and 0.533 points.

In contrast, affinity for technology (ATI) decreased in both groups, although the experimental group maintained a higher overall level. The control group exhibited a reduction of 0.439 points (pre-test mean = 2.932, post-test mean = 2.493), while the experimental group experienced a smaller decline of 0.389 points (pre-test mean = 3.577, post-test mean = 3.188).
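The gain scores quoted in the two paragraphs above are simple post-minus-pre differences of the means in Table 6. A short sketch that recomputes them is given below; the dictionary layout and function names are illustrative choices, while the numbers are copied from Table 6.

```python
# (pre mean, post mean) per group, copied from Table 6.
TABLE6_MEANS = {
    "Self-efficacy": {"control": (3.593, 3.850), "experiment": (3.729, 4.187)},
    "Behavior engagement": {"control": (3.639, 4.263), "experiment": (3.695, 4.462)},
    "Emotional engagement": {"control": (3.628, 4.287), "experiment": (3.690, 4.435)},
    "Cognitive engagement": {"control": (3.570, 4.103), "experiment": (3.642, 4.282)},
    "ATI": {"control": (2.932, 2.493), "experiment": (3.577, 3.188)},
}


def gain(pre, post):
    """Pre-to-post change in the mean, rounded to three decimals."""
    return round(post - pre, 3)


def gains_by_group(table):
    """Return {variable: {group: gain}} for every variable and group."""
    return {
        variable: {group: gain(*means) for group, means in groups.items()}
        for variable, groups in table.items()
    }


if __name__ == "__main__":
    for variable, groups in gains_by_group(TABLE6_MEANS).items():
        print(variable, groups)
```

Negative values for ATI correspond to the declines described in the text. These are descriptive differences only and say nothing about statistical significance.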

Overall, the descriptive results suggest that the experimental intervention had a positive effect on engagement and self-efficacy, whereas affinity for technology declined in both groups. These descriptive differences do not, however, establish whether the effects are statistically significant. To test this, Analysis of Covariance (ANCOVA) was used to assess differences in post-test scores while adjusting for pre-test scores, thereby providing a more precise evaluation of the intervention’s effectiveness.

The ANCOVA results presented in Table 7 show a significant difference in self-efficacy between the control and experimental groups after controlling for pre-test scores, with an F-value of 4.732 and p = .033. The partial η² of 0.056 is close to the conventional threshold for a medium effect (0.06), indicating that the intervention had a meaningful but moderate impact. Students in the experimental group therefore exhibited significantly higher self-efficacy than those in the control group once initial differences were taken into account, which supports the conclusion that the intervention enhanced self-efficacy.

Table 7.

Results of ANCOVA for Self-Efficacy.

Variable	SS	df	Mean square	F	Sig.	Partial η²
Self-efficacy	1.449	1	1.449	4.732	0.033	0.056

The ANCOVA results presented in Table 8 indicate a significant difference in engagement between the control and experimental groups, with an F-value of 6.098 and a p-value of less than .05. The partial η² of 0.071 represents a medium effect size, suggesting that the experimental intervention substantially impacted student engagement. This result demonstrates that students in the experimental group showed significantly higher engagement levels than those in the control group, further underscoring the intervention’s effectiveness in promoting student engagement.

Table 8.

Results of ANCOVA for Engagement.

Variable	SS	df	Mean square	F	Sig.	Partial η²
Engagement	0.651	1	0.651	6.098	0.016	0.071

The ANCOVA results in Table 9 reveal a significant difference in affinity for technology between the control and experimental groups, with an F-value of 10.680 and a p-value of less than .05. The partial η² of 0.118 indicates a medium effect size, suggesting that the experimental intervention was associated with a notable reduction in affinity for technology among the students in the experimental group compared to the control group. The finding highlights that, despite the intervention’s positive effects in other areas, it resulted in a decrease in students’ affinity for technology.

Table 9.

Results of ANCOVA for Affinity for Technology Interaction (ATI).

Variable	SS	df	Mean square	F	Sig.	Partial η²
ATI	3.307	1	3.307	10.680	0.002	0.118
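The partial η² values in Tables 7 to 9 can be related to the reported F statistics through the identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it under one stated assumption: the error degrees of freedom are not reported in the tables, so a value of 80 is assumed here (N = 83 as in Table 11, minus an intercept, one group term, and one covariate). With that assumption, the formula returns values that round to those in the tables.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Assumption: error df is not reported; 80 is inferred from N = 83 and three model terms.
ASSUMED_DF_ERROR = 80

# F statistics copied from Tables 7, 8, and 9 (df_effect = 1 in each).
REPORTED_F = {"Self-efficacy": 4.732, "Engagement": 6.098, "ATI": 10.680}

if __name__ == "__main__":
    for name, f_value in REPORTED_F.items():
        eta = partial_eta_squared(f_value, 1, ASSUMED_DF_ERROR)
        print(f"{name}: partial eta^2 = {eta:.3f}")
```

By Cohen’s commonly cited benchmarks (0.01 small, 0.06 medium, 0.14 large), all three values fall in the small-to-medium or medium range.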

Table 10 compares post-test writing performance between the control and experimental groups. The mean score for the experimental group (M = 3.200, SD = 0.336) was higher than that of the control group (M = 3.053, SD = 0.134), consistent with Table 6, suggesting that participants in the experimental condition achieved better writing performance. The F-value of 6.712, with a significance level of .011 (p < .05), confirms that this unadjusted difference is statistically significant. This implies that AI chatbot usage in the experimental group was associated with better writing performance.

Table 10.

Results of ANOVA for Writing Performance.

Variable	Control		Experiment		SS	F	Sig.
	M	SD	M	SD
Writing performance	3.053	0.134	3.200	0.336	5.806	6.712	0.011

To check this result and mitigate potential bias, we compared post-test writing across groups using a covariate-adjusted regression, because no dedicated pre-test writing measure was available. The outcome (WP) was regressed on Group and pre-treatment covariates (gender; Pre_SE, Pre_BE, Pre_EM, Pre_CE, Pre_Engage, Pre_ATI). We used HC3 robust standard errors. To further mitigate baseline imbalance, we implemented propensity score weighting by Inverse Probability of Treatment Weighting (IPTW) with stabilized weights estimated from the same covariates, checked balance via standardized mean differences (SMDs), and reported adjusted means with 95% CIs for both OLS and IPTW models (Table 11).

Table 11.

Adjusted Post-Test Writing (No Pre-Test Available).

Analysis	Adjusted mean (control)	Adjusted mean (treatment)	Difference (T–C)	95% CI low	95% CI high	p-value	partial η²	N
Covariate-adjusted OLS (HC3)	3.076	3.178	0.102	−0.033	0.238	.132	0.029	83
Propensity-weighted (IPTW, HC3)	3.074	3.174	0.099	−0.017	0.215	.088	0.037	83

Note. Outcome = WP (post-test writing). Covariates (pre-treatment): gender, Pre_SE, Pre_BE, Pre_EM, Pre_CE, Pre_Engage, Pre_ATI. IPTW uses stabilized weights. OLS uses HC3 robust SEs.
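The stabilized IPTW weights mentioned in the note can be illustrated with a minimal sketch. It assumes that a propensity score (the estimated probability of being in the treatment group given the covariates) is already available for each participant; the scores and group labels below are made up, and the function names are hypothetical. Treated participants receive P(T = 1) / e and control participants receive P(T = 0) / (1 − e), where e is the propensity score.

```python
def stabilized_weights(treated, propensity):
    """Stabilized inverse-probability-of-treatment weights.

    treated: list of 0/1 indicators (1 = treatment group).
    propensity: list of estimated P(treated = 1 | covariates), each strictly in (0, 1).
    """
    p_treated = sum(treated) / len(treated)  # marginal probability of treatment
    weights = []
    for t, e in zip(treated, propensity):
        if t == 1:
            weights.append(p_treated / e)
        else:
            weights.append((1 - p_treated) / (1 - e))
    return weights


def weighted_mean(values, weights):
    """Weighted mean of an outcome, for example post-test writing scores."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical example: four participants, two per group.
example_treated = [1, 1, 0, 0]
example_propensity = [0.8, 0.5, 0.5, 0.2]

if __name__ == "__main__":
    print(stabilized_weights(example_treated, example_propensity))
```

Participants whose group membership was very predictable from their covariates are down-weighted, which is how weighting moves the two groups toward comparable baseline profiles.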

Adjusted post-test analyses, using covariate-adjusted OLS (HC3) and propensity-score weighting (IPTW), indicated a small, directionally positive treatment effect (OLS: Δ = 0.102, 95% CI [−0.033, 0.238], p = .132, partial η² = 0.029; IPTW: Δ = 0.099, 95% CI [−0.017, 0.215], p = .088, partial η² = 0.037). Thus, while the differences favor treatment, both confidence intervals include zero, so the estimates are imprecise and should be interpreted cautiously.

Table 12 shows that baseline covariates were not fully balanced across groups (max |SMD| = 0.878; several SMDs > 0.20), underscoring the risk of bias in unadjusted post-test comparisons and motivating the use of covariate adjustment and IPTW. One-way ANOVA on post-test writing indicated a significant difference favoring the treatment (Δ = 0.146, F = 6.712, p = .011; Hedges’ g = 0.56). After covariate adjustment and IPTW, however, the estimated difference attenuated to about 0.10 and was imprecise (OLS: Δ = 0.102, 95% CI [−0.033, 0.238], p = .132; IPTW: Δ = 0.099, 95% CI [−0.017, 0.215], p = .088), indicating a small, directionally positive but inconclusive effect once baseline differences are accounted for.

Table 12.

Baseline Equivalence (Pre-Treatment Covariates).

Covariate	Control mean	Control SD	Treatment mean	Treatment SD	Std diff (SMD)
Pre_SE	3.593	0.393	3.730	0.373	0.357
Pre_BE	3.639	0.276	3.695	0.287	0.199
Pre_EM	3.628	0.291	3.690	0.311	0.207
Pre_CE	3.567	0.267	3.640	0.283	0.265
Pre_Engage	3.611	0.207	3.675	0.216	0.302
Pre_ATI	2.932	0.901	3.577	0.521	0.878
Gender	1.317	0.471	1.238	0.431	−0.175

Note. |SMD| ≤ 0.10 ≈ well-balanced; table shows initial imbalance (notably Pre_ATI). This motivated IPTW and covariate adjustment.
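The SMDs in Table 12 appear consistent with a common definition: the difference in group means divided by the root mean square of the two group standard deviations. The formula is not stated in the text, so this is an assumption; the sketch below applies it to two rows of Table 12 and returns values that agree with the table to three decimals.

```python
from math import sqrt


def standardized_mean_difference(mean_t, sd_t, mean_c, sd_c):
    """SMD = (treatment mean - control mean) / sqrt((sd_t^2 + sd_c^2) / 2)."""
    pooled_sd = sqrt((sd_t ** 2 + sd_c ** 2) / 2)
    return (mean_t - mean_c) / pooled_sd


if __name__ == "__main__":
    # Values copied from Table 12 (treatment mean, treatment SD, control mean, control SD).
    print(round(standardized_mean_difference(3.695, 0.287, 3.639, 0.276), 3))  # Pre_BE
    print(round(standardized_mean_difference(1.238, 0.431, 1.317, 0.471), 3))  # Gender
```

Because the inputs are themselves rounded to three decimals, recomputed values for other rows may differ from the table in the third decimal place.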

Table 13 reports standardized mean differences (SMDs) after weighting. IPTW achieved excellent covariate balance (max |weighted SMD| = 0.081), satisfying the ≤0.10 rule of thumb and thereby improving the credibility of the adjusted post-test comparison. This balance supports the weighted estimate; it does not by itself validate the unadjusted one-way ANOVA, whose significant difference attenuated after adjustment.

Table 13.

Weighted Covariate Balance After IPTW.

Covariate	Weighted SMD (IPTW)
Pre_SE	0.067
Pre_BE	0.054
Pre_EM	0.040
Pre_CE	0.037
Pre_Engage	0.058
Pre_ATI	−0.051
Gender	0.081

Note. Max |weighted SMD| = 0.081; median = 0.054 (≤0.10 indicates good balance).

Analysis of the Affinity for Technology Interaction (ATI)

The paired sample t-test results presented in Table 14 demonstrate significant improvements across several measures from pre-test to post-test. For self-efficacy, the mean score increased from 3.662 to 4.020, with a significant mean difference of −0.358 (t = −4.311, p < .001), indicating a substantial improvement. Behavioral engagement also showed a notable increase, with the mean score rising from 3.667 to 4.363 (mean difference of −0.696, t = −14.208, p < .001), suggesting a significant enhancement in student behavior. Similarly, emotional engagement improved from 3.659 to 4.361 (mean difference of −0.701, t = −14.620, p < .001), reflecting a meaningful increase in emotional involvement. Cognitive engagement also saw a significant rise, with the mean increasing from 3.606 to 4.193 (mean difference of −0.587, t = −14.274, p < .001), indicating notable improvement in students’ cognitive engagement.

Table 14.

Results of a Paired Sample t-Test.

Variable	Pre		Post		Mean diff (pre − post)	t	p-value
	M	SD	M	SD
Self-efficacy	3.662	0.386	4.020	0.587	−0.358	−4.311	<.001*
Behavior engagement	3.667	0.281	4.363	0.437	−0.696	−14.208	<.001*
Emotional engagement	3.659	0.301	4.361	0.400	−0.701	−14.620	<.001*
Cognitive engagement	3.606	0.276	4.193	0.354	−0.587	−14.274	<.001*
ATI	3.258	0.798	2.844	0.681	0.413	4.933	<.001*

*p < .05.

In contrast, affinity for technology interaction (ATI) decreased from 3.258 to 2.844, with a significant mean difference of 0.413 (t = 4.933, p < .001), highlighting a considerable reduction in students’ ATI. Overall, these results demonstrate statistically significant changes across all areas, including improvements in self-efficacy and in behavioral, emotional, and cognitive engagement, as well as a significant decline in ATI.
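The sign convention in Table 14 follows from computing each difference as pre minus post, so an improvement yields a negative mean difference and a negative t. A minimal sketch of the paired-sample t statistic under that convention is shown below, using made-up scores for four students; the function name and data are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Paired-sample t statistic with differences taken as pre - post.

    Returns (mean difference, t statistic, degrees of freedom).
    """
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    mean_diff = mean(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # sample SD of the differences
    return mean_diff, mean_diff / standard_error, n - 1


# Hypothetical pre/post scores for four students (not study data).
example_pre = [3, 3, 4, 4]
example_post = [4, 5, 5, 6]

if __name__ == "__main__":
    mean_diff, t_value, df = paired_t(example_pre, example_post)
    print(f"mean diff = {mean_diff:.3f}, t({df}) = {t_value:.3f}")
```

With this convention, a positive mean difference (as for ATI in Table 14) indicates that scores fell from pre-test to post-test.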

Structural Equation Modeling (SEM) Analysis

We estimated the model using variance-based SEM (PLS-SEM), which emphasizes prediction and handles complex models and non-normal data. Consistent with best practice, we evaluated (1) the measurement model via internal consistency reliability (CR between 0.70 and 0.95), convergent validity (AVE ≥ 0.50), discriminant validity (HTMT < 0.90), and multicollinearity (VIF < 5); and (2) the structural model via the significance of coefficients using bias-corrected bootstrapping (5,000 resamples), collinearity (inner VIF), explanatory power (R² and adjusted R²), effect sizes (f²), and predictive relevance (Q² via blindfolding), including out-of-sample predictive performance using PLSpredict. For overall model fit in PLS-SEM we report SRMR (target <0.08), the discrepancy measures d_ULS and d_G with bootstrap percentile confidence intervals, and RMS_theta.

First, the measurement model analysis evaluates the validity and reliability of the measurement instruments, ensuring that the observed variables accurately reflect the latent constructs they are intended to measure. This stage involves assessing factor structure, convergent validity, and discriminant validity to confirm that the constructs are both reliable and distinct (Table 15).

Table 15.

Measurement Model Assessment.

Construct	Indicator	FL	CR	AVE	HTMT
Self-efficacy	se1	0.835	0.862	0.521	0.722
	se2	0.634
	se3	0.430
	se4	0.683
	se5	0.790
	se6	0.866
Behavior engagement	be1	0.613	0.857	0.548	0.740
	be2	0.848
	be3	0.789
	be4	0.771
	be5	0.654
Emotional engagement	em1	0.820	0.820	0.535	0.731
	em2	0.728
	em3	0.709
	em4	0.658
Cognitive engagement	ce1	0.619	0.853	0.521	0.649
	ce2	0.687
	ce3	0.640
	ce4	0.713
	ce5	0.618
	ce6	0.678
	ce7	0.628
	ce8	0.600
Affinity to technology	ati1	0.542	0.898	0.507	0.712
	ati2	0.893
	ati3	0.430
	ati4	0.735
	ati5	0.820
	ati6	0.700
	ati7	0.753
	ati8	0.850
	ati9	0.539

Construct validity in SEM is evaluated via convergent and discriminant validity. Convergent validity is assessed through factor loadings and the Average Variance Extracted (AVE). To achieve adequate convergent validity, AVE values should exceed 0.50 (Fornell & Larcker, 1981). In this study, all constructs exhibited satisfactory convergent validity, with AVE values above 0.50, as presented in Table 15. Discriminant validity is evaluated by comparing the correlations among latent variables using the Heterotrait-Monotrait ratio (HTMT), and reliability is evaluated using composite reliability (CR). A model demonstrates good discriminant validity when HTMT < 0.90 and acceptable reliability when CR exceeds 0.70 (Hair et al., 2014). In Table 15, all HTMT values were below 0.90 (0.649 to 0.740) and CR values ranged from 0.820 to 0.898, indicating that all constructs exhibited acceptable discriminant validity and reliability.
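The AVE and CR columns in Table 15 can be recomputed from the factor loadings. The sketch below uses the usual formulas for standardized loadings, AVE = Σλ² / k and CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)); treating the reported loadings as standardized is an assumption. Applied to the self-efficacy and emotional engagement loadings, it returns values that round to those in Table 15.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: squared sum of loadings over itself plus the summed error variances."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


# Loadings copied from Table 15.
SELF_EFFICACY_LOADINGS = [0.835, 0.634, 0.430, 0.683, 0.790, 0.866]
EMOTIONAL_ENGAGEMENT_LOADINGS = [0.820, 0.728, 0.709, 0.658]

if __name__ == "__main__":
    for name, loadings in [
        ("Self-efficacy", SELF_EFFICACY_LOADINGS),
        ("Emotional engagement", EMOTIONAL_ENGAGEMENT_LOADINGS),
    ]:
        ave = average_variance_extracted(loadings)
        cr = composite_reliability(loadings)
        print(f"{name}: AVE = {ave:.3f}, CR = {cr:.3f}")
```

The calculation also shows why a single weak indicator (for example, se3 at 0.430) lowers AVE more than CR: AVE averages the squared loadings, whereas CR depends mainly on their sum.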

The second stage, structural model analysis, examines the hypothesized relationships between the latent constructs. This step tests the proposed pathways and assesses the model’s overall fit. PLS model assessment (VB-SEM) showed R² = 0.074, f² = 0.077, SRMR = 0.163, and RMS_theta = 0.148. The SRMR exceeds the 0.08 target, which indicates scope for refinement; however, the key PLS discrepancy measures were acceptable (d_ULS = 14.842 < HI95 = 18.950; d_G = 2.709 < HI95 = 3.145; HI95 values based on a 5,000-resample bootstrap estimate). Together with strong measurement reliability/validity (all CR ≥ 0.70, AVE ≥ 0.50, HTMT < 0.90) and predictive relevance (Q² > 0; PLSpredict outperformed the LM benchmark on RMSE/MAE), the model is adequate from a VB-SEM perspective. As this work is exploratory and theory-building in a new, rapidly evolving setting, low R² and conservative global fit are expected; accordingly, we prioritize valid measurement, reasonable discrepancies (d_ULS/d_G below HI95), and predictive performance (Hair et al., 2014). The present model meets these criteria, while leaving room for future refinement.

In the final stage, the structural model analysis tests whether the proposed relationships are statistically significant and align with theoretical expectations. An effect is treated as significant if the p-value is below .05 and the absolute t-value exceeds 1.96, corresponding to a 95% confidence level. The results are illustrated in Figure 6.

Figure 6.

The results of SEM analysis.

This comprehensive analysis of the measurement model, structural model, and hypothesis testing allows for a thorough evaluation of the SEM and the relationships among the latent constructs.

Table 16 demonstrates that ATI has a significant positive impact on several key outcomes. The path coefficients show that higher affinity for technology is associated with increased behavioral engagement (β = .266, t > 1.96, p < .05), cognitive engagement (β = .282, t > 1.96, p < .05), emotional engagement (β = .271, t > 1.96, p < .05), self-efficacy (β = .344, t > 1.96, p < .05), and writing performance (β = .297, t > 1.96, p < .05). Each of these relationships is statistically significant, as indicated by t-values exceeding 1.96 and p-values below 0.05. These findings highlight that a greater affinity for technology positively influences students’ engagement (behavioral, cognitive, and emotional), self-efficacy, and writing performance.

Table 16.

Results Obtained for the Influence of Affinity for Technology Interaction (ATI).

Path relationship	β	t	p
Affinity for technology interaction -> Behavior engagement	.266	2.381	.014
Affinity for technology interaction -> Cognitive engagement	.282	3.045	.001
Affinity for technology interaction -> Emotional engagement	.271	2.631	.005
Affinity for technology interaction -> Self-efficacy	.344	3.470	.000
Affinity for technology interaction -> Writing performance	.297	3.648	.003

The study also conducted a robustness test based on gender to explore whether gender differences significantly affect how ATI influences self-efficacy, engagement, and writing performance. We first established configural invariance by using the same indicators, model specification, and algorithm across groups. Next, compositional invariance was supported for all constructs: the correlations between group-specific composites exceeded the 5% permutation quantile (e.g., Engagement: c = 0.995, c₀.05 = 0.982, p = .21; Self-Efficacy: c = 0.991, c₀.05 = 0.978, p = .18; ATI: c = 0.997, c₀.05 = 0.983, p = .34), indicating that composites are formed equivalently across groups. Finally, we tested the equality of composite means and variances across groups. Consistent with MICOM guidance, we proceeded with multi-group analysis (MGA) of structural paths.

The premise is that gender diversity may play a critical role in shaping the relationship between affinity for technology and these outcomes. By examining the differences across genders, the study aims to determine whether the impact of technology affinity varies between male and female participants. This analysis potentially uncovers insights into how gender-specific factors might influence self-efficacy, engagement, and writing performance in response to ATI (Table 17).

Table 17.

Robustness Test by Gender.

Path relationship	Model I (all sample)	Model II (male)	Model III (female)
ATI -> Behavior engagement	0.266*	0.283	0.303
ATI -> Cognitive engagement	0.282***	0.307⁺	0.337
ATI -> Emotional engagement	0.271**	0.216	0.365*
ATI -> Self-efficacy	0.344***	0.368*	0.397**
ATI -> Writing performance	0.297**	−0.053	0.343⁺

*p < .05. **p < .01. ***p < .001. ⁺p < .10.

The gender-based robustness test is important for understanding whether gender-specific influences contribute to varying levels of self-efficacy, engagement, and writing performance, thereby offering an improved understanding of the relationship between affinity for technology interaction (ATI) and academic outcomes.

The robustness test by gender examines the predictive role of affinity for technology interaction (ATI) across the full sample and the male and female subgroups. After splitting the data by gender, the pattern of significant paths differed between the subgroups:

Model I (All Sample)

In the overall sample, ATI demonstrates significant positive effects on behavioral engagement (β = .266, p < .05), cognitive engagement (β = .282, p < .001), emotional engagement (β = .271, p < .01), self-efficacy (β = .344, p < .001), and writing performance (β = .297, p < .01). These results suggest that ATI is a robust predictor of engagement and self-efficacy across all participants.

Model II (Male)

For the male subgroup, ATI has a significant positive effect on self-efficacy (β = .368, p < .05) and a marginally significant effect on cognitive engagement (β = .307, p < .10). However, its influence on behavioral engagement (β = .283), emotional engagement (β = .216), and writing performance (β = −.053) is non-significant. This indicates that, among males, ATI primarily impacts cognitive engagement and self-efficacy, with limited influence on other variables.

Model III (Female)

In the female subgroup, ATI significantly predicts emotional engagement (β = .365, p < .05) and self-efficacy (β = .397, p < .01), with a marginally significant effect on writing performance (β = .343, p < .10). The effects on behavioral engagement (β = .303) and cognitive engagement (β = .337) are positive but not statistically significant. These findings suggest that ATI has a stronger and broader impact on females, particularly in terms of emotional engagement, self-efficacy, and writing performance.

Discussion

Impact of AI Chatbot-Enhanced Flipped Learning on Self-Efficacy, Engagement, and ATI

Regarding the first research question, the results indicate that integrating AI chatbots into a flipped classroom enhances students’ self-efficacy and engagement (behavioral, emotional, and cognitive) significantly more than a conventional flipped classroom does. These findings, together with prior work by Belda-Medina and Calvo-Ferrer (2022), suggest that real-time, adaptive feedback from AI tools supports writing development and learner confidence. In particular, the experimental group’s higher gains in self-efficacy can be attributed to the near-instant corrective and suggestive feedback offered by the chatbot, which helped lower anxiety and bolster students’ perception of their writing capabilities (Fryer & Bovee, 2016). This is consistent with Bandura’s (1977) assertion that timely and constructive feedback can raise people’s belief in their ability to perform specific tasks.

However, the findings reveal a significant decline in ATI, even as self-efficacy and engagement (behavioral, emotional, and cognitive) improved significantly. Woodrow (2011) found that higher self-efficacy correlates with better writing performance, reinforcing this study’s finding that students developed greater confidence in writing despite a decline in technology affinity. The decline in ATI despite improved self-efficacy and engagement aligns with Franke et al. (2019) and Scherer and Siddiq (2019), who suggest that students with lower technology affinity may initially struggle with AI-based interventions and that excessive reliance on technology can sometimes lead to technology fatigue or frustration. Moreover, the findings align with Lo and Hew (2017), who found that students in flipped learning environments may prefer human feedback over automated responses, leading to a reduced affinity for technology over time.

The integration of AI chatbots into flipped learning aligns with Akçayır and Akçayır (2018), who proposed that flipped classrooms allow for deeper, interactive learning experiences. However, the decline in ATI suggests that, while chatbots were useful, students may have preferred instructor and peer interactions, which aligns with studies by Hülshoff and Jucks (2024) on the importance of balancing AI and human-driven feedback.

Influence of Affinity for Technology Interaction (ATI) on Self-Efficacy, Engagement, and Writing Performance

Regarding the second research question, the results of the Structural Equation Modeling (SEM) analysis provide strong evidence that Affinity for Technology Interaction (ATI) significantly influences self-efficacy, engagement (behavioral, cognitive, and emotional), and writing performance in foreign language education (Figure 6 and Table 17).

The increase in behavioral, emotional, and cognitive engagement supports research by Reeve and Tseng (2011), who highlight that technology-supported active learning enhances student engagement. Abeysekera and Dawson (2015) argued that flipped learning increases engagement through active, student-centered instruction, which aligns with the improvement in engagement levels observed in this study.

Additionally, the findings align with studies demonstrating that AI chatbots such as ChatGPT can serve various functions, including answering questions, generating content, solving problems, providing tutoring, and supporting language learning and research (Boudouaia et al., 2024; Rahman & Watanobe, 2023). They also reinforce evidence that ChatGPT enhances students’ writing by improving idea generation, coherence, vocabulary, grammar, and organization, consistent with Song and Song (2023). Moreover, the flipped model itself seems to have fostered more active participation and deeper engagement with in-class tasks, corroborating previous research that highlights the benefits of shifting content delivery outside the classroom (Hung, 2015). Integrating the AI chatbot within the flipped framework appears to enhance this engagement further by providing continuous guidance and additional practice opportunities beyond scheduled class times, thus reinforcing Abeysekera and Dawson’s (2015) observation that digital scaffolding can amplify students’ motivation and persistence.

Predictive Role of Affinity for Technology Interaction (ATI) on Cognitive Engagement, Emotional Engagement, and Self-Efficacy: Robustness Test Controlling for Gender Groups

Regarding the third research question, the study conducted a robustness test based on gender to assess whether the predictive role of Affinity for Technology Interaction (ATI) on cognitive engagement, emotional engagement, and self-efficacy differs between male and female participants. The results reveal gender-specific variations in the impact of ATI, providing deeper insight into how technology affinity influences learning outcomes across gender groups. In the full sample, ATI remains a strong predictor of engagement and self-efficacy: regardless of gender, a higher affinity for technology is associated with greater cognitive and emotional engagement and stronger self-efficacy.
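The logic of a subgroup robustness test of this kind can be illustrated with a minimal sketch. It is not the SEM procedure used in this study: it swaps in a single-predictor regression on z-scored variables (whose slope equals the Pearson correlation) and estimates it separately per group. The data are synthetic, and the function names, sample size, and slopes are hypothetical choices made only for illustration.

```python
import random
from statistics import mean, stdev


def standardized_beta(x, y):
    """Standardized slope of y regressed on a single predictor x.

    With one predictor this equals the Pearson correlation.
    """
    mx, my = mean(x), mean(y)
    sx, sy = stdev(x), stdev(y)
    zx = [(v - mx) / sx for v in x]
    zy = [(v - my) / sy for v in y]
    # OLS slope on z-scores; no intercept is needed because both means are 0.
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)


def subgroup_betas(x, y, groups):
    """Estimate the same x -> y path separately within each group label."""
    result = {}
    for label in sorted(set(groups)):
        xs = [a for a, g in zip(x, groups) if g == label]
        ys = [b for b, g in zip(y, groups) if g == label]
        result[label] = standardized_beta(xs, ys)
    return result


# Synthetic, illustrative data only; these are NOT the study's data.
rng = random.Random(0)
n = 200
gender = ["female" if i % 2 == 0 else "male" for i in range(n)]
ati = [rng.gauss(0, 1) for _ in range(n)]
# Hypothetical slopes, chosen only so the two groups differ.
true_slope = {"female": 0.6, "male": 0.2}
emotional_engagement = [
    true_slope[g] * a + rng.gauss(0, 1) for a, g in zip(ati, gender)
]

betas = subgroup_betas(ati, emotional_engagement, gender)
for label, beta in betas.items():
    print(f"{label}: standardized beta = {beta:.3f}")
```

Comparing the per-group estimates with the full-sample estimate shows whether a path holds in each subgroup. A formal comparison would also need standard errors for each coefficient, which this sketch does not compute.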

Among male students, ATI primarily influences cognitive engagement and self-efficacy, suggesting that men benefit more from technology when it enhances their ability to process and retain information (Venkatesh & Morris, 2000). However, ATI has a weaker impact on emotional engagement, possibly because male students may engage with technology in a more task-oriented manner rather than emotionally. Conversely, female students exhibit a stronger link between ATI and emotional engagement, suggesting a more holistic integration of technology into their learning and emotional experiences. This finding aligns with existing research indicating that female users tend to be more cautious about the consequences of technology use (Bouzar et al., 2024; Cai et al., 2017) and express greater concern about excessive dependence on technology.

Additionally, the positive relationship between ATI and writing performance in female students—absent in their male counterparts—underscores gender-based differences in how technology supports task-specific outcomes. In contrast, males exhibit a weaker association between ATI and behavioral engagement, possibly due to differences in how they interact with technology during practice-oriented tasks compared to its effects on cognitive processes and self-confidence. These insights highlight the importance of considering gender-specific responses when designing technology-based educational interventions.

These findings align with studies emphasizing the significant influence of gender on technology adoption (van Elburg et al., 2022; Yeboah et al., 2025). Furthermore, there has been growing interest in exploring how gender classifications shape perspectives on technology (Bouzar et al., 2024; Cai et al., 2017). The study suggests that gender moderates the impact of ATI: males benefit more in cognitive engagement and self-efficacy, whereas females experience broader benefits, including emotional engagement and writing performance.

Conclusion, Implication, and Future Research

This study examined the impact of AI chatbots in flipped classrooms on students’ writing performance, self-efficacy, and engagement while considering the roles of Affinity for Technology Interaction (ATI) and gender. The findings align with existing research emphasizing that flipped learning and AI tools can enhance engagement and self-efficacy. However, the study also reveals a decline in ATI, suggesting that students may require additional support or training to fully adapt to AI-driven learning environments. AI chatbots are therefore valuable in supporting writing and engagement, but their role should be carefully balanced with human interaction to maintain students’ self-efficacy and ATI. Additionally, gender differences emerged, with males showing stronger cognitive engagement and self-efficacy, while females experienced broader gains, particularly in emotional engagement and writing performance.

The study’s limitations include constraints in scope and duration, which may affect the generalizability of the findings. Future research should investigate diverse learner populations, compare different AI tools, and assess long-term effects. Additionally, addressing gender-specific responses and instructor proficiency with AI is essential to optimize AI-integrated flipped learning for inclusive and effective pedagogy.

Footnotes

Acknowledgements

The authors would like to thank all participants for taking part in this study.

ORCID iDs

Sri Suciati

Lusia Maryani Silitonga

Ali Akbar Anggara

Ethical Considerations

This study involving human participants was reviewed and approved by the Chairperson of the Institute of Research and Community Service, University of Persatuan Guru Republik Indonesia Semarang, in accordance with the 7 (seven) WHO 2011 Standards. The ethics approval reference number for this study is No. 025/LPPM-UPGRIS/VIII/2024. No animal subjects were involved in this research.

Consent to Participate

Informed consent was obtained from all participants prior to their participation in the study. The consent procedures adhered to the ethical principles outlined in the Council for International Organizations of Medical Sciences (CIOMS) 2016 Guidelines.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

References

Abeysekera & Dawson (2015). Motivation and cognitive load in the flipped classroom: Definition, rationale and a call for research. Higher Education Research & Development, 34(1), 1–14. https://doi.org/10.1080/07294360.2014.934336

Akçayır & Akçayır (2018). The flipped classroom: A review of its advantages and challenges. Computers & Education, 126, 334–345. https://doi.org/10.1016/j.compedu.2018.07.021

Babanoğlu, M. P., Karataş, T. Ö., & Dündar (2025). Envisioning the future of AI-assisted EFL teaching and learning: Conceptual representations of prospective teachers. Sage Open, 15(2), 21582440251341590. https://doi.org/10.1177/21582440251341590

Bai & Wang (2023). The role of growth mindset, self-efficacy and intrinsic value in self-regulated learning and English language learning achievements. Language Teaching Research, 27(1), 207–228. https://doi.org/10.1177/1362168820933190

Bandura (1977). Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review, 84(2), 191–215.

Bastiansen, M. H. A., Kroon, A. C., & Araujo (2022). Female chatbots are helpful, male chatbots are competent? Publizistik, 67(4), 601–623. https://doi.org/10.1007/s11616-022-00762-8

Belda-Medina & Calvo-Ferrer, J. R. (2022). Using chatbots as AI conversational partners in language learning. Applied Sciences, 12(17), 8427. https://doi.org/10.3390/app12178427

Bergmann, J., & Sams, A. (2012). Flip your classroom: Reach every student in every class every day. International Society for Technology in Education.

Boudouaia, Mouas, & Kouider (2024). A study on ChatGPT-4 as an innovative approach to enhancing English as a foreign language writing learning. Journal of Educational Computing Research, 62, 1289–1317. https://doi.org/10.1177/07356331241247465

Bouzar, El Idrissi, & Ghourdou (2024). Gender differences in perceptions and usage of ChatGPT. International Journal of Humanities and Educational Research, 06(02), 571–582. https://doi.org/10.47832/2757-5403.25.32

Cai & Fan (2017). Gender and attitudes toward technology use: A meta-analysis. Computers & Education, 105, 1–13. https://doi.org/10.1016/j.compedu.2016.11.003

Deng & Lin (2022). The benefits and challenges of ChatGPT: An overview. Frontiers in Computing and Intelligent Systems, 2(2), 81–83. https://doi.org/10.54097/feis.v212.4465

Diwanji, Hinkelmann, & Witschel, H. F. (2018). Enhance classroom preparation for flipped classroom using AI and analytics. In Hammoudi, Smialek, Camp, & Filipe (Eds.), ICEIS 2018. 20th International Conference on Enterprise Information Systems (Vol. 1, pp. 477–483). SciTePress. https://doi.org/10.5220/0006807604770483

Duong, T.-N.-A., & Chen, H.-L. (2025). An AI chatbot for EFL writing: Students’ usage tendencies, writing performance, and perceptions. Journal of Educational Computing Research, 63, 406–430. https://doi.org/10.1177/07356331241312363

Fornell & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39–50. https://doi.org/10.2307/3151312

Franke, Attig, & Wessel (2019). A personal resource for technology interaction: Development and validation of the affinity for technology interaction (ATI) scale. International Journal of Human–Computer Interaction, 35(6), 456–467. https://doi.org/10.1080/10447318.2018.1456150

Fryer, L. K., & Bovee, H. N. (2016). Supporting students’ motivation for e-learning: Teachers matter on and off line. Internet and Higher Education, 30, 21–29. https://doi.org/10.1016/j.iheduc.2016.03.003

Graham (2022). Self-efficacy and language learning – What it is and what it isn’t. Language Learning Journal, 50(2), 186–207. https://doi.org/10.1080/09571736.2022.2045679

Hair, J. F., Sarstedt, Hopkins, & Kuppelwieser, V. G. (2014). Partial least squares structural equation modeling (PLS-SEM): An emerging tool in business research. European Business Review, 26(2), 106–121. https://doi.org/10.1108/EBR-10-2013-0128

Hülshoff & Jucks (2024). Pre-service English teachers’ approaches to technology-assisted teaching and learning: Associations with study level, self-efficacy and value beliefs. Computers and Education Open, 7, 100199. https://doi.org/10.1016/j.caeo.2024.100199

Hung, H. T. (2015). Flipping the classroom for English language learners to foster active learning. Computer Assisted Language Learning, 28(1), 81–96. https://doi.org/10.1080/09588221.2014.967701

Jeon & Lee (2024). The impact of a chatbot-assisted flipped approach on EFL learner interaction. Educational Technology & Society, 27, 218–234. https://doi.org/10.30191/ETS.202410_27(4).RP12

Jin & Divitini (2020). Affinity for technology and teenagers' learning intentions. In Proceedings of the 2020 ACM Conference on International Computing Education Research (pp. 48–55). ACM.

Jin, Jiang, Xiong, Feng, & Zhao (2024). Effects of student engagement in peer feedback on writing performance in higher education. Interactive Learning Environments, 32, 128–143. https://doi.org/10.1080/10494820.2022.2081209

Lee & Wallace (2018). Flipped learning in the English as a foreign language classroom: Outcomes and perceptions. TESOL Quarterly, 52(1), 62–84. https://doi.org/10.1002/tesq.372

Lo, C. K., & Hew, K. F. (2017). A critical review of flipped classroom challenges in K-12 education: Possible solutions and recommendations for future research. Research and Practice in Technology Enhanced Learning, 12(1), 4. https://doi.org/10.1186/s41039-016-0044-2

Lo, C. K., & Hew, K. F. (2022). Design principles for fully online flipped learning in health professions education: A systematic review of research during the COVID-19 pandemic. BMC Medical Education, 22(1), 720. https://doi.org/10.1186/s12909-022-03782-0

Lo, C. K., & Hew, K. F. (2023). A review of integrating AI-based chatbots into flipped learning: New possibilities and challenges. Frontiers in Education, 8, 1–7. https://doi.org/10.3389/feduc.2023.1175715

Ortikov & Ugli (2024). The effectiveness of technology-enhanced language learning methods. Oriental Renaissance: Innovative, Educational, Natural and Social Sciences, 4(3). https://www.oriens.uz

Rahman, M. M., & Watanobe (2023). ChatGPT for education and research: Opportunities, threats, and strategies. Applied Sciences, 13(9), 5783. https://doi.org/10.3390/app13095783

Reeve & Tseng, C.-M. (2011). Agency as a fourth aspect of students’ engagement during learning activities. Contemporary Educational Psychology, 36(4), 257–267. https://doi.org/10.1016/j.cedpsych.2011.05.002

Scherer & Siddiq (2019). The relation between students’ socioeconomic status and ICT literacy: Findings from a meta-analysis. Computers & Education, 138, 13–32. https://doi.org/10.1016/j.compedu.2019.04.011

Scherer & Teo (2019). Unpacking teachers’ intentions to integrate technology: A meta-analysis. Educational Research Review, 27, 90–109. https://doi.org/10.1016/j.edurev.2019.03.001

Shi, MacLeod, & Yang, H. H. (2020). College students’ cognitive learning outcomes in flipped classroom instruction: A meta-analysis of the empirical literature. Journal of Computers in Education, 7(1), 79–103. https://doi.org/10.1007/s40692-019-00142-8

Silitonga, L. M., Wiyaka, Suciati, & Prastikawati, E. F. (2024). The impact of integrating AI chatbots and microlearning into flipped classrooms: Enhancing students’ motivation and higher-order thinking skills. Lecture Notes in Computer Science, 14786, 184–193. https://doi.org/10.1007/978-3-031-65884-6_19

Soegoto, E. S., Albar, C. N., Luckyardi, Abduh, Asnur, M. N. A., & Haristiani (2025). IT and management strategies for language education: Lessons from the digitalization of education activities. International Journal of Languages Education, 8(4), 807–835. https://doi.org/10.26858/ijole.v8i4.70019

Song & Song (2023). Enhancing academic writing skills and motivation: Assessing the efficacy of ChatGPT in AI-assisted language learning for EFL students. Frontiers in Psychology, 14, 1–14. https://doi.org/10.3389/fpsyg.2023.1260843

Lin & Lai (2023). Collaborating with ChatGPT in argumentative writing classrooms. Assessing Writing, 57, 100752. https://doi.org/10.1016/j.asw.2023.100752

Tseng, Y. C., & Lin, Y. H. (2024). Enhancing English as a foreign language (EFL) learners’ writing with ChatGPT: A university-level course design. Electronic Journal of e-Learning, 22(2), 78–97. https://doi.org/10.34190/ejel.21.5.3329

van Elburg, F. R. T., Klaver, N. S., Nieboer, A. P., & Askari (2022). Gender differences regarding intention to use mHealth applications in the Dutch elderly population: A cross-sectional study. BMC Geriatrics, 22(1), 449. https://doi.org/10.1186/s12877-022-03130-3

Venkatesh & Morris, M. G. (2000). Why don’t men ever stop to ask for directions? Gender, social influence, and their role in technology acceptance and usage behavior. MIS Quarterly, 24(1), 115–139. https://ssrn.com/abstract=3681106

Wang, Gao, & Huang (2023). Scientific discovery in the age of artificial intelligence. Nature, 620, 47–60. https://doi.org/10.1038/s41586-023-06221-2

Wiyaka, Silitonga, L. M., Sunardi, & Pramudi, Y. T. C. (2024). From nervous to fluent: The impact of AI chatbot-assisted assessment on English reading anxiety and performance in Indonesia. Theory and Practice in Language Studies, 14(12), 3851–3860. https://doi.org/10.17507/tpls.1412.20

Wollny, Schneider, Di Mitri, Weidlich, Rittberger, & Drachsler (2021). Are we there yet? - A systematic literature review on chatbots in education. Frontiers in Artificial Intelligence, 4, 654924. https://doi.org/10.3389/frai.2021.654924

Woodrow (2011). College English writing affect: Self-efficacy and anxiety. System, 39(4), 510–522. https://doi.org/10.1016/j.system.2011.10.017

Yeboah, Nyagorme, Bervell, Koi-Akrofi, G. Y., & Bempong, A. E. (2025). Gender moderation on students’ adoption of WhatsApp for learning-support communication in Sub-Saharan Africa. Journal of Educational Technology Development and Exchange, 18(1), 154–174. https://doi.org/10.18785/jetde.1801.09

Zhai (2023). ChatGPT for next generation science learning. XRDS: Crossroads, The ACM Magazine for Students, 29(3), 42–46. https://doi.org/10.1145/3589649

Zhang (2025). Integrating chatbot technology into English language learning to enhance student engagement and interactive communication skills. Journal of Computational Methods in Sciences and Engineering, 0(0), 1–12. https://doi.org/10.1177/14727978241312992

Gender and AI in Flipped Classrooms: Unpacking the Impact of Chatbots on Engagement, Self-Efficacy, Writing Performance and Technology Affinity
