Abstract
In recent years, supported by digital technology, virtual reality (VR) has brought about a paradigm shift in education. Students can experience the interactivity and immersive learning offered by VR technology in virtual worlds. This study applies virtual reality as a new instructional mode in field trip courses and explores the factors influencing students’ behavior intention in virtual art field trips (VAFT). A research hypothesis model was constructed by integrating flow theory and the Unified Theory of Acceptance and Use of Technology 2 (UTAUT2). The results demonstrate that performance expectancy, facilitating conditions, and flow have positive effects on behavior intention in VAFT courses, with flow showing a particularly strong effect. Furthermore, high-quality interactivity and high intensity of immersion indirectly influence behavior intention through students’ flow state. Based on the research findings, this study provides new development ideas and guidance recommendations for VAFT courses in higher education.
Introduction
Field trips are student-centered interactive teaching activities (Behrendt & Franklin, 2014) that contribute to enhancing students’ interest and motivation in learning (Springer et al., 2020). Art field trips, specifically, are field studies focused on art (Florick et al., 2021), where students have the opportunity to personally experience art exhibits in museums, art galleries, or theaters (Erickson et al., 2020). In art education activities, students can develop their creative abilities and enhance their imagination (Sawyer, 2017), gaining artistic inspiration through observation and tactile experiences (Watson, 2019). Shaby and Vedder-Weiss (2020) mentioned in their research that students exhibit higher enthusiasm and motivation while visiting museums and cultural heritage sites compared to other instructional activities. Furthermore, Erickson et al. (2024) found in their study that students’ enthusiasm for school and standardized test scores improved after participating in multiple art field trips.
Despite the potential of field trips, there are limitations that hinder their educational equity and inclusivity (Klippel et al., 2019; Zhao et al., 2020). For instance, not all schools can provide such teaching activities due to differences in school size, quality, and resources. Field trips also require students to have the necessary time, financial means, experience, and physical conditions, making it difficult for economically disadvantaged and disabled students to participate (Chiarella & Vurro, 2020; Giles et al., 2020). Over the past 3 years, society has yet to fully recover from the comprehensive impact of the COVID-19 pandemic, which has further exacerbated the burdens on schools, teachers, and students (Das et al., 2022; Reyes-Portillo et al., 2022). Schools have reduced students’ participation in large-scale travel and practical activities because of restrictions on social mobility (Garcia et al., 2023), which has also hindered the implementation of art field trips. Relevant research has indicated a reduction in student participation in art field trips (Erickson et al., 2020; Watson, 2023).
The United Nations Educational, Scientific, and Cultural Organization (UNESCO) indicated in 2023 that the application of technology, open educational resources, and distance education are promising methods for addressing educational pressures and disparities and for achieving equality and inclusivity (Global Education Monitoring Report Team, 2023). Current field trips therefore need to undergo a transition from traditional forms, methods, and tools to digital and blended ones (Olsen et al., 2020; Qiu et al., 2021). Massive Open Online Courses (MOOCs) are one such approach, but the effectiveness and completion rates of online learning are controversial due to the lack of face-to-face interaction between learners and teachers (Joshi et al., 2022). To sustain tradition and ensure that students have experiences that cannot be replicated in the classroom while meeting the unique needs of teachers and students (Evelpidou et al., 2021; Seifan et al., 2019), virtual field trips (VFTs; also known as virtual guided tours or virtual tours) have become a new choice for many schools.
Supported by digital technology, virtual reality (VR) has brought about a paradigm shift in education, and the application of VFTs in modern teaching is becoming increasingly widespread (Seifan et al., 2020), either as an alternative or supplementary teaching mode to traditional field trips (Han, 2021). VFTs supported by VR technology have demonstrated numerous advantages and educational benefits. Previous research has also indicated that combining traditional field trips with virtual field trips can enhance students’ learning experiences and outcomes (Zhao et al., 2020). Furthermore, incorporating immersive VR-based teaching approaches in pre-lesson activities has the potential to stimulate students’ situational interest in subsequent traditional coursework (Cheng, 2021).
Although VR technology is rapidly evolving, there remains a lack of empirical knowledge necessary for evaluating and optimizing immersive learning experiences (Klippel et al., 2020). In response to these challenges, it is crucial to identify strategies that alleviate constraints imposed by objective conditions on traditional field trips, thereby ensuring the benefits are accessible to all students. Considering the practical difficulties associated with traditional art field trips and the emerging trends of virtual art field trips (VAFT) supported by VR technologies in educational contexts, we propose the following research questions:
RQ1: What factors influence the behavioral intentions of design students to participate in VAFT courses?
To further understand students’ experiences within immersive virtual environments, this study also investigates:
RQ2: How do students’ psychological perceptions and experiential states affect their intentions to adopt VAFT courses?
Theoretical Framework and Research Hypotheses
Virtual Art Field Trips
In recent years, virtual technology has emerged as a widely utilized instructional tool within the educational domain (Makransky & Mayer, 2022; Nurlaela et al., 2025; H. Wang et al., 2023; X. Wang et al., 2024; B. Wu et al., 2020; W.-L. Wu et al., 2021). It has also gradually been employed as a guiding tool for tourism, cultural heritage, and museum experiences (Jiang et al., 2022; Serravalle et al., 2019). Through VFTs, students can observe wildlife within natural environments—an experience typically difficult to achieve in conventional science education settings (Filter et al., 2020).
Watson (2019) noted that participation in art field trips significantly contributes to students’ academic achievement and social development. For instance, students engaging in VFTs exploring archaeological sites of the Roman Empire demonstrated considerably improved motivation and academic performance compared to traditional teaching methods (Villena Taranilla et al., 2022). Furthermore, in fashion design courses, augmented reality (AR)-based instruction can enrich students’ learning experiences and enhance their capacity for solving complex problems (Elfeky & Elbyaly, 2021; Yip et al., 2019). In art history courses, the adoption of spherical video-based virtual reality (SVVR) instruction based on Self-regulated strategy (SRS) has helped students enhance their self-regulation and self-learning abilities in immersive learning environments (W.-L. Wu et al., 2021).
It is evident that whether through AR or VR technology, students can benefit from immersive contextual experiences, improving their learning experiences and enhancing their understanding of knowledge (Gu, Chen, Yang, et al., 2022; Udeozor et al., 2021). Virtual reality provides students with a three-dimensional interactive environment that helps simulate real sensory experiences, enhancing spatial perception, presence, and intrinsic motivation, which are highly effective for perceptual understanding (Wu et al., 2020). Song and Li (2018) mentioned in their research that through the application of virtual technology, students can experience design works that would be challenging to achieve in practical operations. Furthermore, compared to field trips, VR technology can enhance the efficiency of knowledge learning. Students have reported that through VR, they can recall the spatial configurations of buildings without physically being present at the site (Udeozor et al., 2021).
Immersive VR, by providing enhanced environmental control, offers learners a greater sense of autonomy, thereby resulting in improved perceptual learning outcomes (Makransky & Mayer, 2022). Cheng (2021) pointed out that using immersive VFTs can enhance students’ self-directed learning abilities, as they actively engage with learning materials within immersive VFT systems and interact with their peers. Virtual reality not only provides consumers with what they may expect in the real world but also helps tourists experience things that are not available in the real world during their travel (Loureiro et al., 2020).
Based on previous research findings, user experience emerges as critically important. Jose et al. (2017) highlighted that traditional field trip courses emphasize experiential learning. In contrast, VR facilitates the creation of virtual scenes from a first-person perspective, simulating interactive behaviors encountered in real-life environments, and generating a realistic experiential learning atmosphere (Han, 2021). This approach significantly enhances students’ sense of presence and provides an immersive learning experience (Cheng, 2021; Xie et al., 2019).
Immersion, identified as a prerequisite for achieving a state of flow, can foster positive user experiences within VR contexts (Michailidis et al., 2018). Jensen and Konradsen (2018) further emphasized that the sense of presence created by immersive experiences is a primary motivation for adopting VR in educational and other settings, with deeper immersion positively influencing learning outcomes. Such immersive experiences not only improve students’ learning engagement but also facilitate their entry into optimal states of flow (Li et al., 2025; Tai et al., 2022; H. Wang et al., 2023).
In summary, we argue that virtual art field trips (VAFT) represent an innovative instructional approach that combines virtual reality technology with traditional art field studies and exemplifies a novel educational paradigm integrating technological and experiential components. The Unified Theory of Acceptance and Use of Technology 2 (UTAUT2) has been well established as an effective theoretical framework for examining technology acceptance (Venkatesh et al., 2012) and has been widely applied in research investigating users’ intentions to adopt virtual technologies (Bower et al., 2020; Çalışkan et al., 2023; Du & Liang, 2024; Mütterlein et al., 2019; Nurlaela et al., 2025; Xie et al., 2024). Regarding user experience, which has consistently been a critical research concern (Huang et al., 2020), flow is regarded as a reliable indicator for evaluating user experience quality (Perttula et al., 2017). Centering on flow experience as the core construct, this study proposes an explanatory model that extends UTAUT2 in order to explore and analyze the factors influencing students’ behavioral intentions to participate in VAFT courses that use SVVR instructional materials.
Unified Theory of Acceptance and Use of Technology Model
The Unified Theory of Acceptance and Use of Technology (UTAUT) model is an integration of eight models: Theory of Reasoned Action (TRA), Technology Acceptance Model (TAM), Theory of Planned Behavior (TPB), a model combining TAM and TPB, the Motivational Model, the Model of PC Utilization, the Innovation Diffusion Theory, and the Social Cognitive Theory (Venkatesh et al., 2003). UTAUT is widely used to investigate the factors influencing users’ intention to use new imaging technologies in different contexts (Khechine et al., 2016). Initially introduced by Venkatesh et al. (2003), the UTAUT model has recently been applied to various studies on user behavior intention in education (Ogemdi Uchenna & Uzoma Oluchukwu, 2022). Four variables, namely facilitating conditions, performance expectancy, effort expectancy, and social influence, have been identified as key factors influencing behavior intention. Venkatesh et al. (2012) extended the UTAUT model by including consumer background variables (hedonic motivation, price value, and habit) and proposed a new extended model called UTAUT2, while removing voluntariness of use as a moderating variable and retaining gender, age, and experience. UTAUT2 is an integrative technology acceptance theory that includes important explanatory variables from existing technology acceptance models (Venkatesh, Morris, et al., 2003; Venkatesh, Thong, & Xu, 2012) and is commonly applied in educational research related to teaching (Dečman, 2015) and learning software usage (Chen, 2011). The UTAUT2 model consists of seven constructs, and theoretically, these constructs have an impact on the behavior intention to use technology (Harborth & Pape, 2020).
Du and Liang (2024), in a study using the UTAUT2 model to investigate teachers’ continued intention to use VR technology in classrooms, found that performance expectancy, effort expectancy, social influence, facilitating conditions, and hedonic motivation significantly influenced continued usage intention. However, social influence exhibited a weaker impact compared to the other four factors. Similarly, Xu et al. (2024) identified performance expectancy, effort expectancy (with effort expectancy as the strongest predictor), and hedonic motivation as significant determinants influencing teachers’ intentions to adopt AI tools in teaching. While these studies have identified important predictors, further exploration is warranted to comprehensively understand their roles in influencing adoption behavior. Based on the research objectives, the characteristics of VR technology, and the teaching needs of virtual art field trip courses, we believe that the constructs of performance expectancy, effort expectancy, facilitating conditions, and hedonic motivation are more closely related to students’ intention to use VAFT. Therefore, the research hypotheses are based on these factors.
First, many studies have confirmed the positive impact of performance expectancy on behavior intention (Bawack & Kamdjoug, 2018; Esawe, 2022; Yu et al., 2021). In a study involving users who utilized spherical video-based virtual reality (SVVR) to watch 360° documentary videos, 360° picture slide-shows, and played a 3-min game, researchers found that performance expectancy was a significant factor in determining users’ intention to use VR devices (Hartl & Berger, 2017). Additionally, research has indicated that performance expectancy has a positive influence on users’ intention to use Sport VR (Kunz & Santomier, 2019). Based on this, the study operationally defines performance expectancy as the level of perceived ability of students to use SVVR to complete virtual art field trip tasks in the VAFT course (Venkatesh et al., 2003) and proposes the following hypothesis:
Hypothesis 1 (H1): Students’ performance expectancy has a positive impact on their behavior intention to use SVVR instructional materials.
Furthermore, in the field of education, the positive impact of effort expectancy and facilitating conditions on behavior intention has been supported by numerous studies (Muniandy et al., 2022; Ogemdi Uchenna & Uzoma Oluchukwu, 2022). Rahmanu et al. (2022), in the context of foreign learners studying the Indonesian language, confirmed the significant influence of performance expectancy, effort expectancy, and facilitating conditions on the willingness to use spherical video-based immersive virtual reality (SV-IVR) during the learning process. Shen et al. (2017) also demonstrated in their study that performance expectancy, effort expectancy, and facilitating conditions have a positive and significant impact on students’ behavior intention to use virtual reality in their learning. In this study, effort expectancy is operationally defined as the level of perceived ease of mastering virtual technology when using SVVR instructional materials in the VAFT course, while facilitating condition is operationally defined as the level of perceived support in terms of relevant technologies and equipment for system operation when using SVVR instructional materials in the VAFT course (Venkatesh et al., 2003). Based on this, the following hypotheses are proposed:
Hypothesis 2 (H2): Students’ effort expectancy has a positive impact on their behavior intention to use SVVR instructional materials.
Hypothesis 3 (H3): Facilitating conditions have a positive impact on their behavior intention to use SVVR instructional materials.
Additionally, hedonic motivation directly influences behavior intention in many situations (Çera et al., 2020; García Botero et al., 2018; Salloum et al., 2019). Sitar-Tăut (2021) noted that if using technology systems brings entertainment or joy, students are more likely to use these technologies and achieve their learning goals. In a study on learning through Immersive Virtual Reality (IVR) games, Udeozor et al. (2021) found that students perceived enjoyment and ease of use to have a positive impact on their behavior intention to engage in IVR game-based learning. Shen et al. (2022), in the context of tourism education during the COVID-19 pandemic, confirmed hedonic motivation as an important predictor for Chinese students in adopting and using augmented reality and virtual reality applications. Therefore, this study operationally defines hedonic motivation as the level of perceived enjoyment of operational behavior when using SVVR instructional materials in the VAFT course (Venkatesh et al., 2012) and proposes the following hypothesis:
Hypothesis 4 (H4): Students’ hedonic motivation has a positive impact on their behavior intention to use SVVR instructional materials.
Flow Experience Based on VR Technology
Flow experience is defined as the overall sensation people feel when fully engaged in an activity (Csikszentmihalyi, 1975). Flow often leads to higher levels of enjoyment and sustained involvement in specific activities (Hsu et al., 2012), and learning in a state of flow is both pleasurable and successful (Nakamura & Csikszentmihalyi, 2014). Therefore, this study operationally defines flow as the degree to which students are engaged in the VAFT course when using SVVR instructional materials (Csikszentmihalyi, 1990). There is a considerable amount of research on flow in the context of virtual technology-based products and services, such as AR mobile games, VR online learning, VR online shopping, and VR tourism (Fan et al., 2022; Kim & Ko, 2019; Oliveira et al., 2021). For example, in terms of user experience, VR technology provides a higher level of vividness and interactivity compared to traditional media such as television, computers, or mobile phones (Kim & Ko, 2019; Nah et al., 2011). In higher education research, whether in online courses or virtual reality courses, flow experience has been shown to impact students’ learning outcomes (Tai et al., 2022). Chen et al. (2021) used AR-assisted (augmented reality-assisted) instructional materials in a basic design course, which not only led students to experience flow under the influence of flow antecedents but also helped them better understand formal structure knowledge. The design of interactive narrative based on AR German picture books had a direct positive impact on flow experience and indirectly influenced satisfaction with AR German picture books through the mediating effect of flow (Gu, Chen, Yang, et al., 2022).
In virtual art field trips, students are required to use VR devices for learning in artistic design. Therefore, factors related to VR technology and learning outcomes are of particular interest in this study. The key features of VR technology are high levels of immersion and interactivity (Makransky & Petersen, 2021; Petersen et al., 2022), while flow experience is closely related to learning outcomes. This study therefore posits that perceived interactivity, perceived immersion, and flow experience are closely associated with students’ willingness to use VR devices in the VAFT course. Perceived interactivity is operationally defined as the perceived extent of students’ ability to control interactive behavior using SVVR instructional materials in the VAFT course (Hoffman & Novak, 2009), and perceived immersion is operationally defined as the level of sensory fidelity that an SVVR instructional material provides (Bowman & McMahan, 2007).
Perceived interactivity typically refers to the extent to which students can influence the form or content of the virtual environment in a VR system (Xia & Hwang, 2020). Existing literature has indicated a close relationship between perceived interactivity and immersion (Bae et al., 2020; Joo & Yang, 2023; Komarac & Ozretić Došen, 2022). It is generally believed that being able to interact with the environment, rather than just passively observing it, creates a sense of presence within that environment (McMahan, 2013). Mütterlein (2018) confirmed in their study on VR experiences that perceived interactivity has a positive impact on immersion. Bae et al. (2020) demonstrated in their research on satisfaction with cultural heritage sites and brand loyalty that the interactivity of mixed reality has a positive effect on perceived immersion. Therefore, we propose that the perceived interactivity of SVVR in the VAFT course has a positive impact on students’ perceived immersion and present the following hypothesis:
Hypothesis 5 (H5): Perceived interactivity has a positive impact on perceived immersion.
Existing literature on flow emphasizes that good human-computer interaction can lead to higher levels of immersion and flow states (Animesh et al., 2011; Arghashi & Yuksel, 2022). Rodríguez-Ardura and Meseguer-Artola (2016) mentioned in their research that perceived interactivity is a key factor in students’ use of virtual learning environments and has a positive impact on flow states. Furthermore, Gu, Chen, Lin, et al. (2022) also highlighted that perceived interactivity is an important factor driving students to have a flow experience in the learning process. Therefore, this study posits that in the VAFT course, students’ high-quality interaction with SVVR can lead to a state of flow and proposes the following hypothesis:
Hypothesis 6 (H6): Perceived interactivity has a positive impact on flow experience.
Previous research has mentioned the close relationship between immersion and flow (Mütterlein, 2018; S.-H. Wu et al., 2021), with the highest level of complete immersion leading to a flow experience (Salar et al., 2020). For example, players immersed in VR games can reach a higher level of flow state (S.-H. Wu et al., 2021). IVR technologies enable students to experience full immersion and a sense of presence in a simulated environment (Udeozor et al., 2021), and complete immersion signifies that students have entered a flow state. In other words, a high level of immersion has a positive impact on flow. Guerra-Tamez (2023) also confirmed in their research on the learning experience of art and design students that immersive VR positively influences the flow experience. Based on this, this study posits that in the VAFT course, perceived immersion can facilitate students’ entry into a flow state in SVVR learning, and proposes the following hypothesis:
Hypothesis 7 (H7): Perceived immersion has a positive impact on flow experience.
Furthermore, previous studies have indicated that flow experience influences behavior intention (Kim & Hall, 2019; Liu et al., 2022). Wang et al. (2021) confirmed in their research on Virtual Reality Online Learning Systems (VROLS) that flow experience is the most important factor influencing learners’ intention to use VROLS. Based on this, we propose the following hypothesis:
Hypothesis 8 (H8): Flow experience has a positive impact on behavior intention.
According to the aforementioned hypotheses, this study proposed a hypothetical model consisting of eight dimensions (performance expectancy, effort expectancy, facilitating conditions, hedonic motivation, perceived interactivity, perceived immersion, flow, and behavior intention) and eight corresponding hypotheses, as illustrated in Figure 1.

Figure 1. Hypothetical model.
Methodology
Research Design and Questionnaire Design
Based on the literature review, this study considers the integration of VR technology with art field trip courses to be feasible. Virtual learning simulations are typically designed to replace or enhance real-world learning environments by allowing users to manipulate objects and parameters in a virtual environment, and students can observe scenes that are not observable in the real world (Makransky et al., 2019). This study chose SVVR as the instructional device for the art field trip course for its ease of operation and good interactivity, allowing students to obtain high-quality visuals and realistic immersive interactive experiences (W.-L. Wu et al., 2021). On this basis, the teaching experiment of the art field trip course was conducted during the second semester (2023–2024 academic year) in order to collect data. The SVVR learning materials used in the VAFT course were obtained from the UtoVR website (http://www.utovr.com/). In the teaching experiment, students first used the virtual environment of SVVR instructional materials for field trips and completed the learning tasks required by the course syllabus. Then, the course teacher required students to complete a survey questionnaire on-site during the teaching session. The questionnaire items used in this study were adapted from scales validated in previous research and were rated on a 7-point Likert scale (1 = strongly disagree to 7 = strongly agree). A 7-point scale was chosen because previous research has indicated that, compared to a 5-point Likert scale, it demonstrates superior reliability and discriminant validity (Preston & Colman, 2000). Reverse-worded questions were included in the questionnaire to check the validity of respondents’ answers, and respondents’ attentiveness was assessed through their response times to identify valid questionnaires. The sources of the measurement scales are presented in Table 1.
Table 1. Questionnaire Items.
Data Collection
The participants of this study were students from a design college at a Chinese university. A total of 598 questionnaires were distributed, out of which 525 valid responses were received, accounting for 87.8% of the total. Among the participants, there were 208 male students (39.6%) and 317 female students (60.4%). The majority of students were from the second and third years, with 214 students from the second year and 311 students from the third year. The students were mainly enrolled in five design disciplines, including digital media and fashion design. Additionally, a portion of the students (270 individuals) had previous experience using VR devices for learning or playing VR games, while some students (255 individuals) were encountering VR devices for the first time. Detailed descriptive statistics are presented in Table 2.
Table 2. Descriptive Analysis of Respondents.
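The sample figures reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated in the Data Collection paragraph; the helper names `percent` and `inconsistent_groupings` are illustrative and not part of the study.

```python
# Counts as reported in the Data Collection paragraph.
distributed = 598
valid = 525
groups = {
    "gender": {"male": 208, "female": 317},
    "year": {"second": 214, "third": 311},
    "vr_experience": {"prior": 270, "first_time": 255},
}


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


def inconsistent_groupings(groups, total):
    """Names of groupings whose subgroup counts do not sum to `total`."""
    return [name for name, counts in groups.items()
            if sum(counts.values()) != total]


print("valid response rate:", percent(valid, distributed))
print("male share:", percent(groups["gender"]["male"], valid))
print("female share:", percent(groups["gender"]["female"], valid))
print("groupings that do not sum to the valid total:",
      inconsistent_groupings(groups, valid))
```

Each grouping (gender, year of study, prior VR experience) partitions the same 525 valid respondents, so an empty list from `inconsistent_groupings` indicates the reported counts are internally consistent.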
Results
Normality Test
Prior to conducting structural equation modeling, the skewness and kurtosis values of all scale items were examined to ensure that the assumption of normality was not violated (Kline, 2023). The analysis was performed using SPSS Version 26.0. As shown in Table 3, none of the scale items had an absolute skewness value greater than 3 or an absolute kurtosis value greater than 8, indicating that the data do not depart severely from a normal distribution (West et al., 1995). Therefore, further analysis of the data can be appropriately conducted.
Table 3. Descriptive Statistics of the Measurement Items.
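The screening rule described above (skewness within 3 and kurtosis within 8 in absolute value) can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: it uses plain moment-based estimators, whereas SPSS reports bias-adjusted statistics, so values may differ slightly for small samples. The function names and the sample responses are hypothetical.

```python
def _central_moment(values, order):
    """Population central moment of the given order."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Moment-based skewness (third standardized moment)."""
    return _central_moment(values, 3) / _central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis (a normal distribution gives 0)."""
    return _central_moment(values, 4) / _central_moment(values, 2) ** 2 - 3.0


def within_normality_bounds(values, max_abs_skew=3.0, max_abs_kurt=8.0):
    """Apply the |skewness| <= 3 and |kurtosis| <= 8 screening rule."""
    return (abs(skewness(values)) <= max_abs_skew
            and abs(excess_kurtosis(values)) <= max_abs_kurt)


# Hypothetical 7-point Likert responses for a single item.
item_scores = [5, 6, 7, 4, 5, 6, 6, 7, 3, 5]
print(skewness(item_scores), excess_kurtosis(item_scores))
print(within_normality_bounds(item_scores))
```

In practice the check would be run once per questionnaire item, and any item falling outside the bounds would be flagged before structural equation modeling.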
Reliability Test
In this study, the reliability of the questionnaire was assessed using the corrected item-total correlation (CITC), Cronbach’s α if item deleted, and Cronbach’s α coefficient. After deleting FC2 and PIY1, the overall reliability of the questionnaire reached the acceptable standard. For the retained items, all subscales had Cronbach’s α values greater than .7, and α did not improve after deleting any further item, suggesting that no additional item deletion was necessary. Furthermore, all items had CITC values above .5, indicating good internal consistency of the scales used in this study. The overall data reliability was considered satisfactory, as presented in Table 4.
Table 4. Results of Reliability Test.
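For readers unfamiliar with the two reliability statistics, the sketch below shows how Cronbach’s α and the CITC are computed for one subscale. It assumes each item is stored as a list of respondent scores in the same respondent order; the function names and the example data are hypothetical and do not reproduce the study’s data.

```python
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha for a subscale; `items` is a list of item columns."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(items):
    """CITC: correlation of each item with the sum of the other items."""
    rows = list(zip(*items))
    return [pearson(column, [sum(row) - row[i] for row in rows])
            for i, column in enumerate(items)]


# Hypothetical responses of six students to a three-item subscale.
subscale = [
    [5, 6, 7, 4, 5, 6],
    [5, 7, 6, 4, 4, 6],
    [6, 6, 7, 3, 5, 5],
]
print(cronbach_alpha(subscale))
print(corrected_item_total(subscale))
```

Under the criteria stated above, a subscale would be retained when α exceeds .7 and every CITC value exceeds .5.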
Exploratory Factor Analysis
Exploratory factor analysis was conducted on the questionnaire items in this study, and the results are presented in Table 5. First, the KMO values for each scale ranged from .678 to .835, all exceeding the critical value of .5. Additionally, Bartlett’s test of sphericity yielded a significance level below .05, indicating the suitability of the data for further factor analysis (Norusis, 1992). Second, principal component analysis was employed to perform factor analysis on the items. The results revealed that each scale extracted only one factor with an eigenvalue greater than 1. Moreover, the cumulative variance contribution of each variable exceeded 60%. These findings indicate that the scales used in this study possess strong explanatory power (Kaiser, 1974). Furthermore, all items exhibited communalities greater than .5, and the factor loadings were all above .6, consistent with previous research recommendations. Based on the above, the scales utilized in this study demonstrate good unidimensionality (Kohli et al., 1998).
Table 5. Results of Exploratory Factor Analysis.
The level of significance is .05.
Confirmatory Factor Analysis
Convergent validity and discriminant validity are commonly used to assess the validity of a scale. In this study, AMOS V23.0 was employed to analyze the structural equation model. First, factor analysis was conducted to examine the loading of each item on its corresponding latent factor. The results, as shown in Table 6, indicate that all item factor loadings were above .5, suggesting that each variable is well explained by its items. Based on the factor loadings, the squared multiple correlations (SMC), average variance extracted (AVE), and composite reliability (CR) values were calculated for each construct. The SMC values exceeded .4, the AVE values were all greater than .5, and the CR values were all above .7, indicating good convergent validity of the scale (Hair et al., 2006).
Table 6. Results of the Convergent Validity Test.
The level of significance is .05.
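The convergent validity statistics can be derived directly from standardized factor loadings. The sketch below uses the standard formulas (SMC as the squared loading, AVE as the mean squared loading, and CR as the squared sum of loadings divided by that quantity plus the summed error variances) and assumes each item loads on a single construct. The function names and the example loadings are hypothetical; they are not the values in Table 6.

```python
def squared_multiple_correlations(loadings):
    """SMC of each item, assuming one standardized loading per item."""
    return [l ** 2 for l in loadings]


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_sum)


def has_convergent_validity(loadings):
    """Apply the cutoffs quoted in the text: loadings > .5, SMC > .4, AVE > .5, CR > .7."""
    return (all(l > 0.5 for l in loadings)
            and all(s > 0.4 for s in squared_multiple_correlations(loadings))
            and average_variance_extracted(loadings) > 0.5
            and composite_reliability(loadings) > 0.7)


# Hypothetical standardized loadings for one three-item construct.
example_loadings = [0.78, 0.81, 0.74]
print(average_variance_extracted(example_loadings))
print(composite_reliability(example_loadings))
print(has_convergent_validity(example_loadings))
```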
Next, in terms of discriminant validity, this study calculated the correlation coefficients between each pair of constructs and compared them with the square root of the AVE values for each variable. The results, as presented in Table 7, demonstrate that the correlation coefficients between constructs are smaller than the square root of the AVE values, indicating good discriminant validity (Fornell & Larcker, 1981). Furthermore, Carrión et al. (2016) mentioned that the Fornell–Larcker Criterion and cross-loading criteria are not very reliable for detecting discriminant validity problems. They suggested using the Heterotrait–Monotrait (HTMT) ratio of correlations, which has demonstrated sufficient reliability in model assessment. As shown in Table 8, the HTMT ratios exhibit good scores, all below .85, which aligns with the criteria recommended in previous studies (Kline, 2023). Therefore, the constructs examined in this study possess good discriminant validity.
Discriminant Validity: Fornell–Larcker Criterion.
Note. Diagonal (bold) entries are the square roots of AVE.
The level of significance is .05.
Discriminant Validity: Heterotrait–Monotrait Ratio (HTMT).
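Both discriminant validity checks can be sketched in a few lines. The Fornell–Larcker check compares each inter-construct correlation with the square roots of the two AVE values. The HTMT ratio divides the mean between-construct item correlation by the geometric mean of the two within-construct mean item correlations. The correlation matrix, construct labels, and function names below are hypothetical, and the HTMT function assumes all item correlations are positive and each construct has at least two items.

```python
from itertools import combinations
from math import sqrt


def fornell_larcker_ok(ave, corr):
    """ave: {construct: AVE}; corr: {(a, b): inter-construct correlation}."""
    return all(
        abs(r) < min(sqrt(ave[a]), sqrt(ave[b]))
        for (a, b), r in corr.items()
    )


def mean_within(R, items):
    """Mean correlation among the items of one construct."""
    pairs = list(combinations(items, 2))
    return sum(R[i][j] for i, j in pairs) / len(pairs)


def htmt(R, items_a, items_b):
    """HTMT ratio for two constructs, given item correlation matrix R."""
    between = [R[i][j] for i in items_a for j in items_b]
    mean_between = sum(between) / len(between)
    return mean_between / sqrt(mean_within(R, items_a) * mean_within(R, items_b))


# Hypothetical item correlations: items 0-1 measure A, items 2-3 measure B.
R = [
    [1.00, 0.70, 0.30, 0.28],
    [0.70, 1.00, 0.32, 0.30],
    [0.30, 0.32, 1.00, 0.64],
    [0.28, 0.30, 0.64, 1.00],
]
ratio = htmt(R, [0, 1], [2, 3])
fl_ok = fornell_larcker_ok({"A": 0.59, "B": 0.62}, {("A", "B"): 0.48})
print(f"HTMT={ratio:.3f} (below .85: {ratio < 0.85}), Fornell-Larcker ok: {fl_ok}")
```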
Lastly, this study assessed the model fit indices through confirmatory factor analysis (CFA), and the results are presented in Table 9. According to the criteria recommended by Hair et al. (2006), a model fits well if the chi-square to degrees of freedom ratio (χ2/df) is below 5, the RMSEA value is less than .08, the SRMR value is below .08, and the GFI, AGFI, NFI, and CFI values are all above .9. Additionally, we applied the common latent factor method (CCLFM) to test for common method bias by establishing a control model. The model fit indices did not change substantially between the CCLFM and CFA models: the change in RMSEA for CCLFM was less than .05, and the changes in GFI, TLI, NFI, CFI, and SRMR did not exceed the critical value of .1. This suggests that introducing the common method factor did not result in a significant improvement in the model. Therefore, no substantial common method bias was found in the data. In conclusion, the model in this study demonstrates a good fit.
Comparison of Fit Indices Between the Measurement Model and the CCLFM Model.
Note. χ2/df = normed chi-square; RMSEA = root mean square error of approximation; GFI = Goodness of Fit Index; TLI = Tucker–Lewis Index; NFI = Normed Fit Index; CFI = Comparative Fit Index; SRMR = standardized root mean square residual.
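For readers who want to see how two of the indices above are obtained, the sketch below computes the normed chi-square and RMSEA from a chi-square statistic, its degrees of freedom, and the sample size, and then applies the change limits used in the common method bias comparison. It assumes the common RMSEA formula with N − 1 in the denominator (some software uses N), treats both change limits as strict upper bounds, and uses hypothetical fit values and function names rather than the figures in Table 9.

```python
from math import sqrt


def normed_chi_square(chi2, df):
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """RMSEA point estimate; assumes the (n - 1) denominator convention."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def fit_changes_within_limits(cfa, clf, rmsea_limit=0.05, other_limit=0.1):
    """True if every index changes by less than its limit between models."""
    for name in cfa:
        limit = rmsea_limit if name == "RMSEA" else other_limit
        if abs(clf[name] - cfa[name]) >= limit:
            return False
    return True


# Hypothetical values for illustration only.
ratio = normed_chi_square(312.4, 174)
approx_error = rmsea(312.4, 174, 320)
cfa_fit = {"RMSEA": 0.050, "GFI": 0.91, "TLI": 0.95, "NFI": 0.92, "CFI": 0.96, "SRMR": 0.045}
clf_fit = {"RMSEA": 0.047, "GFI": 0.92, "TLI": 0.96, "NFI": 0.93, "CFI": 0.97, "SRMR": 0.041}

print(f"chi2/df={ratio:.3f}, RMSEA={approx_error:.4f}")
print("no substantial method bias:", fit_changes_within_limits(cfa_fit, clf_fit))
```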
Path Analysis
Table 10 presents the fit indices for the structural equation model. The χ2/df is below 5, and the values for GFI, TLI, NFI, and CFI are all above .9, meeting the recommended standards (Hair et al., 2006). Additionally, the RMSEA and RMR values are both below .05, which further supports a good fit for the model. All of the standard fit indices therefore meet the recommended criteria, both individually and in combination, confirming that the structural model has a good fit.
Results of Model Fit Test.
Note. χ2/df = normed chi-square; RMSEA = root mean square error of approximation; GFI = Goodness of Fit Index; TLI = Tucker–Lewis Index; NFI = Normed Fit Index; CFI = Comparative Fit Index; SRMR = standardized root mean square residual.
In this study, the bias-corrected percentile method was utilized to examine the direct and indirect effects of the influencing paths, and the results are presented in Table 11. The two-tailed significance of the H2 and H4 paths is greater than .05, indicating that the coefficients for these paths are not statistically significant. Apart from the H2 and H4 paths, all other paths in the hypothesized model are significant. There are positive relationships among the constructs in all the relational paths, as illustrated in Figure 2.
Results of Path Analysis.
The level of significance is .05.

Results of the structural equation model test.
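The bias-corrected percentile method mentioned above can be illustrated with a simple mediation example. The sketch below is not the AMOS procedure used in this study: it assumes observed (not latent) variables, ordinary least squares slopes, and synthetic data generated with a fixed seed. All function and variable names are hypothetical. The indirect effect is the product of the x → m slope and the m → y slope controlling for x, and the confidence limits are taken from bootstrap percentiles shifted by the bias-correction constant z0.

```python
import random
from statistics import NormalDist, mean


def indirect_effect(x, m, y):
    """a*b: slope of m on x, times slope of y on m controlling for x."""
    mx, mm, my = mean(x), mean(m), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    smm = sum((b - mm) ** 2 for b in m)
    sxm = sum((a - mx) * (b - mm) for a, b in zip(x, m))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    smy = sum((b - mm) * (c - my) for b, c in zip(m, y))
    a_path = sxm / sxx
    b_path = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a_path * b_path


def bc_bootstrap_ci(x, m, y, n_boot=500, alpha=0.05, seed=0):
    """Bias-corrected percentile bootstrap interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimate = indirect_effect(x, m, y)
    boots = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        boots.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    boots.sort()
    nd = NormalDist()
    # Bias-correction constant: how far the bootstrap median sits from the estimate.
    prop = sum(b < estimate for b in boots) / n_boot
    prop = min(max(prop, 1 / n_boot), 1 - 1 / n_boot)  # keep inv_cdf finite
    z0 = nd.inv_cdf(prop)
    z = nd.inv_cdf(1 - alpha / 2)
    lower = boots[int(nd.cdf(2 * z0 - z) * (n_boot - 1))]
    upper = boots[int(nd.cdf(2 * z0 + z) * (n_boot - 1))]
    return estimate, lower, upper


# Synthetic data with a built-in indirect path (illustrative only).
gen = random.Random(42)
x = [gen.gauss(0, 1) for _ in range(200)]
m = [0.6 * xi + gen.gauss(0, 1) for xi in x]
y = [0.5 * mi + 0.1 * xi + gen.gauss(0, 1) for xi, mi in zip(x, m)]

estimate, lower, upper = bc_bootstrap_ci(x, m, y)
print(f"indirect effect={estimate:.3f}, 95% BC CI=[{lower:.3f}, {upper:.3f}]")
```

An interval that excludes zero is read as a significant indirect effect at the chosen level, which is the logic behind the two-tailed significance values reported for the mediated paths.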
Discussion
The results of this study demonstrate that performance expectancy and facilitating conditions have a positive and significant impact on behavior intention (H1 and H3 are supported). In particular, the β value of facilitating conditions is considerably higher than that of performance expectancy (.309 > .154), indicating that facilitating conditions have a stronger influence on behavior intention. In other words, students in the VAFT course place greater importance on whether SVVR-related technologies, equipment, and resources can support their completion of the course content. This result aligns with previous research that emphasizes facilitating conditions as a prerequisite for behavior intention (Shen et al., 2017). Preparing the necessary equipment, resources, and technological support is essential for the system to be used and for users to engage in the intended behavior (Bervell & Arkorful, 2020; Emhmed et al., 2021). Additionally, ample resource support makes new technology systems easier to use, which generates a range of responses from users, regardless of their prior experience (Bervell & Arkorful, 2020; Emhmed et al., 2021). Bervell and Umar (2017a, 2017b) reported that the initial perception of and willingness to use a learning management system (LMS) for remote education are triggered by facilitating conditions, including the presence and availability of resources, support, and motivation. The more facilitating conditions exist, the higher the voluntary intention to use the system. For students in the VAFT course, virtual art field trips using SVVR equipment can only be implemented when the school has the necessary equipment, instructional materials, and technological support. The more abundant and accessible the relevant instructional materials and the easier it is to access SVVR equipment, the stronger the students’ motivation to try and use them.
Conversely, the frustration that arises from seeking the necessary resources and technical assistance for accessing the equipment can hinder students’ actual usage. Therefore, facilitating conditions serve as the foundation and a key influencing factor for students to engage in virtual art field trips.
Relatively speaking, although performance expectancy has a positive impact on behavior intention, its influence is not as strong as that of facilitating conditions. The primary reason for this lies in the variability of instructional materials and fluctuations in students’ subjective factors. On one hand, differences in instructional resources can impact both the content and quality of the course, which in turn may influence students’ learning outcomes (Ajoke, 2017; Hung et al., 2017). High-quality instructional materials provide rich content and engaging presentations, attracting students and making it easier for them to understand the material, thereby yielding positive learning outcomes. Conversely, poor-quality materials may have the opposite effect, causing students to lose interest and consider not participating in the VAFT course. On the other hand, there are individual differences among students (Huang, 2021; Liu, 2020; Mushtaq et al., 2021). Some students possess a rich imagination and stronger adaptability, allowing them to gain more knowledge through virtual artworks and achieve better learning outcomes. However, due to physiological differences, some students may experience issues such as dizziness, increased fatigue, or eye discomfort when engaging with VAFT, which may prevent them from attaining the expected learning effects. Fluctuations in these aspects contribute to variations in performance expectancy and the strength of its influence.
The influence of effort expectancy on behavior intention is not significant (H2 is not supported). This result contradicts previous studies (Al-Adwan et al., 2022; Muniandy et al., 2022; Nurlaela et al., 2025; Ogemdi Uchenna & Uzoma Oluchukwu, 2022), although a limited number of studies support it (Ayaz & Yanartaş, 2020; GC et al., 2024; Jasrai, 2025). This may be attributed to the distinct interaction methods of SVVR, such as gesture control, controller input, and a 360-degree interface, which differ significantly from conventional mouse, keyboard, and touchscreen operations (Abdelaziz et al., 2014). Compared to traditional devices, learning to operate and interact within an SVVR environment can be more demanding, requiring students to invest additional time and effort in adaptation (Erra et al., 2019). As a result, students tend to focus more on the system’s features, functionalities, and perceived benefits than on its ease of use or simplicity (Jasrai, 2025). These findings suggest that the use of SVVR in VAFT courses remains relatively novel, and students’ unfamiliarity with the technology, along with potential operational challenges during the adoption process, may weaken the influence of effort expectancy on their behavioral intentions.
Additionally, the influence of hedonic motivation on behavioral intention was found to be insignificant (H4 was not supported), and there is limited literature that aligns with this result (Benrahal et al., 2022; Gupta et al., 2018; Hwang & Mulyana, 2022). Although these studies are consistent with our findings, the study by Xu et al. (2024) revealed a significant effect of hedonic motivation on behavioral intention. In their research, university instructors believed that using AI tools introduced new learning and application challenges, which, in turn, stimulated their curiosity and desire for exploration, thereby increasing their motivation to adopt such tools. However, this relationship does not appear to be consistent across different user groups. Some studies have shown that hedonic motivation may not significantly influence students’ intentions to use educational tools (GC et al., 2024; Grassini et al., 2024). This suggests that students may focus more on the content being delivered rather than on how engaging or immersive the content is designed to be, resulting in a lack of perceived entertainment during the learning process. Moreover, learning is often regarded as a serious endeavor, and students are frequently under pressure to achieve academic success. This pressure can diminish the sense of enjoyment and reduce the perceived entertainment value of educational experiences.
Perceived interactivity has a significant positive impact on perceived immersion (H5 is supported). This result is supported by previous studies (Bae et al., 2020). Multisensory interaction allows students to immerse themselves in the virtual environment (Sanfilippo et al., 2022), enriching the atmosphere of artistic exploration and shaping the immersive experience. Perceived interactivity and perceived immersion have a significant positive impact on flow (H6 and H7 are supported), which is consistent with previous research findings (Makransky & Petersen, 2021; Petersen et al., 2022). On one hand, using VR in educational systems can enhance the interactivity of the learning process (Li et al., 2023), thereby improving students’ interactive experience and facilitating their entry into the state of flow. On the other hand, research has also indicated that VR-based educational systems provide a realistic and engaging learning environment, and the sense of immersion has a positive impact on students’ learning (Udeozor et al., 2021), making it easier for them to enter the state of flow. Furthermore, in the context of art courses using VAFT for artistic exploration, flow has a strong and significant impact on behavior intention (H8 is supported). This means that when students are in a state of flow, their learning is enjoyable and successful (Nakamura & Csikszentmihalyi, 2014). This finding echoes the results of Li et al. (2025), who identified flow as a key determinant of behavioral intention. The use of SVVR in VAFT courses changes the learning experience relative to traditional teaching methods, making it easier for students to enter a state of flow. As a result, students perceive that SVVR can bring them more benefits and are more willing to use it.
It is worth noting that perceived interactivity and perceived immersion have an indirect relationship with behavior intention through flow (β = .427, p < .001; β = .214, p < .001), indicating complete mediation. This means that the quality of interaction and immersion design in SVVR cannot directly influence students’ behavior intention. Even a favorable environment cannot guarantee students’ learning outcomes. The purpose of students using instructional tools for learning is to acquire knowledge (Dumiyanto et al., 2021; Nirbita & Sartika, 2021). Interactivity and immersion, by contrast, are more about creating an environment and facilitating teaching. They are external factors that play a facilitating role rather than a determining role, so they cannot by themselves guarantee students’ learning outcomes. If students cannot achieve their learning goals, they naturally cannot form the intention to use the tool. The state of flow, on the other hand, is a psychological state that directly affects one’s attention, intrinsic motivation, and ability to absorb knowledge (Conradty et al., 2023; Marty-Dugas et al., 2021). When students enter a state of flow during learning, it not only means that they are attracted to the instructional content but also indicates their recognition of the meaningfulness and value of artistic exploration, leading to a comprehensive understanding of the relevant knowledge and rapid absorption in an exhilarating state. Therefore, only when flow is achieved can learning outcomes be produced and directly influence behavioral intention. Although perceived interactivity and perceived immersion, as external factors, cannot directly influence behavioral intention, they are among the many facilitating conditions for the generation of flow. Therefore, the promoting and facilitating role of perceived interactivity and perceived immersion in SVVR should not be overlooked.
This ability to experience virtual field trips in different locations without leaving one’s home not only alleviates the pressures on schools in terms of safety, funding, and accountability but more importantly, it provides opportunities for students who are unable to participate in traditional field trips due to economic constraints or disabilities. These students can engage in VFTs and gain the same learning opportunities and benefits as their peers (Bonali et al., 2019; Chiarella & Vurro, 2020), thus promoting fairness and inclusiveness in art and design education.
Conclusions
This study contributes to the literature on technology adoption in higher education by exploring the influence of UTAUT2 variables and flow experience on students’ behavioral intentions within the context of VAFT. As immersive technologies such as SVVR become increasingly prevalent in educational settings, gaining a deeper understanding of the interaction between students’ cognitive engagement and instructional strategies can help foster greater motivation and initiative in future learning environments. The findings suggest that students using SVVR in VAFT courses place greater emphasis on the availability of adequate support conditions, the realization of expected learning outcomes, and the achievement of academic goals. Moreover, the richness of interaction and the high degree of immersion—serving as external facilitating factors—also significantly influence students’ intention to adopt SVVR. These findings provide a research foundation and guidance for the development of virtual reality art and design education, as well as important insights for teachers and the development of SVVR systems. The main recommendations include:
(1) In VAFT courses, schools and teachers need to provide adequate infrastructure and support conditions to meet the various needs of course implementation and student learning. This includes an adequate quantity of high-quality VR devices, content-rich instructional materials, and related technical support (e.g., support for interaction with artworks or for multi-person online communication).
(2) In the practical application of SVVR instructional materials, emphasis should be placed on delivering high-quality visual output to improve the quality of virtual content and align with students’ psychological expectations. By collecting students’ learning behavior data throughout the learning process, personalized feedback and guidance can be provided, further enhancing students’ engagement with SVVR and improving their overall learning outcomes.
(3) Enhancing the interactive quality and immersion of the learning content is an effective means to help students achieve a state of flow in their learning experience. Therefore, it is important to provide richer sensory interactions and realistic virtual learning environments. This study suggests that the benefits of VAFT in art and design education are mainly reflected in two aspects. Firstly, VR can help students experience virtual learning environments in a first-person perspective, which has a positive impact on cultivating students’ autonomous creativity and allows for diverse and unique teaching methods. Secondly, it enables students to immerse themselves in learning tasks for longer periods and achieve a state of flow, thereby enhancing their intention to use SVVR as a teaching aid.
Research Limitations and Future Studies
While this study explores and analyzes the flow experience of students participating in VAFT and its impact on their intention to participate in VAFT, there are some limitations that need to be considered in future research.
Firstly, the sample in this study primarily consists of students from the School of Design at a university in China, who possess specific characteristics in terms of professional background, learning styles, and acceptance of virtual technologies. As such, the generalizability of the findings may be limited when applied to students from other academic disciplines or educational contexts. Students from different fields may vary in their learning motivation, cognitive styles, frequency of technology use, and course participation behaviors, which could lead to different pathways influencing their flow experience and behavioral intention. Additionally, this study did not cover students from other grade levels, such as elementary or middle school students. The limited sample may not fully highlight the differences between college students and students at other stages in art and design field trips.
Secondly, there may be gender differences in the use of SVVR for learning. Future research could conduct comparative studies across a broader range of academic disciplines, educational levels, and gender groups to further validate the robustness and generalizability of the proposed model. Such efforts would provide more comprehensive theoretical support for the application of VAFT in interdisciplinary educational contexts.
Thirdly, it is recommended that future research integrate video-watching behavior into the VAFT learning system across different disciplines to encourage students to engage in higher-level learning and enhance their higher-order thinking skills, thereby broadening the horizontal scope of the research and supporting deeper understanding.
Finally, the current instructional materials remain primarily content-driven, lacking systematic design in terms of interactivity and immersion, which may limit students’ perceived enjoyment and flow experience. This limitation suggests that future research should place greater emphasis on enhancing the consistency and quality of instructional materials through improvements in content updates, interface interaction, and visual presentation. It is also important to consider students’ subjective factors—such as technology acceptance, learning motivation, and cognitive engagement—as these variables may have significant impacts on behavioral intention. Incorporating these factors into future models could enhance both the explanatory power and applicability of research findings, thereby contributing to more effective design and implementation of VAFT systems across diverse educational contexts.
Footnotes
Acknowledgements
Thanks to Jiangnan University School of Design for providing related support.
Ethical Considerations
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board of the Ministry of Social Science Changshu Institute of Technology (protocol code No. CIT MSS-E-2024-003).
Consent to Participate
Informed consent was obtained from all participants before the survey, and anonymity was assured. The authors confirmed that all participants involved in the study were informed of its purposes, potential outcomes, and their rights, including the right to withdraw at any point. They gave their consent to participate voluntarily.
Consent for Publication
Not applicable.
Author Contributions
Conceptualization, J.C.; Methodology, J.C.; Software, W.W.; Validation, J.C. and X.S.; Formal analysis, W.W. and J.B.; Investigation, X.S. and J.B.; Resources, J.C. and W.W.; Data curation, J.C.; Writing—original draft preparation, J.C.; Writing—review and editing, J.C. and J.B. All authors have read and agreed to the published version of the manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: the Research Start-up Fund for High-level Talents of Huaqiao University under the project “Research on the Construction of Visual Assets in Digital Media Art” (Project No. 24SKBS011), and the Undergraduate Education and Teaching Reform Project of Huaqiao University titled “Innovative Integration of Virtual Technology and AI-based Intelligent Interaction in Basic Art Education: A Case Study of the ‘Design Fundamentals’ Course” (Project No. HQJGYB2544).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
