Abstract
Artificial intelligence is revolutionizing music creation and consumption, yet sustained user engagement with AI-generated digital music platforms remains poorly understood. This study investigates the factors influencing users’ continuous usage intention for these platforms, drawing on Stimulus-Organism-Response (S-O-R) theory. We develop and empirically validate a research model that integrates multi-dimensional user experience with the dual mediating effects of emotional attachment and technology self-efficacy. Using Partial Least Squares Structural Equation Modeling (PLS-SEM) on 509 valid questionnaires, we find that all five user experience dimensions (sensory, social, entertainment, content, and functional) significantly influence both emotional attachment and technology self-efficacy.
Plain Language Summary
This study explores why people continue using AI-powered music creation platforms. Researchers found that users’ experiences on these platforms, especially social interactions and content quality, strongly influence their emotional attachment. Surprisingly, users’ confidence in using the technology doesn’t significantly affect their intention to keep using the platform. Instead, the emotional bond users form with the platform is the key factor in determining whether they’ll stick around. This challenges traditional ideas about technology adoption and suggests that AI music platforms should focus on creating meaningful, emotionally engaging experiences rather than just improving technical features. The findings could help developers design better AI music platforms that users love and want to use long-term.
Keywords
Introduction
Artificial intelligence is increasingly applied across the creative industries. In music in particular, generative AI is profoundly changing how music is created, produced, and distributed (Anantrasirichai & Bull, 2022; Ardeliya et al., 2024). According to GEMA’s calculations, the global market for AI-generated music was worth $300 million in 2023. By 2028, it is expected to grow more than tenfold, at a compound annual growth rate of about 60%, to exceed $3 billion (CISAC, 2024). Within just a few years, the AI music market would thus be equivalent to 28% of the global music copyright revenue recorded in 2022 (CISAC, 2024). As emerging technological applications, AI-generated digital music platforms not only provide music creators with unprecedented creative tools but also offer ordinary users new music experiences (Atanacković, 2024). Reported statistics indicate that 60% of musicians currently use AI tools to create music, 20.3% use artificial intelligence in the music production process, and 36.8% of music producers have incorporated AI into their workflows (Shalwa, 2024). Adoption is particularly strong among the younger generation: 51% of creators under 35 actively use AI to create music (Shalwa, 2024). However, the rapid development of AI-generated music has also sparked controversies and challenges. Notably, 82% of listeners find it difficult to distinguish AI-created music from human-created music, which reflects the progress of AI music technology (Shalwa, 2024). At the same time, music creators’ income may decrease by 27% by 2028, highlighting the urgent need for balanced regulation of AI music creation (Shalwa, 2024). Given these complex phenomena, users’ behavior on AI-generated digital music platforms needs closer examination in order to understand the actual impact and value of these platforms.
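As a quick consistency check on the market figures cited above, the following minimal sketch compounds the 2023 base forward at the stated growth rate. It assumes five full years of compounding (2023 to 2028) and treats "about 60%" as exactly 0.60; the function name `project_market` is illustrative and not taken from any cited source.

```python
def project_market(base_musd: float, cagr: float, years: int) -> float:
    """Compound a base market size (USD millions) at a constant annual rate."""
    return base_musd * (1.0 + cagr) ** years


base_2023 = 300.0      # USD millions, as cited in the text
cagr = 0.60            # "about 60%" per the text, taken as exact here (assumption)
years = 2028 - 2023    # five compounding periods (assumption)

size_2028 = project_market(base_2023, cagr, years)
growth_multiple = size_2028 / base_2023

print(f"Projected 2028 size: ${size_2028 / 1000:.2f} billion")
print(f"Growth multiple: {growth_multiple:.1f}x")
```

Under these assumptions the projection comes to roughly $3.15 billion, about 10.5 times the 2023 base, which agrees with the "more than tenfold" and "exceed $3 billion" statements.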
In recent years, academic attention to artificial intelligence music creation has grown significantly, with research focus gradually shifting from early technical evaluation to in-depth exploration of user psychology and behavior. Early systematic reviews primarily concentrated on technical trends and capabilities of AI music generation systems (Civit et al., 2022; W. Yang et al., 2024), while research on factors influencing users’ continuous use of AI-generated digital music platforms remained insufficient. However, emerging studies have begun to reveal the complex social and psychological dimensions of AI-generated digital music platforms. Suh et al.’s (2021) research suggests that AI can play the role of a “social glue” in social music creation, affecting human social dynamics and creative processes, indicating that users’ continuous usage intention may be influenced by complex social and psychological factors rather than solely depending on technological characteristics. Lin and Chen (2024) further point out that AI-generated music not only provides new possibilities for creators but also offers economical solutions for general music playback needs, expanding our understanding of user motivations.
Recent research has identified users’ psychological perceptions as an unavoidable key factor, revealing several critical phenomena. A significant finding is the existence of “AI bias”: when listeners believe music is created by AI, they tend to give lower evaluations, even when users cannot distinguish differences in audio quality (Ansani et al., 2025; Candusso, 2024). Furthermore, although evidence suggests that AI music can evoke emotional responses from listeners (van Schaik, 2025), listeners generally believe that human-created music has greater advantages in the “authenticity” and “soulfulness” of emotional expression, representing a core challenge facing AI music (Lecamwasam & Chaudhuri, 2025; Q. Li & Ma, 2024). These findings highlight the complex interplay between technological capabilities and human psychological responses in AI music consumption.
While existing literature has made progress in user perception and emotional arousal, a critical research gap remains that limits our in-depth understanding of user behavior on AI-generated digital music platforms. Most studies focus on users’ one-time evaluations of individual musical works (S. Wang et al., 2021), yet fail to explain what factors drive users to use AI-generated digital music platforms continuously and long-term, given known biases and emotional limitations. Previous research has emphasized technical functions and immediate perceptions (Mokoena & Obagbuwa, 2025), while deeper factors such as emotional connections and psychological experiences formed through users’ continuous interactions with platforms have been largely overlooked. This research deficiency creates two main problems: first, inability to identify key factors driving continuous user engagement, bringing uncertainty to platform development; second, limited understanding of specific user psychological mechanisms, constraining in-depth prediction of user behavior. Given the unique human-machine interaction model of AI-generated digital music platforms, exploring potential mediating factors becomes particularly important. For instance, high-quality user experience may promote continuous usage by enhancing users’ emotional attachment to the platform (Xiang et al., 2022), while good functional design may increase usage intention by improving users’ technology self-efficacy (Y.-Y. Wang & Chuang, 2024).
Revealing these mechanisms not only contributes to theory construction but also has significant implications for practitioners. For developers and operators of AI-generated digital music platforms, user retention is a key indicator of platform success. By understanding the chain of relationships linking user experience, psychological states, and usage intentions, platforms can optimize their design in a more targeted way, strengthening users’ emotional connections and self-efficacy and thereby improving retention rates. For example, if social experience is found to significantly affect continuous usage intentions by enhancing emotional attachment, platforms can focus on improving social function design; if technology self-efficacy proves to be an important mediating factor, platforms can strengthen user training and guidance to help users better master AI music creation tools. Such mechanism-level research can explain not only “what” factors influence users’ continuous usage intentions but also “why” and “how” they do so, providing richer insights for both theoretical research and the practical application of AI-generated digital music platforms. Based on the above discussion, this study aims to fill the gaps in existing research by exploring in depth the factors that influence users’ continuous use of AI-generated digital music platforms and the mechanisms through which they operate. To this end, we propose the following research questions (RQs):
To address these research questions, this study adopts the S-O-R (Stimulus-Organism-Response) framework to construct a comprehensive research model where multiple user experience dimensions (sensory, social, entertainment, content, and functional) serve as stimuli, emotional attachment and technology self-efficacy as internal organism states, and continuous usage intentions as the response. This model investigates antecedents of continuous usage intentions and reveals underlying influence mechanisms. The study contributes by constructing a multi-dimensional user experience framework that enriches AI-generated digital music platform theory, examining mediating roles of emotional attachment and technology self-efficacy to reveal internal psychological processes of continuous usage intention formation, and applying S-O-R theory to AI-generated digital music platforms, expanding theoretical applicability in emerging technology fields. Through this research, we provide theoretical insights and practical suggestions for platform development, helping designers and operators better understand user needs and promote healthy development of AI music technology.
Theoretical Background
S-O-R Theory
One of the theoretical foundations of this study is the S-O-R (Stimulus-Organism-Response) model proposed by Mehrabian and Russell (1974). This model describes the relationships between external stimuli (Stimulus), internal human states (Organism), and behavioral responses (Response), providing a framework for explaining how environmental factors influence human psychology and behavior. We selected this theory because it fits the research context of this study closely. AI-generated digital music platforms are not merely tools but complex environments that provide rich sensory, social, and emotional experiences. The “Stimulus (S)” dimension of S-O-R theory can readily accommodate these multidimensional user experiences as inputs. More critically, its “Organism (O)” component opens up the theoretical “black box” between stimulus and response, allowing us to explore how users’ internal psychological states, including emotions (such as emotional attachment) and cognition (such as technology self-efficacy), are activated by these external experiences. Finally, the “Response (R)” component connects these internal psychological changes with external behavioral intentions (continuous usage intention), forming a complete causal chain. The process-oriented and integrative perspective of S-O-R theory, which can simultaneously accommodate experience, emotion, and cognition, therefore makes it a well-suited theoretical tool for interpreting user behavior on AI-generated digital music platforms.
In the context of this study, S-O-R theory provides a theoretical perspective for exploring user experience and continuous usage behavior. External stimuli (S) refer to the platform’s multidimensional user experiences, internal states (O) refer to users’ levels of emotional attachment and technology self-efficacy, and behavioral responses (R) refer to users’ continuous usage intention. Users perceive different levels of experiential stimuli in their interactions with the platform, triggering changes in emotional connection and self-efficacy, ultimately driving their behavioral decisions to continue using the platform. Numerous empirical studies have confirmed the explanatory power of S-O-R theory for user behavior in digital product/service contexts (Kabadayi et al., 2023; Patel et al., 2024). However, existing literature mostly focuses on general digital products/services and still lacks systematic analysis of user behavior mechanisms in generative AI contexts. The unique attributes of AI-generated content platforms, such as human-machine co-creation, algorithmic recommendation, and intelligent interaction, may have differential impacts on S-O-R mechanisms. This study aims to expand the S-O-R framework by incorporating novel user experience dimensions and special mediation mechanisms in AI contexts, deeply exploring the user experience-response chain of AI-generated digital music platforms, with the goal of enriching the application of S-O-R theory in algorithm-driven intelligent service domains.
User Experience Theory
User Experience is an interdisciplinary research topic spanning human-computer interaction, cognitive psychology, and design studies. ISO (2019) defines it as “user’s perceptions and responses that result from the use or anticipated use of a product, system or service,” emphasizing that user experience originates from human-system interaction and encompasses the entire experience chain. Norman’s (2013) three-level emotional design theory provides a framework for understanding user experience, pointing out that user experience includes visceral, behavioral, and reflective levels, featuring multi-level and multi-dimensional characteristics. McCarthy and Wright (2004) further proposed a four-element user experience framework, including compositional, sensual, emotional, and spatio-temporal elements, providing theoretical support for characterizing the multi-dimensionality of user experience. Additionally, Pine and Joseph’s (1998) four-dimensional model of tourist experience and Oh et al.’s (2007) comprehensive tourist experience framework reveal the interaction of multiple elements in experience scenarios, offering insights for understanding the user experience of digital products and services.
However, existing research mainly focuses on general products and services, and rarely examines the continuous usage of generative AI products from the perspective of experience dimensions. Focusing on AI-generated digital music platforms, this study reconstructs the user experience framework along five dimensions: sensory experience, social experience, entertainment experience, content experience, and functional experience, combining the unique attributes of generative AI platforms with classical user experience theories. Sensory experience stems from Norman’s visceral level and McCarthy’s sensual element (McCarthy & Wright, 2004; Norman, 2013). AI-generated digital music platforms create rich audiovisual experiences through intelligent algorithms and immersive design, with personalized music recommendations and intelligent timbre matching stimulating users’ emotional responses and shaping their impressions of, and emotional connections to, the platform. Social experience echoes the compositional element of McCarthy’s model (McCarthy & Wright, 2004) and aligns with the platform’s social attributes. This dimension refers to interactions between users (user-user interaction), in which users communicate, share, and collaborate within the platform on the basis of common musical interests. These social functions promote a sense of belonging and emotional attachment, making them a key factor in user retention. Entertainment experience draws on experience economy theory (Pine & Joseph, 1998). The platform satisfies users’ auditory enjoyment and creative pleasure through algorithmic generation and interactive experiences, and a high-quality entertainment experience drives continuous use. Content experience originates from the platform’s core value proposition, namely its massive, diverse, and high-quality body of AI-generated music works. The innovation, richness, and matching degree of content generation algorithms directly affect users’ willingness to use the platform. AI technology decentralizes content production, making content personalization and the release of user creativity key issues. Functional experience is based on Norman’s behavioral level (Norman, 2013). The platform provides diverse tools for music creation, editing, and sharing, and the usability, practicality, and stability of these functions affect users’ task efficiency and overall experience. In the context of AI-empowered music creation, platforms need to provide low-threshold, personalized, and intelligent functional support that helps users unleash their creativity and collaborate with the machine. The quality of functional experience therefore has a crucial impact on users’ continuous usage intention.
In summary, the five dimensions of sensory experience, social experience, entertainment experience, content experience, and functional experience constitute the core user value proposition of AI-generated digital music platforms, reflecting their digital, intelligent, social, and entertainment characteristics. Although existing user experience theories provide a theoretical basis for this dimensional division, they are mainly limited to general digital service contexts and lack targeted examination of the unique experience mechanisms of generative AI platforms. This study aims to verify the applicability of this multi-dimensional experience framework through empirical analysis, enriching the application of user experience theory in the field of AI-generated creative services.
Emotional Attachment
Emotional attachment theory originates from attachment theory in the field of psychology, first proposed by British psychologist Bowlby (1969) when studying mother-infant relationships. He believed that an infant’s attachment to its mother is a natural response aimed at reducing distance with the attachment figure to enhance a sense of security, with an emotional bond existing between the two. Shultz (2007) extended attachment theory to the marketing field, discovering that this attachment emotion exists not only between people but also between people and objects. Subsequently, scholars have defined the connotation of emotional attachment based on different application contexts. For example, Park et al. (2006) distinguished emotional attachment from satisfaction and usage preference, defining it as the emotional bond connecting an individual to a brand. Maulana and Eckhardt (2007) found that users’ emotional attachment to websites stems from long-term human-computer interaction processes. M. Luo (2022) studied the influence of virtual experience marketing on consumer purchasing behavior, exploring how virtual interactions affect users’ attachment to online environments. Z. Li (2023) examined the impact of tourists’ participation in value co-creation on experience value and revisit intention, emphasizing how user participation in services promotes emotional attachment and drives long-term behavior. Emotional attachment can form between people and between people and objects. It acts like a bond, closely connecting users with products and users with other users and has a subtle influence on users’ continuous usage behavior. Combining existing research, this paper defines emotional attachment as: the long-term emotional experience and connection formed between users and the platform during the use of AI-generated digital music platforms. 
In essence, emotional attachment is an affective and relational construct, focusing on the user’s emotional bond with the platform, which is conceptually distinct from a user’s cognitive evaluation of their own operational skills.
Existing research confirms that many factors influence the formation of users’ emotional attachment, such as self-realization, positive emotions, social identity, and user experience. Thomson (2006) proposed that when consumers satisfy their needs and achieve self-improvement through products, they develop an attachment to the brand. Bartels and Zeki (2004) pointed out that if a product or brand can bring positive emotions, it will evoke users’ emotional attachment. Stokburger-Sauer et al. (2012) introduced social identity theory, finding that consumers’ perceived brand-self congruity and brand distinctiveness significantly affect their attachment to the brand. Meanwhile, the formation of emotional attachment has a positive impact on user behavior change. Hung (2014) demonstrated that brand attachment in online communities effectively promotes brand loyalty and word-of-mouth behavior, revealing that emotional attachment represents a deep bond formed through long-term use rather than short-term responses. This bond strengthens user-brand stickiness over time. Traditional research examining user experience and behavioral intentions primarily focused on satisfaction and trust as mediating variables, with emotional attachment receiving less attention. However, AI-generated digital music platforms offer unique value propositions—immersive experiences, personalized recommendations, and intelligent creative assistance—that facilitate stronger emotional connections through sustained platform interactions. Therefore, this study introduces emotional attachment as a mediating variable to explore how users’ multi-dimensional experiences influence continuous usage intentions.
Technology Self-Efficacy
Self-efficacy, proposed by American psychologist Bandura in social cognitive theory, refers to an individual’s belief and judgment about successfully executing a task, and is an important factor influencing behavior. Davis et al. further proposed Computer Self-efficacy, which is an individual’s judgment of their ability to use computers to complete specific tasks. With technological development, researchers have derived concepts such as mobile self-efficacy and internet self-efficacy. Technology self-efficacy is an extension of computer self-efficacy in specific technological fields. McDonald and Siegall (1992) first proposed this concept, referring to an individual’s level of confidence in their ability to learn and apply new technologies. Since then, numerous studies have explored its measurement, antecedent variables, and outcome variables. For example, Y. Wang et al. (2024) found in the context of English as a Foreign Language (EFL) teaching that technology self-efficacy significantly predicts students’ acceptance of educational technology. Xu et al. (2024) studied its impact on Information and Communication Technology (ICT) usage behavior in the education field in the post-pandemic era, emphasizing the importance of cultivating technology self-efficacy. In this study, technology self-efficacy refers to users’ evaluation and confidence level in their ability to use AI-generated digital music platforms, reflecting their sense of control over platform functions and ability to deal with problems. Crucially, it is a cognitive construct centered on the user’s internal assessment of their own capabilities, which is theoretically different from the affective connection they may form with the platform itself. In AI-generated digital creative platforms, users are not only content receivers but also creative participants. 
Faced with massive music element libraries, intelligent creative tools, and diverse templates, users need to possess certain musical literacy and technical abilities, both of which jointly determine their technology self-efficacy.
Technology self-efficacy has a significant impact on users’ adoption of new technologies and platforms. Igbaria (1995) found in the context of PC use that technology self-efficacy has a positive effect on perceived ease of use and usage intention. M.-H. Hsu and Chiu (2004) also pointed out that internet technology self-efficacy is a key factor influencing continuous usage intention of electronic services. The stronger the users’ technology self-efficacy, the more they tend to believe they can obtain the desired experience, thus being more willing to try and continue using related technologies. Recent research further reveals its role in the field of educational technology. Y. Wang et al. (2024) found that technology self-efficacy significantly predicts Chinese EFL learners’ acceptance of educational technology, outperforming other psychological factors such as achievement emotions. Xu et al. (2024) found perceived trust mediates the relationship between technology self-efficacy and ICT usage behavior, with perceived security and electronic word-of-mouth as moderators, emphasizing the importance of cultivating technological confidence. However, existing research focuses mainly on general information systems. In AI creative platforms, users must efficiently collaborate with AI for creation, posing new challenges to technology self-efficacy. AI technology lowers music creation barriers by enabling creation without professional knowledge, enhancing self-efficacy. Conversely, complex AI functions, massive material tags, and frequent recommendations may increase cognitive load, weakening technology self-efficacy. This study explores technology self-efficacy’s unique mechanism in AI-generated digital music platforms, providing theoretical support for optimizing platform design and user retention.
Continuous Usage Intention
Continuous Usage Intention is a key concept in information systems research, referring to users’ subjective willingness to continue using a technology or service after initial use (Bhattacherjee, 2001b). In the field of digital platforms and online services, continuous usage intention is crucial for the long-term success of platforms, as it directly affects user retention rates and platform vitality (Cho, 2016). Its theoretical foundation can be traced back to the Technology Acceptance Model proposed by Davis et al. (1989), which emphasizes perceived usefulness and perceived ease of use as key factors in users’ acceptance of new technologies. However, as research deepened, scholars found that initial acceptance and continuous use are two different behavioral decision processes. Based on this, Bhattacherjee (2001b) proposed the IS Continuance Model, integrating Expectation-Confirmation Theory, emphasizing the impact of user satisfaction and perceived usefulness on continuous usage intention. In recent years, with the iteration of technology and changes in user needs, the research framework of continuous usage intention has been continuously expanding. Kim and Oh (2011) pointed out that hedonic motivation plays an important role in users’ continuous use decisions. Hsiao et al. (2016) further explored the role of social factors in the continuous use of mobile social applications, finding that social interaction and social influence significantly affect user willingness. In the context of artificial intelligence and generative technology, H. Li et al. (2024) revealed the importance of innovation and personalized experiences for users’ continuous use of AI-generated services.
This study proposes a multi-dimensional user experience model based on the S-O-R framework for AI-generated digital music platforms’ continuous usage intention. S includes sensory, social, entertainment, content, and functional experiences. O comprises emotional attachment (user-platform connection) and technology self-efficacy (confidence in AI music tools). R is continuous usage intention. The model suggests experience dimensions directly and indirectly affect usage intention through emotional attachment and technology self-efficacy, considering both traditional information system factors and AI’s unique impact on music creation.
Research Model and Hypotheses
Research Model
The integrated model explores influencing mechanisms of users’ continuous usage intentions on AI-generated digital music platforms. It consists of three parts: S with five user experience dimensions covering various platform experiences; O with emotional attachment and technology self-efficacy as mediating variables; and R with continuous usage intentions as the dependent variable. The model hypothesizes that user experience dimensions directly and indirectly affect continuous usage intentions through the mediating variables, as shown in Figure 1.

Figure 1. The theoretical model.
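To make the structure of the model concrete, the sketch below enumerates its paths as plain data. The construct names are shorthand labels introduced here for illustration, and the inclusion of direct stimulus-to-response paths follows the statement that experience dimensions affect continuous usage intentions "directly and indirectly"; it is an assumption about the model's layout, not a reproduction of the paper's numbered hypotheses.

```python
from itertools import product

# Shorthand labels for the constructs named in the text (illustrative names).
STIMULI = ["sensory", "social", "entertainment", "content", "functional"]
ORGANISM = ["emotional_attachment", "technology_self_efficacy"]
RESPONSE = "continuous_usage_intention"


def build_paths(include_direct: bool = True) -> list[tuple[str, str]]:
    """Return the structural paths of the S-O-R model as (source, target) pairs."""
    # Each experience dimension points to each internal state (S -> O).
    paths = [(s, o) for s, o in product(STIMULI, ORGANISM)]
    # Each internal state points to the behavioral response (O -> R).
    paths += [(o, RESPONSE) for o in ORGANISM]
    # Optional direct effects of experience on the response (S -> R).
    if include_direct:
        paths += [(s, RESPONSE) for s in STIMULI]
    return paths


paths = build_paths()
for source, target in paths:
    print(f"{source} -> {target}")
print(f"Total paths: {len(paths)}")
```

With direct effects included, the enumeration yields 17 paths (10 from stimuli to organism states, 2 from organism states to the response, and 5 direct); calling `build_paths(include_direct=False)` yields the 12 paths of a fully mediated variant.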
Research Hypotheses
The Impact of User Experience Dimensions on Emotional Attachment
On AI-generated music platforms, the multi-dimensional nature of user experience significantly influences emotional attachment. This study examines five key dimensions of user experience: sensory experience, social experience, entertainment experience, content experience, and functional experience. Sensory experience refers to the visual, auditory, and tactile impressions users perceive while interacting with the platform. Hu and Kim’s (2024) research on Suno AI shows that users’ sensory experience with AI-generated music significantly enhances their emotional connection to the platform: visual design, interface layout, and interactive elements improve users’ sense of immersion, fostering stronger emotional attachment. Social experience refers to users’ interactions with other platform users, including communication, sharing, and a sense of community belonging. Ali et al. (2024) found that social interaction and community support are important sources of emotional attachment, especially in the context of AI-generated content. Entertainment experience captures the fun and pleasure users derive from the platform. Chu et al.’s (2022) study shows that highly entertaining AI-generated digital music platforms attract repeated use. Content experience encompasses the quality, innovation, and personalization of the music content provided by the platform. Gu (2024) emphasizes that the uniqueness and artistic value of AI-generated content greatly influence users’ emotional attachment. Functional experience covers the music creation tools, editing functions, and personalized recommendations provided by the platform. W. Li (2025) found that the diversity and ease of use of platform functions directly affect users’ emotional attachment.
Based on these research findings, each user experience dimension influences users’ emotional attachment in unique and significant ways through platform design, social interaction, entertainment value, content quality, and functional completeness. Therefore, this study proposes the following hypotheses:
The Impact of User Experience Dimensions on Technology Self-Efficacy
Hu and Kim’s (2024) research on Suno AI shows that sensory design can significantly boost users’ confidence. Specifically, the design of the user interface, page layout, and color scheme affect the smoothness of users’ interaction with the platform, thereby enhancing their confidence and perceived ability with AI technology. In generative AI platforms, when users interact pleasantly with visual and auditory elements, their technology self-efficacy is significantly improved. Ali et al. (2024) found that social experience can significantly enhance users’ technology self-efficacy. By sharing music or participating in discussions with other creators, users receive feedback and support. This interaction not only boosts confidence but also improves their sense of control over technical operations. Chu et al.’s (2022) research indicates that entertainment experience plays an important role in enhancing technology self-efficacy. While enjoying the process of AI-generated music creation, users not only improve their proficiency in using platform tools but also strengthen their confidence in the technology. Gu (2024) emphasizes that high-quality, personalized content is crucial in enhancing technology self-efficacy. When the platform provides quality music content that matches users’ interests, users feel more at ease and confident when creating and using platform technology. W. Li (2025) found that the usability and diversity of platform functions directly affect users’ technology self-efficacy. In AI-generated digital music platforms, well-designed creation tools with comprehensive functions help users complete creative tasks efficiently, thus enhancing their confidence in using the technology.
The five user experience dimensions—sensory experience, social experience, entertainment experience, content experience, and functional experience—significantly influence users’ technology self-efficacy through different mechanisms. Based on this, this study proposes the following research hypotheses:
The Impact of Mediating Variables
Emotional attachment plays a crucial role in influencing users’ continuous usage intention. Research by D. Wang et al. (2015) shows that emotional attachment can significantly drive continuous usage intention on social network platforms. They found that when users establish an emotional connection with the platform, they are more likely to use it long-term. Especially in social platforms, emotional investment promotes user loyalty, further driving their continued use behavior. Recent studies have also confirmed the importance of emotional attachment in various technology platforms. Vafaei-Zadeh et al. (2024) found in their research on personal cloud storage services that emotional attachment is one of the key factors affecting users’ continued use. Users’ emotional investment and dependence on the platform directly influence their behavioral intentions, namely continuous usage intention. This indicates that emotional attachment plays a decisive role in user decision-making. In the context of AI-generated digital music platforms, emotional attachment can enhance users’ stickiness to the platform, promoting the formation of stronger continuous usage intention. Users’ emotional connection to AI-generated music platforms may stem from personalized music creation experiences, community interactions, and dependence on the platform’s unique features, all of which collectively promote long-term user engagement.
Technology self-efficacy also has a significant positive impact on users’ continuous usage intention. Research by Seridaran et al. (2024) reveals the mediating role of technology self-efficacy in mobile application continuous usage intention. They found that when users are confident in their technical operation of the platform, they are more likely to perform tasks on the platform and use it long-term. This sense of self-efficacy makes users feel in control of platform functions, thereby enhancing their intention for continued use. Liu et al.’s (2022) research in the field of mobile health services further emphasizes the importance of technology self-efficacy. They point out that technology self-efficacy is closely related to users’ sense of self-control and capability on the platform. When users feel they can effectively use platform functions, they are more likely to continue using the platform. In the complex environment of AI music creation, users’ technology self-efficacy may have a profound impact on continuous usage behavior. High levels of technology self-efficacy give users more confidence to explore advanced platform features, overcome technical challenges, and maintain active participation in music creation.
As mediating variables, emotional attachment and technology self-efficacy promote users’ continuous usage intention in different ways: the former strengthens users’ emotional connection and loyalty, whereas the latter strengthens their technical confidence and operational ability. The combined effect of the two has a significant impact on the long-term engagement of users on AI-generated digital music platforms. Based on this, this study proposes the following hypotheses:
Mediating Effect
Emotional attachment, as a key factor in establishing long-term relationships between users and platforms, has been proven to have a significant mediating effect in multiple studies. Xiang et al. (2022) found that emotional attachment significantly influenced continuous usage intention by enhancing users’ emotional connection to tourism platforms. This provides important insights for understanding the role of emotional attachment in AI-generated digital music platforms. Regarding sensory experience, Z. Wang and Wang (2023) verified how interface design factors influence users’ continuous usage intention through emotional attachment. They found that high-quality visual and tactile experiences can enhance users’ emotional connection, thereby promoting usage intention. The mediating effect of social experience was supported by J. Luo et al.’s (2020) research, which showed that social interactions in virtual communities significantly influence users’ willingness to continuously share knowledge by enhancing emotional attachment, emphasizing the importance of social experience in AI-generated digital music platforms. M. Yang et al.’s (2021) research focused on entertainment experience, finding that entertainment content in social media increased users’ platform loyalty and continuous usage intention by enhancing emotional attachment, highlighting the key role of entertainment experience in AI-generated digital music platforms. For content experience, C.-L. Hsu and Chen (2018) pointed out that high-quality and personalized content significantly increased usage intention by enhancing users’ emotional attachment, emphasizing the importance of content quality in AI-generated digital music platforms. Finally, J. Li’s (2019) research analyzed the impact of functional experience on user behavior, finding that functional design and tools promoted continuous usage intention by enhancing emotional attachment, highlighting the critical position of functional experience in AI-generated digital music platforms.
AI-generated digital music platforms integrate technological innovation and artistic creation, providing unique user experiences. Sensory experience is crucial as music is a sensory art. Social experience enables sharing creations, receiving feedback, and collaborating. Entertainment experience is significant as users are both consumers and creators. Content experience reflects user-created works and platform-generated content. These factors influence continuous usage intention by enhancing emotional attachment, particularly prominent in AI music platforms as they serve as carriers of creative expression and artistic exploration rather than mere tools. This study proposes the following hypotheses:
The mediating role of technology self-efficacy between user experience and continuous usage intention is equally crucial. Yu et al. (2025) point out that technology self-efficacy plays a significant mediating role in digital behavioral decision-making, affecting users’ continuous usage intention toward platforms. When users have sufficient control and confidence in technological operations, they are more inclined to use the platform long-term. Regarding sensory experience, Agnihotri et al.’s (2024) research shows that technology self-efficacy mediates users’ sensory experience in virtual environments, enhancing user engagement and continuous usage intention. This finding is equally applicable to the sensory experience context of AI-generated digital music platforms. The relationship between social experience and technology self-efficacy was verified in Yao et al.’s (2023) study. They found that social experience significantly enhanced users’ usage intention for social media platforms by increasing their technology self-efficacy, emphasizing the mediating role of technology self-efficacy between social experience and continuous usage intention. Concerning entertainment experience, Yan et al. (2024) found in their study of e-learning systems that technology self-efficacy played a significant mediating role between users’ perceived ease of use and continued use. This finding can be extended to the context of entertainment experience: when users enjoy the platform’s entertainment content, if they feel proficient in controlling the technology, they are more likely to continue using the platform. For content experience, Yu et al. (2025) proved that technology self-efficacy is an important mediating factor between content experience and users’ continuous usage intention. When users can confidently use the platform’s content and tools, they are more likely to continue using the platform. Finally, Zheng et al. (2024) verified the mediating role of technology self-efficacy between functional experience and continuous usage intention, finding that users’ confidence level in operating platform functions directly influenced their usage decisions, especially in cases of complex functionality.
Generative AI music platforms combine advanced AI with music creation, creating complex, innovative environments where technology self-efficacy is crucial. Users must master AI-assisted tools, requiring higher technical abilities than traditional platforms. The five experience dimensions (sensory, social, entertainment, content, functional) may influence continuous usage intention through enhanced technology self-efficacy. Intuitive interfaces boost operational confidence, user-friendly social functions increase sharing confidence, entertaining AI generation sparks technical exploration interest, high-quality AI content enhances technology confidence, and feature-rich tools directly improve self-efficacy. This technology-creativity fusion may amplify technology self-efficacy’s mediating role compared to other platforms. The study proposes the following hypotheses:
Methods
Participants
This study investigates user behavior on “AI-generated digital music platforms,” specifically defined as platforms centered on AI-assisted creation where users play a dominant role by providing creative inputs (e.g., text prompts, melodies). This focus on active creators excludes “pure AI generation” models where users are passive listeners, as user experiences differ fundamentally between these contexts. Following approval by X University’s Academic Ethics Committee in December 2024, this study employed a questionnaire survey method. To avoid selection bias from traditional snowball sampling, we collaborated with a leading Chinese professional survey company (Wen Juan Xing) using their sample service function. The survey was conducted over 1 week in July 2024, with participants accessing the online questionnaire through WeChat or email links.
Several measures protected participant rights and well-being. The anonymous online questionnaire collected no personally identifiable information. A detailed informed consent form explained the research purpose, voluntary participation, and data confidentiality. Participants were informed of their right to withdraw without penalty and had to formally consent before starting. After eliminating questionnaires with patterned responses and high missing rates, 509 valid questionnaires remained for analysis.
The 509 valid respondents reveal a predominantly young user base, with 82.32% aged 18 to 35. Gender distribution was relatively balanced. Most participants were corporate employees/white-collar workers with college diplomas or bachelor’s degrees. Regarding platform usage, most had been using platforms for 1 to 6 months with high frequency (daily or near-daily). Detailed demographics are in Table 1.
Demographic Profile of Respondents (
Measures
The scales used in this study have been validated in numerous studies and applied in the Chinese context. The original English scales were translated into Chinese and back-translated to ensure consistency in meaning between the two language versions. Unless otherwise specified, all scales used a 7-point Likert scale (1 = strongly disagree, 7 = strongly agree).
Sensory Experience was measured using a 3-item scale developed by Venkatesh and Davis (2000). The items include “The AI-generated digital music platform page is visually appealing,” “The overall color scheme and layout of the AI-generated digital music platform page are good,” and “The page classification and navigation design of the AI-generated digital music platform are clear, reasonable, and hierarchical.” The composite reliability (CR) and Cronbach’s α of this scale were .917 and .865, respectively.
Social Experience was measured using a 4-item scale developed by Leung (2003). Items include “I can smoothly communicate music creation experiences with other users on this platform,” “The AI-generated music I share can gain recognition and support from other users,” “I have established friendships with other music enthusiasts through the platform,” and “I can gain a sense of belonging and satisfaction when using this AI-generated digital music platform.” The CR and Cronbach’s α of this scale were .918 and .882, respectively.
Entertainment Experience was measured using a 3-item scale developed by Mehmetoglu and Engen (2011) and Oh et al. (2007). Items include “The process of creating and sharing music on this AI-generated digital music platform makes me feel pleased,” “I enjoy watching music works shared by other users on this AI-generated digital music platform,” and “Exploring the various functions provided by this AI-generated digital music platform is an entertainment experience in itself.” The CR and Cronbach’s α of this scale were .908 and .849, respectively.
Content Experience was measured using a 4-item scale developed by Waqas, Salleh, and Hamzah (2021). Items include “The music content provided by this AI-generated digital music platform is of high quality,” “This AI-generated digital music platform can provide personalized music content recommendations,” “This AI-generated digital music platform provides unique and novel music content,” and “The music content of this AI-generated digital music platform has high artistic value and is worth appreciating.” The CR and Cronbach’s α of this scale were .916 and .877, respectively.
Functional Experience was measured using a 3-item scale developed by Coker (2013). Items include “The songs recommended by the platform’s AI music recommendation are ones I like,” “The platform’s music generation functions are rich and diverse,” and “The platform’s music editing and adjustment function design can meet my needs.” The CR and Cronbach’s α of this scale were .906 and .845, respectively.
Emotional Attachment was measured using a 3-item scale developed by Thomson et al. (2005). Items include “I have an emotional connection (Affection) to this AI-generated digital music platform,” “Using this AI-generated digital music platform makes me feel pleasant (Passion),” and “I feel there is a bond (Connection) between me and this AI-generated digital music platform.” The CR and Cronbach’s α of this scale were .905 and .842, respectively.
Technology Self-efficacy was measured using a 4-item scale developed by Huffman et al. (2013) and Compeau and Higgins (1995). Items include “I can successfully use all functions of this AI-generated digital music platform even without guidance,” “I am confident that I can use this AI-generated digital music platform well even if I have never used a similar platform before,” “I can skillfully use this AI-generated digital music platform to complete the tasks I want,” and “For me, it is easy to learn to use all functions of this AI-generated digital music platform.” The CR and Cronbach’s α of this scale were .912 and .872, respectively.
Continuous Usage Intention was measured using a 3-item scale developed by Bhattacherjee (2001a). Items include “I am willing to continue using this AI-generated digital music platform without terminating its use,” “I am willing to recommend this AI-generated digital music platform to others,” and “I am willing to continue using this platform rather than other alternative AI-generated digital music platforms.” The CR and Cronbach’s α of this scale were .925 and .879, respectively.
Results
Measurement Model
Reliability and Validity
Internal consistency reliability reflects the stability and consistency of the measurement results. This study assesses it with Cronbach’s alpha (CA) and composite reliability (CR). CA ranges from 0 to 1, with α > .7 indicating acceptable reliability and α > .9 indicating high reliability. CR reflects the consistency of a construct’s indicators; CR > .7 is generally considered acceptable, although Fornell and Larcker (1981) suggest that CR > .6 suffices. Average variance extracted (AVE) tests convergent validity by showing how well the observed variables explain their latent variable; AVE ≥ .5 indicates good convergent validity. Results are in Table 2.
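As a minimal sketch of how the three indices described above are commonly computed, the following Python functions use the standard textbook formulas. The function names and inputs are illustrative assumptions and are not taken from the study’s analysis, which was run in PLS-SEM software; CR and AVE here assume standardized outer loadings.

```python
import numpy as np

def cronbach_alpha(items) -> float:
    """Cronbach's alpha from an (n_respondents, k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of the sum score
    return float(k / (k - 1) * (1 - sum_item_var / total_var))

def composite_reliability(loadings) -> float:
    """CR from standardized loadings: (sum l)^2 / ((sum l)^2 + sum(1 - l^2))."""
    lam = np.asarray(loadings, dtype=float)
    num = lam.sum() ** 2
    return float(num / (num + (1 - lam ** 2).sum()))

def average_variance_extracted(loadings) -> float:
    """AVE: mean of the squared standardized loadings."""
    lam = np.asarray(loadings, dtype=float)
    return float((lam ** 2).mean())
```

A construct would pass the thresholds stated above when alpha and CR exceed .7 and AVE is at least .5.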
Reliability and Validity.
Discriminant Validity
Discriminant validity analysis verifies that the constructs are statistically distinct. Items from different constructs should not be highly correlated (correlation coefficient < .85); otherwise, they may measure the same concept. This study applies the AVE-based criterion of Fornell and Larcker (1981): the square root of the AVE for each construct should exceed its correlations with all other constructs. The results show that the diagonal AVE square roots exceed the off-diagonal correlation coefficients, confirming good discriminant validity. Table 3 reports the detailed results, with the diagonal showing the AVE square roots and the lower triangle showing the correlation coefficients.
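The Fornell and Larcker (1981) comparison described above can be expressed as a short check. This is an illustrative sketch only: the function name is hypothetical, and the inputs are assumed to be a vector of AVE values and the matching construct correlation matrix.

```python
import numpy as np

def fornell_larcker_ok(ave, construct_corr) -> bool:
    """True if sqrt(AVE) of every construct exceeds its largest
    absolute correlation with any other construct."""
    sqrt_ave = np.sqrt(np.asarray(ave, dtype=float))
    off_diag = np.abs(np.asarray(construct_corr, dtype=float))
    np.fill_diagonal(off_diag, 0.0)          # ignore self-correlations
    return bool(np.all(sqrt_ave > off_diag.max(axis=1)))
```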
Discriminant Validity.
Next, we used the Heterotrait-Monotrait (HTMT) ratio, which compares between-trait correlations to within-trait correlations. Results in Table 4 show all HTMT values below 0.85, indicating good discriminant validity between variables.
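For illustration, the HTMT ratio for a pair of constructs can be sketched as the mean correlation between items of different constructs divided by the geometric mean of the average within-construct item correlations. The function below is a hypothetical helper under that standard definition; it assumes an item-level correlation matrix and at least two items per construct.

```python
import numpy as np

def htmt(item_corr, idx_a, idx_b) -> float:
    """HTMT ratio for constructs A and B.

    item_corr : item-level correlation matrix
    idx_a, idx_b : lists of row/column indices of each construct's items
    """
    r = np.asarray(item_corr, dtype=float)

    def mean_within(idx):
        block = r[np.ix_(idx, idx)]
        # upper triangle without the diagonal: each item pair once
        return block[np.triu_indices(len(idx), k=1)].mean()

    between = r[np.ix_(idx_a, idx_b)].mean()
    return float(between / np.sqrt(mean_within(idx_a) * mean_within(idx_b)))
```

A value below 0.85, the cutoff used above, would be read as adequate discriminant validity for that pair.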
HTMT Discriminant Validity.
Structural Equation Model
Predictive Relevance Indicator Q²
Q² measures the predictive relevance of the exogenous variables for the endogenous variables. Its maximum value is 1. Values > 0 indicate predictive capability; negative values indicate no predictive power. Thresholds: 0.02 to 0.13 (small), 0.13 to 0.26 (medium), >0.26 (large).
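The logic of the indicator can be sketched with the Stone-Geisser form, Q² = 1 − SSE/SSO, where SSE is the sum of squared prediction errors and SSO is the sum of squared errors of a naive mean prediction. The function below is an illustrative assumption; it does not reproduce the blindfolding procedure of PLS-SEM software, which omits and predicts data points in rounds.

```python
import numpy as np

def q_squared(observed, predicted) -> float:
    """Stone-Geisser-style Q^2 = 1 - SSE / SSO.

    SSO uses the mean of the observed values as the naive benchmark
    (an assumption of this sketch).
    """
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    sse = ((obs - pred) ** 2).sum()
    sso = ((obs - obs.mean()) ** 2).sum()
    return float(1 - sse / sso)
```

Perfect predictions give 1, predicting the mean gives 0, and predictions worse than the mean give a negative value.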
Predictive Relevance Indicator
Collinearity Diagnosis VIF
A collinearity diagnostic analysis was performed on the model. As shown in Table 6, the VIF values of the measurement variables are all below 8; as shown in Table 7, the VIF values of the latent variables are all below 3. These results indicate that the model has no serious collinearity problem.
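As a brief illustration of the diagnostic, the VIF of predictor j is 1 / (1 − R²ⱼ), where R²ⱼ comes from regressing predictor j on the remaining predictors; these values equal the diagonal of the inverse of the predictors’ correlation matrix. The helper below is a hypothetical sketch based on that identity and assumes the predictors are not perfectly collinear.

```python
import numpy as np

def vif(predictors) -> np.ndarray:
    """VIF for each column of an (n_obs, n_predictors) matrix,
    computed as the diagonal of the inverse correlation matrix."""
    x = np.asarray(predictors, dtype=float)
    corr = np.corrcoef(x, rowvar=False)
    return np.diag(np.linalg.inv(corr))
```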
Collinearity of Measurement Variable Indicators.
Collinearity of Latent Variable Indicators.
Significance of Path Coefficients
Path coefficients assess the hypothesized relationships. Standardized coefficients (−1 to 1) indicate the strength and direction of each relationship.

Results of the hypothesized structural equation model.
Path Coefficients of PLS Structural Equation Model.
Results show five user experience dimensions (SeE, SoE, EE, CE, FE) significantly positively affect both EA and TS (
Effect Size (f²)
Effect size (
Effect Size Assessment
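For reference, the f² effect size is conventionally defined as the change in R² of an endogenous construct when one predictor is omitted, scaled by the unexplained variance of the full model. The sketch below assumes that standard definition; the function name and arguments are illustrative and not taken from the study.

```python
def f_squared(r2_included: float, r2_excluded: float) -> float:
    """f^2 = (R2_included - R2_excluded) / (1 - R2_included)."""
    return (r2_included - r2_excluded) / (1 - r2_included)
```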
Bootstrap Mediation Effect Test
To test the mediating effects, this study used the bootstrap method to examine whether each indirect effect is significant. Bias-corrected 95% confidence intervals were computed from 5,000 resamples. The results of the mediation effect test are shown in Table 10.
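The general idea of a bias-corrected bootstrap test of an indirect effect can be sketched as follows. This is a simplified illustration under stated assumptions: it uses ordinary least squares on observed scores for a single predictor, mediator, and outcome rather than the PLS-SEM estimation used in the study, and all function and variable names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def indirect_effect(x, m, y) -> float:
    """a*b, with a from m ~ x and b from y ~ m + x (OLS)."""
    a = np.polyfit(x, m, 1)[0]                       # slope of m on x
    design = np.column_stack([np.ones_like(x), m, x])
    b = np.linalg.lstsq(design, y, rcond=None)[0][1]  # slope of y on m
    return float(a * b)

def bc_bootstrap_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Point estimate and bias-corrected bootstrap CI of a*b."""
    x, m, y = (np.asarray(v, dtype=float) for v in (x, m, y))
    rng = np.random.default_rng(seed)
    n = len(x)
    est = indirect_effect(x, m, y)
    boots = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)                  # resample cases
        boots[i] = indirect_effect(x[idx], m[idx], y[idx])
    nd = NormalDist()
    # Bias correction: share of bootstrap estimates below the estimate
    prop = float(np.clip((boots < est).mean(), 1 / n_boot, 1 - 1 / n_boot))
    z0 = nd.inv_cdf(prop)
    lo_p = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    hi_p = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))
    lo, hi = np.quantile(boots, [lo_p, hi_p])
    return est, (float(lo), float(hi))
```

An indirect effect would be judged significant when the returned interval excludes 0, which is the decision rule applied to the mediated paths in this section.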
Bootstrap Mediation Effect Test.
Mediation analysis shows 95% confidence intervals for technology self-efficacy mediated paths (SeE/SoE/EE/CE/FE→TS→CUI) include 0 (non-significant), while emotional attachment mediated paths (SeE/SoE/EE/CE/FE→EA→CUI) exclude 0 (significant). Therefore,
Discussion
Discussion of Key Findings
We obtained several important findings from the current study. Firstly, the empirical results of this study support
Secondly, emotional attachment demonstrates a significant mediating role between user experience and continuous usage intention, validating hypotheses
Thirdly, a key finding of this study is that the direct effect of technology self-efficacy on continuous usage intention is not significant (β = .070,
Second, AI-generated digital music platforms are inherently hedonistic, experience-oriented systems where emotional factors dominate decision-making (Fišer et al., 2025; Lerch et al., 2025). Users’ primary motivation involves experiencing creative joy, sensory pleasure, and emotional expression rather than completing utilitarian tasks. In such “sensory consumption” and “emotional experience” scenarios, user behavior follows emotional logic over instrumental rationality. Consequently, emotional attachment (β = .615,
Fourthly, through a systematic analysis of the total effects of various user experience dimensions, this study finds that social experience (total effect = 0.196) and content experience (total effect = 0.187) show the most significant impact. The prominence of these two dimensions can be understood by their fundamental roles in fulfilling users’ higher-order needs. Content experience represents the platform’s core value proposition, as high-quality, personalized, and novel AI-generated music directly satisfies the user’s primary creative and esthetic goals. Social experience, in turn, addresses a deeper, intrinsic human need for connection and recognition, transforming the solitary act of creation into a powerful social ritual. This hierarchy of importance aligns well with the perspective of Conservation of Resources Theory (Hobfoll et al., 2018). In the unique environment of AI-generated digital music platforms, the most valued resources users seek to gain are not merely functional but are centered on cognitive pleasure (derived from high-quality content) and emotional support (derived from social interaction). The other dimensions—functional, sensory, and entertainment—play crucial, albeit supportive, roles, highlighting their interplay. They function as foundational enablers. A seamless functional experience and an appealing sensory design are necessary prerequisites for users to create high-quality content efficiently and pleasantly. Likewise, an entertaining process enhances both content creation and social interaction. In essence, while functional, sensory, and entertainment experiences ensure the platform is usable and enjoyable, it is the quality of the content and the richness of the social interactions that ultimately provide the sustained, higher-order value that keeps users coming back.
Theoretical Implications
This research offers several significant theoretical implications for understanding user behavior in the era of generative AI. Firstly, our findings challenge the universality of traditional technology acceptance paradigms. As established in our discussion, the non-significant influence of technology self-efficacy—long considered a core antecedent of usage behavior (Bandura, 1997; Compeau & Higgins, 1995)—is a critical theoretical anomaly. This suggests that the boundary conditions of established models like the Technology Acceptance Model must be re-examined. Our study contributes by identifying a specific, emerging context where a core psychological antecedent—the sense of technological control—loses its predictive power, thus calling for a more nuanced application of these classic theories in the AI era.
Secondly, this study refines the application of the S-O-R theory in technologically advanced environments. The clear asymmetry in the mediating effects—where the emotional pathway (via emotional attachment) decisively outweighs the cognitive pathway (via technology self-efficacy)—enriches the S-O-R framework originally proposed by Mehrabian and Russell (1974). It demonstrates that the “Organism” state is not always a balanced interplay of parallel cognitive and emotional responses. In AI-driven creative contexts, the emotional pathway can become the dominant mechanism, a finding that deepens our understanding of user decision-making processes.
Lastly, this study extends user experience theory by revealing the shifting hierarchy of experiential dimensions. The prominence of social and content experiences over functional aspects suggests a re-weighting of the user experience construct is necessary for AI-native platforms (Hassenzahl & Tractinsky, 2006). In these environments, user value appears to be migrating from traditional usability and functionality (Norman, 2007) toward the quality of AI-generated content and the social interactions it fosters. This echoes the principle of context-dependency in user experience (Forlizzi & Battarbee, 2004) and provides an empirical foundation for developing new UX models tailored to the unique characteristics of generative AI.
Implications to Practice
The findings of this study provide profound and actionable practical guidance for AI-driven creative platforms. First, the research clearly indicates that social experience and content experience are the two strongest drivers with the greatest total effect on continuous usage intention. Therefore, platforms should invest core resources in these two dimensions to establish and strengthen users’ emotional attachment. Specifically, platforms should not merely stop at providing basic social functions such as likes and comments. To maximize the promotional effect of social experience on emotional attachment, deeper interactive mechanisms should be introduced. Currently, platforms like Suno AI already possess the “Continue this song” feature, allowing users to create derivative works based on others’ compositions, which represents the embryonic form of social co-creation. Platforms can further deepen this model by, for example, developing “AI-assisted collaborative arrangement” functions that allow User A to generate a melody through text input, then invite User B to add vocals or harmonies through humming, ultimately completing a collaborative work. Additionally, platforms can design social sharing incentive mechanisms. For instance, drawing inspiration from “Tianpuyue’s” AI music collection campaign in collaboration with Guangzhou cultural tourism, platforms can regularly host themed creative challenges (such as “Summer Cyberpunk” or “Graduation Songs”) or establish “This Week’s Popular AI Works” rankings, providing honor badges like “Certified Musician” for highly active creators. These features not only enhance users’ sense of participation and community belonging but also establish strong emotional bonds between users and between users and the platform through collaboration and recognition.
Furthermore, content experience optimization should not be limited to precise algorithmic recommendations. To enhance users’ emotional connection with content, platforms can introduce an “emotional tagging” system. Beyond traditional genre classifications, users or AI can tag music with emotional or scenario labels such as “healing,” “immersive,” “epic,” or “cyberpunk rainy night.” For example, when users input “write a jazz piece for listening to while driving alone on a rainy night” in Suno or Udio, platforms not only generate music but can also automatically attach tags like “rainy night,” “driving,” “jazz,” and “contemplative.” This not only enables emotion-based music discovery but also makes users feel that the platform “understands me,” thereby deepening emotional attachment. Additionally, platforms can enhance the artistic value perception by emphasizing the uniqueness and narrative quality of AI music. For instance, they can draw inspiration from “Tianpuyue’s” “Dream Composer” program, showcasing the collaborative process between AI and human artists, or display the creative prompts that generated the music alongside user works, allowing other users to see the creativity behind them and imbue algorithmic outputs with warm humanistic emotions.
Second, the findings of this study are not limited to AI-generated digital music platforms but also have important implications for other AI-driven creative technology domains. The core conclusion of this study, that the importance of emotional attachment exceeds technology self-efficacy, is likely applicable to AI creative platforms that share two core characteristics: significantly lowering technical entry barriers and having hedonic and esthetic experiences as primary user goals. For AI painting platforms (such as Midjourney, Kling AI, etc.), users similarly play the role of “art director” or “creative director” rather than “digital painter” requiring mastery of advanced painting techniques. The core experience is equally esthetic and emotional. Therefore, this study’s model is likely equally applicable. Platform designers should also focus on strengthening the artistic exchange atmosphere within communities (such as hosting themed exhibitions and establishing style challenges) and enhancing emotional resonance with content, rather than merely pursuing technical indicators of generated images. For AI writing platforms (such as iyunbi, ima.copilot, etc.), the applicability of this study’s model may depend on users’ specific writing objectives. For creative writing (such as poetry, novels, scripts), emotional connection (users’ emotional investment in stories and characters) may remain key. However, for utilitarian writing (such as technical reports, marketing copy, code assistance), factors from traditional technology acceptance theory, such as perceived usefulness, may significantly regain importance, and the role of technology self-efficacy may also become important again. Based on this, for the entire AI creative field, platform design paradigms are undergoing profound transformation. For platforms aimed at inspiring user creativity and providing esthetic experiences, the key to success is no longer merely providing a powerful “tool” but creating a “community” and “creative partner” where users can immerse themselves, establish emotional connections, and gain recognition.
Limitations and Future Research Directions
Although this study reveals the key mechanisms underlying users’ continuous usage intention for AI-generated digital music platforms, several limitations remain. First, this study primarily employs cross-sectional data for analysis, which cannot eliminate potential endogeneity issues between variables. Second, the scope of this study focuses on exploring the driving factors of continuous usage while not addressing mechanisms leading to user churn. Finally, the sample of this study consists entirely of Chinese users, and this geographical and cultural specificity poses significant limitations to the generalizability of our research findings. Cultural factors can profoundly influence user behavior, particularly in terms of emotional attachment and technology adoption. For instance, Chinese culture, typically characterized by collectivist values, may place greater emphasis on social harmony and community integration, which might amplify the strong influence of “social experience” on emotional attachment observed in this study. In contrast, users from more individualistic cultural backgrounds may form emotional attachment based on different experiential drivers (such as personal achievement or self-expression). Therefore, the relationships identified in this research model may not be universally applicable.
Based on these limitations, this study opens multiple directions for future theoretical exploration. The most important finding of this study—the non-significant mediation effect of technology self-efficacy—represents a noteworthy null result that warrants deeper theoretical explanation in future research. For example, researchers could examine the contextual specificity of this phenomenon by comparing whether technology self-efficacy similarly fails in other AI-generated creative platforms such as digital painting and writing, or by introducing new moderating variables (such as users’ familiarity with AI technology). Furthermore, future research should also consider the impact of AI technology’s “black box effect” on user trust and construct a “bidirectional behavioral model” that integrates user retention and churn. Additionally, future research should strive to validate this research model across different cultural contexts. A cross-cultural comparative study would be invaluable for testing the robustness of this study’s findings and revealing how cultural dimensions moderate the pathways between user experience, emotional attachment, and continuous usage intention.
Conclusion
While AI-generated digital music platforms offer unprecedented creative opportunities, understanding the drivers of long-term user retention remains challenging. This study identified multi-dimensional user experience factors as critical antecedents of continuous usage intention. The impacts of sensory, social, entertainment, content, and functional experiences were primarily mediated by users’ emotional attachment to the platform. However, technology self-efficacy showed no significant mediating role, underscoring that in this context, emotional connection is a more critical driver of sustained engagement than users’ technical confidence.
Supplemental Material
sj-docx-1-sgo-10.1177_21582440251406713 – Supplemental material for The Impact of User Experience on Continuous Usage Intention for AI-Generated Digital Music Platforms: Examining the Mediating Roles of Emotional Attachment and Technology Self-Efficacy
Supplemental material, sj-docx-1-sgo-10.1177_21582440251406713 for The Impact of User Experience on Continuous Usage Intention for AI-Generated Digital Music Platforms: Examining the Mediating Roles of Emotional Attachment and Technology Self-Efficacy by Zhixin Liu in SAGE Open
Footnotes
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Supplemental Material
Supplemental material for this article is available online.
References
