Abstract
Audiobooks have gained significant popularity in China and other regions worldwide. While scholars and practitioners have extensively researched strategies to develop the audiobook market, limited attention has been given to the factors influencing audiences’ attitude toward audiobooks, which is a gap this study aimed to explore. Survey data was collected from a total of 537 consumers and analyzed using partial least squares structural equation modeling. The results indicate that experiencing telepresence and emotional connectedness while listening to audiobooks positively contributes to audiences’ affective outcomes, fostering their favorable attitude. The findings also reveal that narrator performance, background music, and the narrative style of an audiobook can potentially enhance telepresence and emotional connectedness. These results provide both theoretical insights and practical suggestions for audiobook publishers and other industry players.
Plain language summary
This study explores the factors affecting audiences’ attitude towards audiobooks. A total of 537 consumers responded and the data were analyzed via PLS-SEM. The research findings show that the narrators’ performance, background music, and the narration style of the audiobook potentially promote telepresence and emotional connectedness. The results also show that experiencing telepresence and emotional connectedness while listening to the audiobook contributes positively to affective outcomes, leading to more positive attitudes in audiences. The results offer theoretical and practical suggestions for audiobook publishers and other industry players.
Introduction
Representing the pinnacle of the “ear economy,” audiobooks in China and numerous other countries are currently undergoing what has been described as the “audiobook revolution,” “golden age of audio,” or “audiobook boom” (Elleström, 2021). Audiobook consumption is gradually becoming ingrained in daily life, seamlessly integrating into the routines of many individuals. Audiences now have the flexibility to enjoy the latest literary works at any time and from any location. This trend gained momentum, particularly after the implementation of China’s COVID-19 prevention and control policies during the pandemic, as many recovering individuals began to switch from ebooks to audiobooks while confined to their homes. According to the 2022 China Audiobook Industry Development Report, the total revenue of China’s audiobook market surged from 1.6 billion in 2016 to 8.74 billion in 2021, with an anticipated increase up to 12.4 billion by 2026.
Audiobooks are an electronic book format enriched with sound effects, such as music or ambient sounds, and offer narration by either the author, a professional or amateur actor, or a computerized voice (Alatas & Solehat, 2020). Initially designed to cater to individuals unable to read books for various reasons, including visual impairment or recovery from a stroke (Lee et al., 2021), audiobooks faced slow development due to technological limitations and the dominance of visual media (Snelling, 2021). However, recent times have witnessed a notable shift, with audiobooks becoming increasingly accepted by listeners and contributing not just to entertainment but also to areas such as education, language learning, and mental health (Miranda-Cueva & Cabanillas-Carbonell, 2020).
Despite the rapid development of the audiobook market, scholarly research on audiobooks remains somewhat fragmented (Ameri et al., 2017). The prevailing perspective within both industry and academia tends to view audiobook production as an external process disconnected from the publishing sector; in other words, the audiobook is seen as a product derived from traditional paper format publications to target specific market niches (Kuzmičová, 2016). Consequently, empirical exploration of this field has been considerably overlooked, earning it the status of a “silent” domain within academia and an anomaly in the dynamic publishing market (Murphy, 2022).
Prior research has predominantly followed three main paths: first, studies on audio quality encompass factors such as speech synthesis quality and the comparison between synthetic and human voices in audiobooks; second, inquiries into business management, education, mental health, and related fields discuss business model innovations within audiobook markets across different countries or the broader application of audiobooks in various environments (Singh & Alexander, 2022); third, research on auditory culture explores the communication perspective of audiobooks (Snelling, 2021). While some studies have examined audience intention to adopt audiobook applications using frameworks like the Unified Theory of Acceptance and Use of Technology (UTAUT) and the Theory of Planned Behavior (TPB), they often overlooked the intrinsic features of audiobooks themselves and their potential impact on the listener’s experience and attitude (Hunsaker et al., 2020).
Therefore, to bridge existing research gaps, the objective of this research was to explore the factors affecting audiences’ attitude towards listening to audiobooks. The specific antecedent factors of attitude tested in this study were narrator performance, background music, narrative style, telepresence, and emotional connectedness, with telepresence and emotional connectedness also serving as mediators linking the other factors to attitude. By constructing a comprehensive model that assesses both the direct and indirect effects of audiobook features on attitude toward audiobook consumption, this research makes substantial contributions to audiobook engagement in theory and practice.
Theoretical Foundation and Literature Review
Theoretical Foundation
Stimulus-Organism-Response Framework
The overarching framework is the stimulus-organism-response (hereafter SOR) model introduced by Mehrabian and Russell (1974), focusing on the emotion-eliciting or emotional qualities of an individual’s surroundings. This model has recently been applied in the retail industry to better understand consumer purchasing behavior and comprehend how the shopping environment influences selections (Cheah et al., 2020; Gong et al., 2023; Kaur et al., 2017). According to the model, environmental stimuli (S) trigger consumers’ emotional responses (O), which in turn cause them to act in certain ways (R). The model is therefore useful for evaluating customers’ feelings and perceptions regarding external stimuli and the behaviors that follow. To summarize, the SOR framework conceptualizes behavior or behavioral intentions as a product of a certain environment with specific stimuli. The stimuli influence the organism, most often a consumer’s emotional, cognitive, or affective process, which eventually gives rise to his or her behavioral response.
For example, Donovan et al. (1994) were among the first to adopt the SOR model to understand the impact of the retail environment on consumer decision-making. Their research illustrated that the in-store environment (stimulus) affects the degree of consumer pleasure and arousal (organism), thereby triggering consumer behaviors (response). Kim and Lennon (2013) applied the model by positing internal (website quality) and external (reputation) sources of information as stimuli that affect purchase intention (response) through consumers’ cognition and emotion (organisms). However, very few studies have applied SOR to audiobook selection and the corresponding listening attitude. Thus, we specify the model using narrator performance, background music, and narrative style as stimuli; telepresence and emotional connectedness as organisms; and listening attitude as the response. Underpinned by SOR and coupled with concepts of telepresence and emotional connectedness, this study examines the mechanism behind the listening attitude of audiobooks among readers.
Telepresence in Audiobooks
Audiobooks have garnered considerable attention, drawing listeners with their ability to create immersive virtual worlds. A central aspect of these virtual environments is the concept of presence. Originally introduced by Lombard and Ditton (1997), presence is conceptualized as the perceptual illusion of non-mediation, signifying an immersive experience, where users feel completely involved without recognizing the medium’s existence. Previous studies have divided this concept into two distinct subcategories: social presence and telepresence (Nowak & Biocca, 2003).
Telepresence is generally understood as the subjective experience of being present in a remote environment, or alternatively, as a user’s ability to be psychologically transported into another location (Witmer & Singer, 1998). McCall et al. (2004) defined it as the subjective experience of being in one place or environment, even when physically situated elsewhere. Strong perceptions of telepresence often lead individuals to feel that their experiences are facilitated by media. As such, this concept holds practical significance in the design and evaluation of media products, particularly in entertainment, such as movies and video games (Xu et al., 2023).
Given the effect of telepresence on positive psychological outcomes, both researchers and practitioners have shown keen interest in studying the factors influencing telepresence. In the context of audiobook listening, telepresence is receiving increased attention from computer scientists, psychologists, and communication scholars as media grow more sophisticated and better able to simulate interactions with people. In fact, telepresence significantly influences the audience's attitude towards audiobook consumption, leading to certain behaviors.
For audiobooks to captivate audiences, availability alone is insufficient; they must also generate substantial levels of telepresence to immerse and engage users (Algharabat et al., 2018). A heightened sense of telepresence fosters audience comfort, enabling audiobooks to deliver additional plot information through narrators and background music, thereby increasing attraction. This positive association with perceived immersion in audiobook listening forms the basis of this study’s exploration into the factors influencing users’ perceived telepresence during audiobook consumption.
Emotional Connectedness with Audiobooks
Connectedness is defined as the sense of being linked to another person or thing through a shared experience. It typically involves a person experiencing a sense of comfort and reduced anxiety when actively engaged with another individual, object, group, or environment (Stieler & Germelmann, 2016). Scholars have expanded the concept of connectedness to elucidate the complex relationships between individuals and various entities, giving rise to related concepts like social connectedness, school connectedness, cultural connectedness, community connectedness, and more. While connectedness has broad implications, most definitions of the concept acknowledge its “emotional” elements (Maulana & Eckhardt, 2007).
The association between the use of social media and emotional connectedness has been a focal point for scholars (Watson et al., 2022). Individuals feel connected when experiencing psychological security in an information communication technology setting. Recent literature has indeed measured users’ emotional dependence on social media, revealing that image-based and text-based social media is linked to reduced loneliness and heightened emotional connectedness (Pittman & Reich, 2016). However, the impact of audio-based websites/books on emotional connectedness has not been widely discussed.
The experience of listening to audiobooks not only enhances the enjoyment of the content itself but also positively contributes to emotional recovery and behavioral changes through the storytelling communication process. Audiobook listening may alleviate feelings of loneliness, fostering a sense of connection to the content or other listeners sharing the same books. Research indicates that audiobook listening can reduce loneliness, subsequently increasing emotional connectedness. Ultimately, listening to audiobooks creates a general feeling of connection, contributing to improved well-being and relaxation among listeners.
Literature Review
Narrator Performance, Telepresence, and Emotional Connectedness
Listeners choose audiobooks based on the performance of narrators and the quality of the stories. An exceptional audiobook experience necessitates human narration rather than machine-generated recordings. Saarikallio et al. (2021) holistically analyzed audiobook experiences, considering aspects like “technical framework,” “reading situation,” and “performing voice.” They found that beyond the evident reliance on spoken language, audiobooks invariably feature a narrator who plays a pivotal role in providing an auditory rendition of the text. Factors influencing the choice of narrator include the demographics of the target audience, the narrative’s complexity, and the need to resonate with cultural and regional nuances (Larson, 2015; Mainardes et al., 2023; Tattersall Wallin, 2021). Consequently, selecting a narrator whose voice aligns not only with the story but also with these diverse elements can make the difference between an audiobook that listeners tune out and one that captivates them, making the experience immersive and emotionally resonant.
Previous research suggests that audiobook narration is a distinct professional skill, blending elements of performance and standard reading. A skilled narrator ensures the consistency of voice and character traits, manipulating their intonation to breathe life into the text and captivate the audience. Larson (2015) emphasized how the narrator’s voice imparts authenticity to the text, interpreting the implicit intentions of a novel or text in the context of an audiobook. The narrator’s decisions regarding tone, voice modulation, and focus serve as decisive factors influencing the audience’s engagement or disengagement from the listening experience (Tattersall Wallin, 2022). The narrator’s voice is not merely a recitation but a critical interpretive layer that extends the author’s voice and intentions. Consequently, narration can either enhance or diminish the listener’s connection and understanding of the story (Sini et al., 2022).
Moreover, when selecting a narrator for an audiobook, matching the narrator’s voice to the age of the role is crucial. The narrator’s voice is critical in conveying age-appropriate timbre, emotional nuance, and modulation of tone and pace. These choices significantly impact the audience’s engagement and understanding. Apart from the narrator’s voice, their rhythm also communicates the essence of book content, as the strategic use of pauses by the narrator conveys meanings akin to spoken words. Narration must seamlessly align with the rhythmic flow of actions within the text. Suspenseful and dynamic dialogues and scenes demand an appropriate reading pace, while sections that aim to evoke profound emotions or savor delicate nuances benefit from a slower pace.
Overall, audiobook listening, being a private activity, is shielded from external noise by the narrator’s performance, fostering a sense of connection and immersion for listeners. The narrator's ability to navigate the tempo and tone of the audiobook contributes significantly to the listener’s engagement and emotional experience.
Therefore, it is postulated that:
H1. There is a positive correlation between narrator performance and telepresence in audiobook reading.
H2. There is a positive correlation between narrator performance and emotional connectedness in audiobook reading.
Background Music, Telepresence, and Emotional Connectedness
Existing research has established that listening to music can yield positive effects on mood and cognition. The purpose of background music in audiobooks is to enrich the ambiance and rhythm connected to the narrative or information being conveyed. Background music serves to amplify emotional depth or intensity within specific scenes, accentuate the rhythm of narration and action, or reinforce the cultural significance of the text (Lobo et al., 2021). Moreover, music is employed to guide listeners through transitions, such as the commencement and conclusion of a chapter or shifts in position, time, or events within the storyline.
Background music can profoundly influence the experience the author seeks to create for the listener regarding the story. Acknowledging the pivotal role of music in the listening experience, producers occasionally engage musicians to curate or compose music that aligns with the cultural and historical context of the books. Existing literature indicates that sound effects or background music contribute significantly to users’ emotional engagement (Cahill & Moore, 2017; Steinhaeusser et al., 2021). Some scholars argue that environmental sound effects, as non-interactive components of audiobooks, constitute fundamental narrative elements that intensify the story’s atmosphere, enrich the perception of the narrative, and deepen the user’s immersion into the virtual environment (Röber et al., 2006). The incorporation of background music in audiobooks aims to heighten the emotional resonance and pace associated with the narrative or information presented, fostering enhanced emotional connectedness among listeners.
Therefore, it is predicted that:
H3. There is a positive correlation between background music and telepresence in audiobook reading.
H4. There is a positive correlation between background music and emotional connectedness in audiobook reading.
Narrative Style, Telepresence, and Emotional Connectedness
An audiobook’s narrative style is pivotal in determining its success and can be categorized into four forms: fully voiced, partially voiced, multivoiced, and unvoiced (Ameri et al., 2017). Fully voiced narration employs distinct voices for each character in the audiobook, a style well-suited for books with diverse characters possessing distinct personalities. Partially voiced narration features pronounced voices for main characters, while the voices of other characters remain consistent. In contrast to the single narrator in fully voiced narration, the multivoiced style employs several narrators to portray different characters, akin to a movie, enhancing the audiobook’s quality. The last style is unvoiced reading, where a single narrator delivers the text with minimal variation in character voices.
Linkis (2021) explored how audiobooks create a social and intimate listening space through different narrative styles. Each genre of book has an ideal narrative style, which must be considered carefully to avoid paying for narration that neither adds value to the audiobook nor enhances the reader’s experience. Matching the narrative style to the genre ensures that the narration complements the content, contributing positively to the audiobook’s overall appeal and listener engagement.
Therefore, it is hypothesized that:
H5. There is a positive correlation between narrative style and telepresence in audiobook reading.
H6. There is a positive correlation between narrative style and emotional connectedness in audiobook reading.
Telepresence and Audiobook Listening Attitude
In contrast to a virtual experience, the concept of telepresence refers to the extent to which one feels present in a mediated environment. In this study, telepresence is the perceived immersion of audiobook listeners into the world crafted by audiobook storytelling. It is characterized by involvement and emotional arousal, akin to the experiences one might have in a virtual world, where individuals perceive themselves as being submerged or immersed in that environment.
Telepresence is recognized as a source of enjoyment (Nah et al., 2011). Previous research has suggested that consumer experiences can be heightened through the impact of virtual reality resulting from telepresence. Scholars have discussed that the enjoyment of advergames, and consequently the positive impact of advergame play on brand attitude, may increase when players experience a stronger sense of telepresence. Pelet et al. (2017) demonstrated that telepresence positively influences the flow experience, while Gao et al. (2018) indicated that telepresence is positively associated with consumers’ autonomy and engagement in online shopping.
In the context of audiobooks, telepresence reflects the extent to which the audience perceives the immediacy or physical distance between themselves and the story. It plays a crucial role in shaping the audience’s experience in the virtual world, providing a psychological state of “being in the world” created by audiobooks and triggering recall of the audiobook content. It is reasonable to infer that telepresence might influence listeners’ experiences, subsequently impacting their attitude.
Therefore, it is proposed that:
H7. There is a positive correlation between telepresence and audiences’ listening attitude.
Emotional Connectedness and Audiobook Listening Attitude
Emotional connectedness can create a lifelike virtual environment that reduces the need for physical interactions (Lyu et al., 2024). Establishing a sense of emotional connectedness is a fundamental aspect of human life and can enhance various dimensions of psychological well-being. In the online environment, emotional connectedness can stem from the use of social media, particularly for individuals seeking enjoyment from these platforms. Substantial research indicates that emotional connectedness has positive effects on users’ purchase intentions, attitudes toward adopting social commerce, and intentions to travel (Anindito & Handarkho, 2022; Knox, 2011; Pansari & Kumar, 2017).
Individuals are more likely to feel connected when engaged in activities. Ergo, when listening to audiobooks, the audience experiences a sense of connection with the content, the narrator’s performance, and fellow listeners. This sensation leads them to perceive the story as more captivating, fostering active engagement in the listening experience.
Therefore, it is postulated that:
H8. There is a positive correlation between emotional connectedness and audiences’ listening attitude.
The research framework (see Figure 1) illustrates the study’s variables and their hypothesized relationships.

Figure 1. Theoretical framework of this study.
Methods
Sampling and Data Collection
This study utilized a quantitative approach within the positivist paradigm. Accordingly, the data collection instrument employed was a survey. The study’s target population comprised Chinese individuals with experience listening to audiobooks. A filter question was explicitly presented at the beginning of the questionnaire to ensure respondents met this criterion. Convenience sampling was chosen to reach the target respondents, due to its efficiency and simplicity in implementation. It allows researchers to collect data from easily accessible sources without the necessity for extensive planning or resources (Etikan et al., 2016).
The sample size for this study was determined using G*Power 3.1, an analytical software program for sample size calculations (Andrade, 2020). The power analysis method “a priori: Compute required sample size - given α, power, and effect size” was employed. The effect size was fixed at 0.15, alpha at 0.05, and power (1 − β error probability) at 0.95. With a total of five predictors in the current study, the minimum required sample size was calculated as 138.
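The a priori calculation described above can be sketched in a few lines. This is a minimal illustration and not G*Power itself: it assumes the F test for a fixed-model multiple regression (R² deviation from zero) with noncentrality parameter λ = f² × N, numerator degrees of freedom equal to the number of predictors, and denominator degrees of freedom N − predictors − 1. The function names are hypothetical, and the incomplete beta routine is a standard continued-fraction implementation written out so that the sketch needs only the Python standard library.

```python
import math

def _betacf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction used by the regularized incomplete beta function."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        even = m * (b - m) * x / ((qam + m2) * (a + m2))
        odd = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        for aa in (even, odd):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def regression_power(n, n_predictors, f2, alpha):
    """Power of the overall F test, assuming lambda = f2 * n."""
    a, b = n_predictors / 2.0, (n - n_predictors - 1) / 2.0
    # Critical value on the beta scale x = df1*F / (df1*F + df2), by bisection.
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if reg_inc_beta(a, b, mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    x_crit = (lo + hi) / 2.0
    # Noncentral F CDF as a Poisson-weighted sum of central beta CDFs.
    half_lambda = f2 * n / 2.0
    cdf = 0.0
    for j in range(400):
        w = math.exp(-half_lambda + j * math.log(half_lambda) - math.lgamma(j + 1))
        cdf += w * reg_inc_beta(a + j, b, x_crit)
        if j > half_lambda and w < 1e-15:
            break
    return 1.0 - cdf

def min_sample_size(n_predictors, f2=0.15, alpha=0.05, target_power=0.95):
    """Smallest n whose power reaches the target under the assumptions above."""
    n = n_predictors + 2
    while regression_power(n, n_predictors, f2, alpha) < target_power:
        n += 1
    return n

n_required = min_sample_size(5)
print(n_required, round(regression_power(n_required, 5, 0.15, 0.05), 4))
```

The printed sample size can be compared against the 138 reported above; any discrepancy would point to a difference between these assumptions and the settings actually used in G*Power.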
Following the Institutional Review Board (IRB) protocol approved by the researchers’ university, data collection was conducted in October 2021 under IRB protocol IRB202071. The survey was first pre-tested and pilot tested. For the pilot test (N = 75), a call for participation was disseminated through social media platforms such as WeChat and Sina Weibo. The test confirmed the validity and reliability of all constructs. Subsequently, a total of 600 surveys were distributed online through Wenjuanxing (https://www.wjx.cn/), a popular survey website with 33.24 million users in China. After excluding incomplete/invalid responses and responses lacking regularity/continuity, the dataset comprised 537 valid samples, resulting in an effective response rate of 89.5%. All participants were audiobook users, with 316 males accounting for 58.85% and 221 females accounting for 41.15%. Table 1 presents the demographic details of the study respondents.
Table 1. Demographic Information of Respondents.
Measures
The questionnaire was designed based on previous studies. The measurement of narrator performance comprised four items (NP1-4), background music was assessed through three items (BM1-3), narrative style was measured using four items (NS1-4), telepresence was gauged with three items (TP1-3), emotional connectedness was appraised with four items (EC1-4), and attitude was examined with three items (AT1-3). The detailed items of the questionnaire and their sources are listed in Table 2. All responses to the items were scored on a five-point Likert scale ranging from “totally disagree” to “totally agree.”
Table 2. Measurement Items.
Data Analysis and Results
Common Method Bias
Before conducting further analysis, both a priori and post hoc procedures were implemented to address potential common method bias. Procedural remedies included pilot testing the survey instrument for both content validity and presentation quality to facilitate respondent understanding. In terms of post hoc procedures, Harman’s single factor test in SPSS was performed on the data to detect common method variance (CMV), which can arise when all variables in a study are measured using the same instrument. The test yielded a result of 38.8% for the maximum variance explained by a single factor, which was below the threshold of 50%. Consequently, CMV was not a concern in this study.
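The logic of Harman’s single factor test can be illustrated with a short sketch. It assumes the common variant based on unrotated principal components of the item correlation matrix, where the share of variance on the first factor equals the largest eigenvalue divided by the number of items. The data below are synthetic and hypothetical, not the study’s survey responses, and the function names are illustrative; the largest eigenvalue is found by power iteration so that only the standard library is needed.

```python
import math
import random

def correlation_matrix(rows):
    """Pearson correlations between the columns of a list-of-rows dataset."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    norms = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows)) for j in range(k)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows)
             / (norms[a] * norms[b]) for b in range(k)] for a in range(k)]

def first_factor_share(corr, iters=1000):
    """Largest eigenvalue of the correlation matrix divided by the item count."""
    k = len(corr)
    v = [1.0 + 0.01 * i for i in range(k)]  # slightly uneven start vector
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(k))
                     for i in range(k))
    return eigenvalue / k

# Hypothetical data: 200 respondents, 6 items sharing a modest common component.
rng = random.Random(0)
rows = []
for _ in range(200):
    common = rng.gauss(0, 1)
    rows.append([0.6 * common + rng.gauss(0, 1) for _ in range(6)])

share = first_factor_share(correlation_matrix(rows))
print(f"first factor share: {share:.1%}; below 50% threshold: {share < 0.5}")
```

Applied to the real item-level data, the same ratio would be compared with the 50% threshold, as was done for the 38.8% reported above.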
Measurement Model Assessment
This study first utilized SPSS version 26 for descriptive analysis. Subsequently, using SmartPLS 3.0 software, the partial least squares structural equation modeling (PLS-SEM) approach was employed to test the research model and its corresponding hypotheses. PLS-SEM is widely acknowledged as a prominent statistical technique across various disciplines, including communication (Ringle et al., 2015). It is known to be the preferred method for analyzing path models and examining relationships between constructs. PLS was chosen for this study due to the sample size (n = 537), the focus on each path coefficient, and the emphasis on variance explained rather than overall model fit (Henseler & Sarstedt, 2013).
Following the guidelines of Hair et al. (2017), data was analyzed and interpreted in two stages: the measurement model and the structural model. For the measurement model, a confirmatory factor analysis (CFA) was first conducted to assess reliability as well as convergent and discriminant validity. Convergent validity examines whether items effectively reflect their corresponding factor, while discriminant validity assesses whether two constructs are statistically distinct.
Table 3 presents the reliability and convergent validity results, which include items’ factor loadings and constructs’ composite reliability, Cronbach’s alpha, and average variance extracted (AVE). Factor loadings for each item met the minimum requirement (loading > 0.60), while the Cronbach’s alpha and composite reliability values for all six constructs exceeded the recommended threshold of 0.70. Similarly, the AVE for each construct surpassed 0.50 (Fornell & Larcker, 1981). Therefore, the constructs’ reliability and convergent validity were established.
Table 3. Reliability and Convergent Validity of Constructs.
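For readers unfamiliar with these indices, the sketch below shows how composite reliability and AVE follow from standardized factor loadings, using the usual formulas AVE = mean(λ²) and CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)). The loadings are hypothetical values for a four-item construct, not the values reported in Table 3.

```python
def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """Squared sum of loadings over itself plus summed error variances."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1.0 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_variance)

# Hypothetical standardized loadings for one four-item construct.
example_loadings = [0.78, 0.81, 0.74, 0.69]

ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
print(f"AVE = {ave:.3f} (threshold 0.50), CR = {cr:.3f} (threshold 0.70)")
```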
To conclude the measurement model analysis, the Heterotrait-Monotrait Ratio (HTMT) criterion was employed to assess discriminant validity (Henseler et al., 2015). As illustrated in Table 4, the HTMT scores for all the constructs were below the threshold value of 0.9, confirming the discriminant validity of the constructs in this study.
Table 4. Analysis of HTMT Discriminant Validity.
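The HTMT ratio for a pair of constructs can be sketched as the mean correlation between items of different constructs divided by the geometric mean of the average within-construct item correlations. The sketch below uses raw (not absolute) correlations and a small hypothetical correlation matrix; the item indices and values are illustrative assumptions and are unrelated to Table 4.

```python
from itertools import combinations
from math import sqrt

def mean_within(corr, items):
    """Average correlation among distinct items of one construct."""
    pairs = list(combinations(items, 2))
    return sum(corr[i][j] for i, j in pairs) / len(pairs)

def htmt(corr, items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs."""
    between = (sum(corr[i][j] for i in items_a for j in items_b)
               / (len(items_a) * len(items_b)))
    return between / sqrt(mean_within(corr, items_a) * mean_within(corr, items_b))

# Hypothetical item correlations: items 0-1 measure construct A, items 2-3 construct B.
corr = [
    [1.0, 0.8, 0.4, 0.4],
    [0.8, 1.0, 0.4, 0.4],
    [0.4, 0.4, 1.0, 0.8],
    [0.4, 0.4, 0.8, 1.0],
]

ratio = htmt(corr, [0, 1], [2, 3])
print(f"HTMT = {ratio:.2f}; below 0.90: {ratio < 0.9}")
```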
Structural Model Assessment
The structural model for this research is depicted in Figure 2, where R² represents the value of the coefficient of determination of any endogenous or predicted latent variable. The R² is 0.502 for the dependent variable, listening attitude, indicating that telepresence and emotional connectedness collectively account for 50.2% of the variance in listening attitude. Likewise, narrator performance, background music, and narrative style explain more than 40% of the variance in telepresence and emotional connectedness. These values highlight the strong predictive power of the model.

Figure 2. Structural model results.
The bootstrapping technique was used to test the hypotheses by determining the statistical significance of the path coefficients (α = 0.01; two-sided test). Based on the results in Table 5, the effects of narrator performance on telepresence (t-value = 3.276, p-value = .001) and emotional connectedness (t-value = 4.721, p-value < .001) are statistically significant. Therefore, H1 and H2 were supported. Similarly, background music demonstrated a significant positive impact on telepresence (t-value = 6.146, p-value < .001) and emotional connectedness (t-value = 6.199, p-value < .001), validating H3 and H4. Narrative style was also found to significantly influence telepresence (t-value = 7.044, p-value < .001) and emotional connectedness (t-value = 6.297, p-value < .001); thus, H5 and H6 were accepted. Lastly, telepresence (t-value = 4.371, p-value < .001) and emotional connectedness (t-value = 12.962, p-value < .001) revealed significant effects on listening attitude towards audiobooks. Therefore, H7 and H8 were confirmed as well. In summary, all the research hypotheses were supported.
Table 5. Hypothesis Testing Results.
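The idea behind a bootstrapped t-value can be sketched as follows: the estimate from the original sample is divided by the standard deviation of the same estimate across resamples drawn with replacement. This sketch is not the SmartPLS procedure. As a stand-in for a PLS path coefficient it uses a simple standardized bivariate slope (the Pearson correlation) on synthetic, hypothetical data, and the function names are illustrative.

```python
import math
import random

def pearson_r(xs, ys):
    """Standardized slope of y on x for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_t(xs, ys, n_boot=500, seed=1):
    """Return (estimate, bootstrap standard error, t-value)."""
    rng = random.Random(seed)
    n = len(xs)
    estimate = pearson_r(xs, ys)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        draws.append(pearson_r([xs[i] for i in idx], [ys[i] for i in idx]))
    mean = sum(draws) / n_boot
    se = math.sqrt(sum((d - mean) ** 2 for d in draws) / (n_boot - 1))
    return estimate, se, estimate / se

# Hypothetical data: 200 cases with a moderate positive relationship.
data_rng = random.Random(7)
xs = [data_rng.gauss(0, 1) for _ in range(200)]
ys = [0.5 * x + data_rng.gauss(0, 1) for x in xs]

estimate, se, t_value = bootstrap_t(xs, ys)
# 2.576 is the two-sided normal critical value for alpha = 0.01.
print(f"estimate = {estimate:.3f}, SE = {se:.3f}, t = {t_value:.2f}, "
      f"significant at 0.01: {abs(t_value) > 2.576}")
```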
Discussion and Implications
Discussion
This study examined the impact of listeners’ telepresence and emotional connectedness on their attitude toward engaging with audiobooks, considering predictors such as narrator performance, narrative style, and background music. The results verify the statistically significant associations of narrator performance, narrative style, and background music with both telepresence and emotional connectedness. In turn, listeners' attitude is positively influenced by their telepresence and emotional connectedness.
Previous research suggests that audiobooks, compared to visual depictions of the same content, offer a more stimulating and immersive experience. This is attributed to narrators’ ability to convey a range of emotions through specific accents, providing insight into characters’ feelings and the nuances of their words (Röber et al., 2006). Consequently, a positive correlation exists between superior narrator performance and perceived telepresence, aligning with prior research findings. Listeners in this study also reported an increased sense of telepresence with audiobooks that incorporated background music. Audiobooks, defined as recordings of text, can include music or ambient sounds to construct a 'story world' (Pinheiro et al., 2018). If listeners enjoy the background music, it is reasonable to conclude that their self-reported telepresence is greater.
Additionally, telepresence emerges when listeners perceive a match in narrative style. The publisher’s “ultimate goal is to mirror the voice that the author heard in his or her head, when they were writing the book.” The focal point in audiobooks centers on developing formal representations to characterize actors, including their role, personality, and emotional status. Audiobook publishers adjust the features of their voices based on characters and select the appropriate narrative style to enhance the roles in audiobooks. This alignment further reinforces listeners’ perception of telepresence.
Another key finding of this study is the interplay among the factors contributing to emotional connectedness in audiobooks. Narrator performance, background music, and narrative style are not isolated elements but integral components of the audiobook experience (Have & Pedersen, 2015). They work together to create a deep emotional connection that surpasses the confines of the audiobook’s virtual world. As audiobooks gain popularity, understanding these elements is imperative for audiobook producers and narrators aiming to cultivate emotional bonds with their audiences. Ultimately, the emotional journey offered by audiobooks attests to the power of storytelling in its auditory form (Nicolini et al., 2017; Weber, 2021).
The findings reveal the pivotal role of narrator performance in heightening emotional connectedness among listeners. Narrators, beyond being mere conveyors of words, emerge as conductors of emotions. Expert manipulation of voice timbre transforms storytelling into an art that captivates the audience. This finding aligns with the notion that narration is not solely about imparting information; it equally encompasses the conveyance of emotion. Narrators, through their performance, establish a sense of belonging for listeners (Rodero & Lucas, 2023) that extends beyond a cognitive understanding of the story’s world; it represents a profound emotional attachment. Listeners forge connections not only with the characters and events within the audiobook but also with the voice guiding them through the narrative. This emotional connection transcends the boundaries of the audiobook’s virtual world (Rubery, 2011).
Similarly, audiobooks featuring background music prove more effective than text versions with static illustrations alone. The incorporation of background music enhances personal relevance and emotional arousal related to the story’s plot, scenery, and characters, fostering a deeper emotional bond. The narrative style, on the other hand, serves as a crucial aspect of effective communication (Cahill & Moore, 2017). Audiobooks can alter the impact of a story by employing different narrative styles to enhance recipients’ emotional reactions and transport them into the narrative. If the narrative style provides more detailed and salient information about a story, it can heighten perceptions of connection and encourage discussion among listeners, positively affecting emotional connectedness.
Finally, the study corroborates that telepresence and emotional connectedness can favorably influence audiences’ listening attitude. Listeners’ attitude improves significantly when they perceive that the audiobooks are narrating the “real world” and feel a connection with the audiobooks or other listeners. This empirical finding indicates that telepresence and emotional connectedness amplify listeners’ absorption and engagement, maximizing their interaction with the stories.
Implications
Theoretical Implications
This study contributes to the academic literature in three key ways. First, it expands upon existing insights by empirically demonstrating that telepresence and emotional connectedness play crucial roles in shaping the listening attitude of audiobook audiences. While prior research has explored various aspects of audiobook engagement, few studies have examined, from the perspective of individual psychology, how telepresence and emotional connectedness affect listeners. This study fills this gap by constructing a theoretical framework that analyzes the effects of telepresence and emotional connectedness on audiobook listeners’ attitudes.
Telepresence is defined as the perception of “being there” in a seemingly natural environment created by technology. While existing literature has established that telepresence can positively influence user evaluations (Cowan & Ketron, 2019), there has been a lack of discussion about its impact on listeners’ attitudes toward audiobooks. This study affirms that telepresence has a positive effect on listeners’ attitudes, suggesting that individuals experiencing a sense of telepresence during audiobook consumption tend to develop a positive attitude. Emotional connectedness, on the other hand, refers to the extent to which audiobook listening fosters positive associations, identification, and connections. The study highlights the significant role of emotional connectedness in shaping positive listening attitudes, aligning with previous research emphasizing its substantial impact on customer behavior (Molinillo et al., 2018).
Second, the study explores the antecedents of telepresence and emotional connectedness among audiobook users, specifically focusing on the compositional features of audiobooks, namely narrator performance, background music, and narrative style. While previous reviews of audiobooks have mainly concentrated on readers’ advisory, literacy development support, and usage with specific demographics (Subagya, 2017), this study emphasizes the importance of considering audiobooks themselves. Notably, it identifies narrators’ performance as a unique and influential feature that enhances telepresence and emotional connectedness, with participants expressing a preference for human narrators over robotic storytellers for their ability to induce higher emotional engagement and transport the audience into the story (Steinhaeusser et al., 2021).
Additionally, the study reveals that background music, treated as an independent variable, positively impacts various listener variables, including affective states (mood, arousal, pleasure, emotion), selective attention, and behavior. This concurs with existing studies demonstrating the influence of music on attitudes (Jang & Lee, 2014). Moreover, the research validates that narrative style significantly contributes to the success of an audiobook, confirming its role in stimulating telepresence and emotional connectedness, a point often discussed in prior reviews but seldom empirically proven (Rivas-García & Magadán-Díaz, 2022).
Finally, this study is theoretically significant because it extends the applicability of the SOR framework to explain the determinants of people’s listening attitude toward audiobooks, which is a novel and important context. The narrator’s performance, narrative style, and background music, coupled with telepresence and emotional connectedness, may affect people’s attitudes and decision-making mechanisms.
Practical Implications
Despite the unique attributes of audiobook applications, their widespread adoption continues to present a significant challenge (Srivastava et al., 2022). In this regard, this study proposes practical implications for the industry based on its empirical findings. First, audiobook publishers can enhance customers’ telepresence and emotional connectedness by elevating the storytelling quality of their audiobooks. Measures such as engaging popular stars or renowned actors as narrators, crafting complementary background music, and employing diverse narrative styles can help create captivating listening experiences. These efforts amplify the sense of a “real world” and foster emotional connectedness among audiobook listeners.
Second, the study suggests that listeners’ telepresence and emotional connectedness can be cultivated during audiobook consumption, influencing their overall listening attitude. Audiobook publishers should consider designing appealing and immersive physical scenarios aligned with the narrative plots to enhance the attractiveness of audiobooks. Simultaneously, publishers can foster emotional connectedness by increasing the visibility of listeners’ states and encouraging discussions. This approach serves as a reminder of the audiobook experience, encouraging listeners to engage more actively and read further. By implementing these strategies, audiobook publishers can address challenges associated with adoption and create a more engaging and immersive listening environment for their audiences.
Conclusion
Limited attention has been paid to understanding the listening attitude of audiobook users, despite the growing popularity of audiobooks. This research aimed to fill this gap by presenting and testing an empirical framework that explores the factors influencing listeners’ attitude toward audiobooks. Unlike existing literature, which often overlooks these aspects, this study identifies crucial factors directly pertaining to the features of audiobooks. The results provide insights for both researchers and practitioners, offering recommendations on how publishers can enhance listeners’ attitudes by fostering telepresence and emotional connectedness, ultimately attracting a larger audience.
However, this study does have some limitations. The sample consisted mainly of college students who volunteered to participate, representing a highly specific demographic. While this group constitutes the typical audiobook user, the findings may not be generalizable to all user categories. Future research could diversify the sample to test the robustness of the results. Scholars can also involve experts and policymakers to gain insights into designing audiobooks that cater to audience preferences, foster brand loyalty, and maintain long-term relationships.
Another limitation is that this investigation of listeners’ attitude was rooted in telepresence and emotional connectedness theories. Future studies could adopt a broader range of factors and incorporate additional theoretical models to provide a more comprehensive understanding of audiobook user attitudes. Lastly, the study employed a cross-sectional design. Given the dynamic nature of listeners’ attitudes when using audiobook apps, it remains uncertain whether telepresence and emotional connectedness exert a sustained influence over time. Future research could explore this phenomenon through a longitudinal design, offering a more nuanced perspective on the evolving nature of audiobook user attitudes.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Data Availability Statement
The data utilized in this study is confidential and includes information from dedicated computer applications at the local institution. Access to the data may be granted upon reasonable individual request, subject to approval from all authors involved.
