Abstract
This study explores how VTubers, or virtual YouTubers, redefine digital intimacy by blending human performance with digital artifice. Through a 9-month digital ethnography and 21 interviews with VTuber fans, we identified two key processes. First, the affective liquidity of VTuber personas, facilitated by vocal modulation, avatar aesthetics, and technology, blurs ontological lines between human/non-human and real/virtual. Second, the screen acts as both an immersive portal and a protective buffer, allowing fans to explore non-normative desires and subvert heteronormative gender performativity. We argue that VTubers, as embodiments of posthuman becoming, cultivate synthetic relationships that prioritize fluid belonging over biological essentialism, challenging traditional intimacy models. While acknowledging their transgressive possibilities, we caution against techno-utopianism, highlighting the ethical risks of platformed intimacies. This study advocates a critical posthumanist lens to map the contradictions of digitally mediated relationality, balancing optimism with scrutiny as human-technological assemblages redefine intimacy in the (post-)digital age.
Introduction: virtual influencer, virtual idol, or VTuber
The modern celebrity operates as a complex sociotechnical assemblage, but its stability has historically been contingent on the referent of a tangible, corporeal human. The rise of the virtual idol fundamentally destabilizes this connection. The proliferation of advanced digital media has introduced a range of virtual figures into our cultural landscape and everyday media practices, challenging conventional ontological categories. Virtual idols, such as the Crypton Future Media singer Hatsune Miku and the Bilibili singer Luo Tianyi, are animated characters with unique personas whose music is created through technological synthesis, is often crowdsourced, and frequently involves no specific human performer. Virtual influencers, on the other hand, are typically CGI or AI-powered avatars designed to engage audiences on social media platforms for lifestyle and marketing purposes. A third category, the VTuber (virtual YouTuber), diverges from these models. The VTuber is a sociotechnical assemblage that inextricably links a digital avatar to the live, real-time performance of a human entertainer, creating a hybrid form of interactive content creation. These online entertainers use virtual avatars, often created with software like Live2D, to portray characters designed by online artists. Their key distinction from virtual influencers and virtual idols lies in their reliance on a human voice actor, known as a naka no hito (Japanese: 中の人). VTubers utilize real-time motion-capture software to map their physical movements and facial expressions onto their two- or three-dimensional avatars. This stable and unique human performer voices and animates the character across livestreams, singing, dancing, and even offline events.
The VTuber phenomenon, which globalized rapidly from Japan in the mid-2010s to the 2020s, represents a significant evolution in digital performance and intimacy, particularly appealing to the anime and manga fandom. This study privileges the VTuber as a critical site for investigating the co-construction of digital intimacies. Unlike virtual idols and influencers, who often involve decoupled or automated performance, the VTuber is defined by a unique sociotechnical hybridity: the persistent, live coupling of a human operator (naka no hito) with a digital avatar. This blend of human creativity and digital persona through voices and visuals creates a uniquely engaging experience for viewers, fostering deeper connections and dynamic content. Furthermore, VTubers’ primary reliance on live-streaming platforms enables real-time fan interaction, enhancing engagement and building strong community loyalty. While originating in Japanese subculture, VTubers have successfully crossed into mainstream media and global markets, appealing to both niche and broad audiences. Drawing on a 9-month digital ethnography of VTuber fandom and 21 in-depth interviews with VTuber fans, we contend that VTubers reconfigure intimacies by collapsing the technology/human and virtual/real binaries through affective liquidity, a dynamic interplay of human performativity and digital artifice. By curating synthetic personas that dissolve boundaries (via vocal modulation, avatar design, and immersive technologies), VTubers shift authenticity from corporeal individuals to posthuman affective engagement, enabling fans to ascribe legitimacy to the digital persona rather than the human behind it. Simultaneously, the screen operates as a dialectical interface, serving both as a hyper-immersive portal for connection and transgressive desire and as a protective buffer that safeguards VTuber–fan engagement from real-world vulnerabilities.
This duality dismantles static idol-fan hierarchies, privileging relational belonging over biological essentialism and redefining the emerging more-than-human and more-than-digital intimacy as a fluid, posthuman practice that transcends anthropocentric frameworks, demonstrating how human subjectivity and connection are being reshaped within complex technological assemblages.
Situating digital intimacy: from human-to-human to more-than-human
The rapid proliferation of digital technologies is fundamentally reconfiguring intimate practices, eroding traditional distinctions between public and private spheres and challenging entrenched Western-centric notions of relationality. This shift has given rise to new and diverse forms of interpersonal connection that demand revised theoretical frameworks. Within communication studies, digital intimacy is understood through two core premises: that the medium itself shapes the content of a message, and that communication technologies are inextricably entangled with the social processes they mediate (Rambukkana and Wang, 2023). This suggests that digital intimacy is not merely an extension of offline relationships but a distinct communicative form with unique characteristics. However, as scholars like Evans and Ringrose (2025) and Murray (2020) argue, our institutions, regulations, and even our critical understanding of harms often lag behind this reality, clinging to a “digital dualism” that artificially separates “real life” from digital experience. The concept of postdigital intimacies helps resolve this, recognizing the seamless “slippage and indistinguishability of the digital and non-digital” (Evans and Ringrose, 2025: 5) and framing contemporary society as an entangled, more-than-digital context (Balfour et al., 2023; Taffel, 2016).
This entanglement is central to how intimacy functions today. Rambukkana and Wang (2023) illustrate that while intimacy can encompass various meanings (e.g. closeness, proximity, interconnectedness), digital intimacy can range from sexuality and kinship to all kinds of interconnectedness and international news publics. Intimacy can be seen as affect, labor, a mode of existence, and a structure of feeling that emerges through digital interactions yet is rooted in material reality. Berlant (1998) defined intimacy as “an aspiration for a narrative about something shared, a story about both oneself and others that will turn out in a particular way” (p. 281). She further contended that there is no clear division between public and private, or individual and collective subjectivities, with such spaces oriented toward and formed within a public audience. Consequently, the political sphere and public institutions can be viewed as “institutions of intimacy.” Thus, intimacy can be understood as a political experience and a structure of feeling with social and political implications that can propel movements or discourses. Moreover, McGlotten (2012) explores how digital media becomes a crucial site for the formation of queer intimacies and sociality, arguing that virtual spaces and media technologies are not merely neutral platforms for communication but active sites where affect is cultivated and normative scripts of intimacy are challenged. This work highlights how these digital interactions allow for the development of radical relationalities and a reworking of hegemonic norms, showing that virtuality can be a site of both abstract possibility and immanent social potential. A postdigital framework requires understanding intimate relationalities as always more-than-human. Agency is not located solely in the human subject but is distributed across complex assemblages of individuals, objects, feelings, data, algorithms, and technologies.
In the world of VTubing, this is not an abstract theory but a lived reality. Intimacy is co-constituted through our intra-actions with the platforms, devices, and algorithmic architectures that shape our subjective worlds. No single element in this network is solely responsible for the “affective liquidity” we describe herein; rather, the connection emerges from the entire sociotechnical apparatus.
Much previous scholarship on intimacy centered on physical co-presence, sex, romance, and passionate love, often within marriage (Attwood et al., 2017). Contemporary discourses now include non-sexual relationships within family life (Chambers, 2013). Intimacy can occur in both long-distance relationships and moments of co-presence. Jamieson (2013) notes that shared experiences and physical proximity can deepen intimacy, and in the digital age, spending time together online can mimic this to some extent. Intimacy often requires disclosure and meaningful interaction, such as through chat rooms, email, and Internet telephony. Academic scholarship on digital intimacy has expanded to address digitally mediated human-to-human interactions like dating apps, online relationships, and cybersex (Banerjee and Rao, 2021; Dalessandro, 2018; Hobbs et al., 2017; Jamieson, 2013; Miles, 2017; Subrahmanyam and Šmahel, 2011). Scholars have also paid increasing attention to human-to-machine intimacy, particularly between human users and AI companions (e.g. Depounti et al., 2023; Ge and Hu, 2025; Liu, 2023; Liu and Wu-Ouyang, 2022). On platforms like Replika and XingYe, users engage in the co-creation of AI companions, which challenges traditional notions of parasociality by creating reciprocal, albeit algorithmically mediated, exchanges where intimacy is industrially produced and sold as a user-generated commodity (Ge and Hu, 2025). This commodification of vulnerability and the deliberate engineering of emotional bonds through ludic mechanics exemplify the urgent need for research that moves beyond purely human-to-human frameworks. Ge, Hu and Ouyang (2025) investigate the human-nonhuman intimacy between women players and virtual male characters in women-oriented video games and demonstrate that these affective bonds are not frivolous fantasies but meaningful, embodied emotional experiences that structure everyday life and gendered subjectivities.
However, relationships between fans and virtual figures remain underexplored despite their significance in actively reengineering the logic of connection and emotional engagement in virtual environments. Unlike virtual idols and AI companions, which rely on technological synthesis or decoupled performance, VTubers are defined by a singular sociotechnical hybridity: the persistent, live coupling of a human performer (naka no hito) with a digital avatar. From a philosophical standpoint, Yamano (2024) contributes to the Japanese debate by asserting a non-reductionist view (hi-kangen-shugi). This position maintains that a VTuber cannot be reduced to either the naka no hito or the fictional avatar, thus theorizing the existence of a unique, hybrid “third entity.” This unique blend of human creativity and digital persona creates a dynamic and immersive experience that distinguishes it from purely human-to-human or human-to-AI interactions. This study addresses this gap by examining how VTubers reconfigure digital intimacy through this specific form of posthuman engagement, exploring the nuanced relationships built in the space between the human-in-the-machine and the fan-on-the-platform.
Applying Braidotti’s (2019) theory of posthuman convergence, we view VTubers as posthuman subjects. This framework combines post-humanism—a critique of the universal “man”—and post-anthropocentrism—a critique of human exceptionalism (Braidotti, 2013, 2017). As the lines between digital, physical, and biological entities blur, Braidotti (2019) argues that our unprecedented intimacy with technology is at the core of our present condition. This posthuman convergence reshapes social and emotional landscapes, making this framework exceptionally suited for analyzing VTubers. It helps us understand how their performances, in a context where intimacy is gamified and subjectivity is entangled with non-human actors, contribute to posthuman knowledge production. To operationalize Braidotti’s (2019) posthuman theory, the subject must be reconceptualized through five interrelated qualities: as materially embedded, embodied, differential, affective, and relational. To be materially embedded is to reject abstract universalism, while embodiment requires decentering the notion of a transcendental consciousness separate from the body. Viewing the subject as differential means recognizing difference as a dynamic and productive category rather than a binary opposition. Emphasizing affectivity and relationality serves as a direct counter to the liberal humanist ideal of the autonomous, self-governing individual. Braidotti encouraged us to foster new ways of thinking about ourselves, enabling the formation of a transversal assemblage of human, non-human, and inhuman components. Posthuman knowledge is driven by transversality and heterogeneity, with multiplicity and complexity as guiding principles and sustainability as the ultimate goal.
Thus, we view VTubers as posthuman subjects, a “critical and a creative project” that challenges conventional understandings of what it means to be human (Braidotti, 2019: 49). As a new form of livestreamer, they are not bound by physical limitations and can embody diverse, non-human traits, blurring or combining races, genders, and species (Dazon, 2021; Hernandez, 2021). As Galbraith and Bookman (2025: 246) illustrated, VTubers occupy a “betwixt and between” space, blending the cultural logic of performance (constructing social selves) with animation (projecting human-like qualities into the environment). They serve simultaneously as animated idols (fictional characters) and idolized celebrities (real people). For instance, when a performer shares intimate details about living with a disability, the person inside becomes entangled with, and recognized through, the animated character. This self-representation is multifaceted, relying on a human naka no hito for performance and audience interaction, which facilitates smooth social engagement and creates a unique fusion of visual appeal, human-like performance, and non-human identity. Braidotti’s (2019) concept of structural relational capacity is central to understanding VTubers, as this framework argues that posthuman beings are defined by their ability to affect and be affected, forming connections with themselves, others, and the world. Posthuman subjectivity is an assemblage of biological, geological, and technological elements, a “zoe/geo/techno ensemble” (Braidotti, 2019). Seen through this lens, VTubers embody this relationality, fostering a shared sense of belonging and intimacy with the world (Braidotti, 2006).
In this sense, the VTuber is a quintessential posthuman subject with a dual nature: material (the human performer’s labor) and virtual (the digital avatar), whose performance exists within a more-than-human, more-than-digital context, where the lines between online and offline, human and non-human, are blurred and mutually constitutive. By emotionally engaging viewers, VTubers cultivate an intimacy that transcends traditional human relationships, reflecting the fluid dynamics of posthuman subjectivities and exemplifying the convergence of human creativity and technology.
This virtual embodiment distinguishes such figures from human livestreamers. Research shows that audiences perceive virtual and human influencers similarly in terms of visual and auditory capacities but view virtual influencers as having lower haptic, olfactory, and gustatory capacities (Zhou et al., 2024). Black (2008) argues that virtual idols literalize the idea of the body as a set of information. Specifically for female virtual idols, Black (2012) explains that they embody otaku culture’s (a subculture originating in Japan that revolves around obsessive interests especially in anime, manga, video games, and related forms of media and fandom) desires for technology, femininity (e.g. kawaii bishōjo/cute girl aesthetic), and media recycling. Virtual idols, as purely digital bodies, connect otaku obsessions with technology and artificial bodies, and fulfill fan desires through their existence as digital data, unlike corporeal idols who must be digitized. Both corporeal and virtual idols are digitally encoded, but virtual idols represent a more comprehensive translation of the corporeal body into digital information. Digital technology enables this encoding, digitizing, and transcoding of real-world elements (Thacker, 2003).
While existing research on virtual figures often focuses on AI-powered entities like virtual idols and influencers, we argue that VTubers represent a unique and compelling hybrid. They share sensory similarities with both human and virtual influencers but transcend the limitations of traditional virtual figures by incorporating live human performance through their naka no hito, which adds a crucial layer of authenticity and emotional depth. VTubers are not just digital puppets or human streamers in disguise; they are more-than-human, more-than-digital assemblages that merge the creativity of a human performer with the aesthetic and technical affordances of a digital persona. This blend creates a multifaceted experience, bridging the gap between the purely digital existence of virtual idols and the corporeal presence of human influencers. VTubers occupy a space where human agency and digital technology converge to create a distinct mode of engagement and connection.
Therefore, we ask: How is digital intimacy co-constructed in the more-than-human and more-than-digital context of VTubing, emerging from the interplay between the naka no hito’s live performance and the avatar’s technological mediation? And in what ways do fans perceive and engage with the relational dynamics of VTubers as posthuman subjects, and how does this interaction shape their sense of community and belonging within digital ecosystems? To answer these questions, we draw on Braidotti’s (2019) new materialist posthuman theory, which allows us to analyze how digital intimacy in VTubing is co-constructed as a more-than-human practice. By viewing VTubers as posthuman subjects (assemblages of the human naka no hito and the digital avatar), we can explore how the interplay between live performance and technological mediation shapes fan engagement. Braidotti’s framing of relationality is particularly valuable for analyzing our sample of female-identifying fans, whose voices are often marginalized in VTuber research that tends to focus on male audiences or general demographics. It helps us understand how female fans perceive and engage with VTubers, shifting authenticity from biological identity to a fluid, collective sense of belonging. This approach enables us to map how this unique form of intimacy reconfigures community and belonging within digital ecosystems, transcending anthropocentric notions of connection.
Scholars like Galbraith (2021) suggest that the virtual idol model in Japan extends a character culture that leverages television’s capacity to foster parasocial relations, populating digital and physical worlds with diverse actors (Lamarre, 2018), which allows fictional characters to achieve a “paradoxical reality” through fan desire and consumption (Saitō, 2011). This study moves beyond parasociality by incorporating live, reciprocal interaction between fans and the human performer. This two-way communication transforms the relationship from a one-sided fantasy into a co-constructed exchange. Furthermore, it differs from traditional affective labor because the intimacy is not solely the result of the human performer’s emotional work but is co-produced by the entire more-than-human assemblage, including the avatar, technology, platform, synthesized performances, and the intensified affective engagement of human users (both the human operator and the fans). Our exclusive focus on female fans thus provides an essential, critical perspective on how gendered experiences of vulnerability and desire are channeled and protected by the VTuber’s technological mediation. This more-than-human and more-than-digital approach allows us to answer our research questions by viewing VTubers as posthuman subjects and understanding how authenticity shifts from corporeal identity to a fluid, collective sense of belonging, ultimately redefining community in digital ecosystems.
Methodology
The VTuber phenomenon truly began with Kizuna AI in late 2016, sparking a trend in Japan that led to the formation of specialized agencies like Hololive Production, Nijisanji, and VShojo, which operate with business models akin to Japanese idol agencies. These agencies develop the characters and commercialize them through merchandising and promotional appearances. While “VTuber” references YouTube, these entertainers also use platforms like Niconico, Twitch, Facebook, Twitter, and notably, Bilibili. Although the scene was initially dominated by English- and Japanese-speaking VTubers, there is a growing presence of Chinese-speaking VTubers, especially on Bilibili, which has also attracted many English- and Japanese-speaking VTubers. By 2020, Bilibili hosted approximately 32,400 virtual idols (iiMedia Research, 2021), predominantly followed by a cohort born between the mid-1990s and the early 2010s who are fully immersed in digital technologies, social media, and mobile connectivity (Ahn and Jung, 2016), and who therefore show a distinctive appetite for virtual figures. For example, in August 2021, virtual singer Eileen’s birthday concert on Bilibili garnered significant fan contributions (Ge, 2022). Our study investigates human-to-posthuman digital intimacy between fans and VTubers on Bilibili, a major Chinese video-sharing platform. We chose Bilibili for its significant transcultural importance, stemming from its roots in ACG (anime, comics, and games) fandom and its role as a multicultural hub hosting diverse VTubers. Bilibili’s interactive features and content, including anime, gaming, and music, facilitate cultural exchange and attract an international audience, making it ideal for examining how digital personas connect different cultural landscapes.
We employed a mixed-methods approach, combining digital ethnography with in-depth interviews. Our digital ethnography on Bilibili, conducted from September 2024 to May 2025, involved deep immersion and participatory observation. We began with a 3-month period of lurking or passive observation, following key VTubers and joining fan groups to understand community practices and platform dynamics. Following this, we transitioned to active participation, engaging in chats during livestreams and contributing to fan discussions. Data gathered included archived interaction logs (textual, visual, and audio exchanges), detailed fieldnotes documenting platform-specific dynamics, and a self-reflexive journal to track the researchers’ dual roles.
Complementing this ethnographic fieldwork, we conducted 21 semi-structured interviews with female fans resident in China from January to March 2025. Participants, aged 18–25, were recruited through Weibo, RedNote, and snowball sampling. All of our informants were assigned female at birth and had consumed VTuber livestreams on Bilibili for at least 1 year, with weekly engagement ranging from 2 to 25 hours (see Table 1). Interview themes covered affective and sensory engagement, gendered representation, interaction and relationality, and perceptions of liveness and hybridity. Interviews were conducted face-to-face, via video calls, or via synchronous chats on WeChat, and were recorded and transcribed. The transcripts and observational data were analyzed using reflexive thematic analysis, recognizing that themes are constructed through researcher interpretation, not simply discovered. It is important to note that our recruitment strategies yielded a sample of 21 participants who all identified as women. Consequently, this study does not claim to represent the entirety of the diverse VTuber fandom. Instead, it offers a focused, in-depth exploration of how a specific and significant demographic (young, female-identifying fans) perceives, negotiates, and co-constructs digital intimacy with posthuman performers. This specific focus allows for a nuanced analysis of gendered desires, the negotiation and subversion of heteronormative scripts, and the cultivation of “safe spaces” for affective engagement online. To properly frame the structural nature of the fan-VTuber intimacy we focus on, we note that the VTubers in our study were primarily affiliated with agencies (e.g. Nijisanji, Hololive).
This organizational distinction is crucial because it ensures the curated authenticity necessary for fan investment: the corporate structure acts as a protective firewall, guaranteeing a safe distance by strictly separating the avatar’s intellectual property from the naka no hito’s vulnerable, real-world identity. This differs significantly from the negotiated intimacy often found in the independent VTuber sphere, where the relationship hinges more on the performer’s authentic self-disclosure than on corporate image management.
Table 1. VTuber project interviewee list.
Our reflexive thematic analysis followed the process outlined by Braun and Clarke (2019). First, we achieved data familiarization by repeatedly reading the interview transcripts and our ethnographic fieldnotes, making initial annotations. Second, we engaged in a systematic process of initial code generation, working inductively from the data. Using the qualitative data analysis software NVivo, we applied descriptive codes to segments of text to capture their semantic content. Throughout this stage, our self-reflexive journal was crucial for interrogating our own assumptions and analytical lens. Third, these initial codes were collated and mapped to identify broader patterns, which formed candidate themes. The fourth phase involved reviewing and refining these themes. This was a two-level process: we checked the coherence of the coded data within each theme and then evaluated how well the thematic map represented the dataset as a whole. To ensure consistency and credibility, this phase also involved peer debriefing, where emerging themes were discussed with a colleague familiar with the field. Fifth, once the thematic structure was finalized, we defined and named each theme, writing a detailed analysis of its scope and essence. Finally, these themes formed the basis of our findings. This systematic and reflexive approach ensures our interpretations are not just plausible but are also deeply grounded in the data and transparent in their construction. By integrating the themes derived from our interviews with insights from the digital ethnographic fieldwork, a form of methodological triangulation, our study offers a robust and multifaceted analysis of how postdigital intimacies are experienced and understood in the VTuber fan community.
Affective liquidity: virtual-real entanglement
This section explores how fans cultivate digital intimacy with VTubers as posthuman subjects, illuminating the emergent dynamics of human–posthuman relationality. While VTubers are hyper-mediated virtual constructs with meticulously designed avatars, fans find authenticity not in a hidden human identity but in the hybrid entity’s voice, performance, and affective interactions, particularly their provision of emotional support. Our ethnographic data illustrates a divergence from traditional fan-streamer bonds, which are grounded in shared human experience. Instead, VTuber–fan bonds often stem from a strategic fetishization of the “non-human,” reflecting posthumanism’s rejection of anthropocentric intimacy paradigms in the more-than-human and more-than-digital era. Central to this dynamic is fans’ investment in the transgressive aesthetics of VTuber avatars, which they describe as catalysts for affective resonance. These avatars use overloaded symbolic representations and stylistic hyperbole to deliberately break from anthropomorphic realism, privileging non-human artifice as the locus of emotional connection. Among Chinese VTuber fans, the VTuber’s visual persona is often referred to as the “skin vessel” (pi tao). For example, our participant Onyx favored a character with a lamp-shaped skin vessel over ubiquitous animal avatars, finding affective resonance in its hybrid mechanical-organic design (Figure 1). Onyx’s characterization of “warming and naïve feelings” highlights how this digital intimacy with non-biological entities provides therapeutic escapism.

Figure 1. Onyx’s favorite VTuber, a lamp-shaped male character.
A notable tension in our findings is that while participants valued the VTubers’ non-human alterity, they simultaneously privileged the authenticity of the naka no hito in sustaining immersive roleplay. This authenticity is tied to the performer’s adherence to diegetic consistency and their ability to embody non-human traits, which fans see as critical for fostering immersion.
Our participant Lumen’s reflection encapsulates this dual existence: “I’m aware of the naka no hito behind the avatar but there must be a deliberate differentiation from real-life mundanity.” Another participant, Cypress, highlighted VTuber Vox Akuma, a 400-year-old demon skin vessel (Figure 2). 1 Cypress noted how Vox’s virtuosic use of lore, vocal performativity in ASMR streams, and soundscaping builds a cohesive, novelistic diegesis that allows fans to inhabit the narrative, demonstrating how the human performer’s skill sustains the illusion.

Figure 2. Vox Akuma under Nijisanji (Anycolor Inc.).
The VTuber audience’s valuation of immersive roleplay exists in tension with a countervailing desire to interrogate the human-performer hybridity underlying digital personae that are at once virtual and real. As Chen and Hu (2024) theorize through their concept of “virtual breaking” (performative acts in which streamers momentarily disrupt avatar immersion to reveal their human-corporeal identities), audiences negotiate a dialectical engagement with synthetic embodiment. This tension manifests as a paradox: while fans demand rigorous adherence to avatar personae to preserve mystique, they simultaneously crave glimpses of the “real” performer. For instance, the Nijisanji VTuber Vezalius Bandage exemplifies this dynamic through his meta-performative “transparent hand” streams, where he theatrically manipulates real objects (e.g. croissants, see Figure 3) while declaring, “I’m not real, guys; I’m a ghost.” 2 This staged unmasking, which is achieved through augmented reality techniques (e.g. layered glove effects), does not rupture immersion but rather heightens it by foregrounding the artifice of virtual embodiment. As our interviewee Kairo observed, such acts are celebrated as “genius” precisely because they satisfy dual fan demands: affirming the performer’s technical ingenuity (exposing the “how” of illusion) while preserving narrative coherence (the virtual persona remains intact). Here, virtual breaking operates as a controlled disclosure, reconciling audience curiosity with the preservation of fictional integrity through strategically mediated revelations. This dual desire for both realism and fictionality is at the heart of the VTuber experience, where the audience actively participates in the construction of the illusion, celebrating the moments that both acknowledge the human behind the screen and reinforce the magic of the virtual character.

Vezalius grabs the croissants with the “transparent hand.”
Streaming platforms like Bilibili and Twitch are the primary sites for VTuber–fan engagement, but these interactions extend into physical spaces such as anime conventions, which serve as hybrid sites where the virtual and real merge. These events use interfaces like large screens and motion-captured projections to allow fans to interact with the VTuber’s virtual persona, maintaining the fiction of their non-corporeal existence. This is a performative synthesis that deliberately erases the virtual-real binary. For example, our participant Alder’s account of a 2024 convention appearance by the Chinese VTuber Aza describes a “techno-mimetic spectacle” mediated by a large screen. Through this portal, Aza’s avatar interacted with fans in real time at a local fan-meeting event (e.g. collaborative games, virtual selfies; see Figure 4). The screen here functioned not as a barrier but as a dynamic interface, a site where motion-capture systems and voice modulation technologies preserved Aza’s synthetic authenticity while allowing for contingent, embodied fan responses. This offline event amplifies the paradoxes of virtual intimacy: by being embedded in a shared physical space while remaining detached from human corporeality, the performance sustains the illusion of co-presence. Fans do not “touch” the VTuber but instead participate in a ritual of synthetic immediacy, where the screen becomes both the medium of virtuality and the site of its temporary suspension.

Aza’s offline event: taking a selfie with a fan.
These examples underscore the mutually constitutive nature of intimacy between fans and VTubers, wherein the demarcation between virtuality and reality dissolves into what we call affective liquidity. This phenomenon aligns with posthumanist critiques of rigid ontological binaries, as articulated in our participant Jade’s critical reflections. Through a posthumanist lens, Jade believed that her preference for non-human characteristics manifests as a deliberate disruption of human identity norms: her attempt to decenter anthropocentrism by embracing avatars that hybridize machinic and organic elements. Jade’s idea resonates with Hayles’ (1999) conceptualization of posthuman subjectivity, where the participant’s desire to challenge existing biological/social systems through virtual embodiments reflects critical engagement with transhumanist discourses. The blurred reality–virtuality binary in VTubing creates a liminal space for reconceptualizing human exceptionalism, particularly through what Jade called “performance beyond innate ‘human’ sense”: a demand for radical de-anthropomorphism that challenges Butler’s (1988) performance theory by privileging synthetic over biological performativity. Moreover, Jade’s reflection invites a critical re-examination of Barad’s (2003) conceptualization of agential realism, foregrounding its challenge to passive ontologies of matter. By positioning materiality as an agential force entangled with discursive practices, Barad’s framework destabilizes anthropocentric exceptionalism, redistributing agency beyond human actors to encompass the dynamic intra-actions of phenomena. This reconceptualization disrupts categorical boundaries, such as human/nonhuman, nature/culture, and subject/object, reframing them not as fixed binaries but as fluid, contingent outcomes of iterative material-discursive practices.
Agency, in this view, is neither a property of discrete entities nor confined to intentional human action; rather, it emerges through the co-constitutive processes of becoming (Braidotti, 2019), where matter and meaning are perpetually reconfigured within intra-active assemblages. Such an approach compels a shift from representationalist epistemologies toward an onto-epistemological account of how boundaries are enacted, contested, and re-negotiated in the very practices that materialize the world.
Thus, our concept of affective liquidity seeks to capture the unique quality of intimacy co-produced between VTubers and their fans. It describes a state where the boundaries between human performance and digital artifice dissolve, allowing affect to flow seamlessly across a more-than-human assemblage. This framework moves beyond the speculative futures of transhumanism, which focuses on technologically enhancing the biological body, to address the lived reality of more-than-human and more-than-digital intimacies. The affective liquidity embedded and embodied in such intimacies reveals that we inhabit an era where the digital and non-digital are already indistinct, creating the conditions for altogether different modes of embodiment and connection. Affective liquidity describes the present condition of relationality in a postdigital world, where the very substance of intimacy (our affects, desires, and sense of self) becomes a liquid medium, flowing between and co-constituted by human performers, digital avatars, algorithmic systems, and fan communities. It offers a more precise lens than transhumanism (see Fuller, 2017) for understanding how technology is not just enhancing the human body, but fundamentally reconfiguring the experience and circulation of affect itself.
Liquid persona, safe distance: a digitally mediated liquid affective intimacy
While fans are captivated by the non-human alterity of VTubers and the fluid affective bonds they form with these synthetic personas, a critical question emerges: How does VTuber–fan intimacy differ essentially and relationally from fan engagements with algorithmically curated virtual idols or corporeal human celebrities? Whereas marketing scholarship predominantly assesses human-like virtual influencers through anthropomorphic fidelity metrics (Arsenyan and Mirowska, 2021; Stein et al., 2024), otaku-oriented 2D-style VTubers are conversely valorized through their distance from human referents, privileging hyper-stylized artifice over mimetic realism. Black’s (2012) analysis of Japanese virtual idols elucidates this dichotomy, arguing that such entities epitomize the otaku culture’s techno-fetishistic synthesis of artificial bodies, computational mediation, and curated femininity. Unlike corporeal idols whose commodification hinges on digitization, virtual idols exist as digital data, their bodies materially constituted through transcoding processes that fragment and reconfigure corporeality into discrete, recombinable units (Thacker, 2003). These digitally encoded bodies function as intertextual databases, selectively sampling and remixing preexisting cultural signifiers to simulate affective resonance (Black, 2012). However, VTubers complicate this paradigm through their hybrid ontology. Unlike the purely algorithmic virtual idol, VTubers operate as heterogeneous assemblages: mediated 2D/3D avatars, fictional lore, and the real-time labor of a human performer. This human component, the anonymized performer’s voice and movements, is what fundamentally differentiates VTubers from purely digital characters like virtual idols. Instead of being a pre-programmed or machine-synthesized entity, the VTuber is a liquid interface where human agency and a synthetic persona coexist. 
This dynamic intra-action allows the performer’s live, embodied presence to dialectically negotiate with the virtual facade, creating a unique subjectivity that is both representational and genuinely embodied. This configuration resists stable categorization, highlighting the porous boundary between the real and virtual in digital performance.
Black’s (2012) analysis distinguishes virtual idols from human celebrities by emphasizing their dependency on algorithmic control: as digital data, virtual idols exist under the consumer’s material and affective dominion. Azuma (2009) frames this dynamic as database consumption—a mode of engagement where otaku audiences dissect media into modular components (hairstyles, vocal tropes, accessories) stored as recombinable data. Consumers oscillate between surface-level immersion in curated narratives (emotional satisfaction) and deep-layer deconstruction of systemic frameworks (creative reassembly). This process generates what Azuma (2009: 84) terms simulacra: new affective assemblages derived from rearranging commodified fragments (e.g. “beautiful girl game” characters). However, VTuber fandom hybridizes affective engagement with platform-enabled techno-economic agency. Unlike the static database consumption of virtual idols, VTuber audiences enact “parakin” reciprocity through financialized interactions: purchasing super chats to prioritize messages, subscribing for exclusive content, or acquiring branded merchandise.
Yan and Yang (2021) conceptualized parakin reciprocity as an interaction model where fans and idols on social media have a reciprocal relationship that mimics family kinship in the Chinese entertainment context. In this relationship, fans collectively cultivate their idols and assume a significant degree of influence over their careers and personal lives. For Yan and Yang, this is distinct from a parasocial relationship, which is a nonreciprocal, imaginary interaction where fans only receive content from media figures without any expectation of a response. In contrast, financialized relationships in otome games are nonreciprocal as they involve a player’s emotional and financial investment in a digital character, but the player’s influence is limited to in-game choices and does not extend to the character’s existence or the game’s overall narrative (Hu and Ge, 2025). Based on our data collected in the Chinese context, monetized acts by VTuber fans grant them influence over a VTuber’s performance, while VTubers reciprocate with affective labor, co-constructing a parakin intimacy. This entire dynamic can be understood through the lens of oshikatsu culture, where the act of dedicated support defines the fan’s identity and investment. Unlike traditional idol–fan relationships, this bond is characterized by curated detachment and a structural “safe distance” mediated by the avatar. As participants Lumen and Briar noted, this “safe distance” insulates fans from the potential disillusionment caused by a human performer’s real-world fallibility or controversies.
The curated authenticity is vital here, as fans are not deceived, but rather choose to authenticate the persona’s consistency and affective labor over the naka no hito’s life. Interviewee Helios added that a VTuber’s persistent mediation via a digital avatar eliminates vulnerabilities inherent to embodied self-presentation, such as fluctuations in physical appearance (e.g. makeup inconsistencies, fatigue-induced facial expressions) or contextual contingencies (e.g. lighting conditions, wardrobe malfunctions). The screen thus acts as both an interface and an affective buffer, creating a secure parakin space where fans can receive emotional support without the risks of over-identifying with the human performer. The decision by fans to invest their time and money in the VTuber—a key mechanism of oshi culture—is an intentional choice to reward this fabricated reality. This “safe distance” is particularly significant given our sample of female fans. It functions as a crucial affective buffer against the social and physical vulnerabilities, such as harassment, unsolicited advances, and the pressure to perform emotional labor, that women often face in both online and offline intimate encounters. The avatar’s artifice and the screen’s mediation allow for an exploration of intimacy and even eroticism without the embodied risk that accompanies corporeal interactions in a patriarchal society. The impossibility of physical co-presence is thus not a limitation but a feature, guaranteeing a space for connection without the threat of unwelcome physical or emotional encroachment.
Such “safe distance” in VTuber–fan intimacy enables transgressive engagements by allowing fans to explore non-heteronormative and eroticized interactions. The avatar’s synthetic nature and lack of a corporeal body provide a safe space for fans to adopt roles like mother, sister, or lover, subverting hegemonic gender norms through fictive kinship. This fluidity is evident in participants’ preference for avatars with gender-neutral aesthetics over hypersexualized or hypermasculinized designs, as articulated by our interviewee Kairo, who finds such designs “alienating.” This aversion to binary-coded corporeal signifiers was widely shared: the majority of women-identifying interviewees rejected both hypersexualized female avatars (e.g. infantilized Lolita designs) and hypermasculine male avatars, critiquing them as reinforcing heteropatriarchal gaze economies (Mulvey, 1975) and toxic and hegemonic masculinity (Connell, 2005). This rejection is not merely an aesthetic preference but can be interpreted as a political act of resistance. Participants resist legible categorization through deliberate ambiguity in voice modulation, sartorial choices, and narrative lore, rejecting the rigidity of the male/female binary and advocating for fluid gender expressions.
For instance, Iris shared her feelings about Shu Yamino (Figure 5), a black-haired, young boy figure with elongated nails and floral motifs, whom Iris calls “son.” She described him as “a soft, comforting presence untethered from gendered aggression.” 3 Similarly, Seren highlighted Genzuki Tojiro, a long-haired male avatar whose streams on skincare routines and emotional vulnerability subvert toxic masculinity’s performativity (Pascoe, 2007). These avatars’ popularity hinges on their queer-coded performativity, as seen in Finn, a top male Bilibili VTuber who merges mermaid lore with gender-neutral vocals and fluid choreography; his recent avatar (Figure 6) features winged tattoos and attire blending traditionally masculine (slim-fit shorts) and feminine (high ponytail) signifiers. The appellations “lovely wife” and “lovely baby” predominantly emerge as non-erotic affective signifiers within the live-streaming context of Finn, encapsulating fans’ attachment to him. This linguistic phenomenon reflects negotiated intimacy that strategically avoids sexualized connotations through infantilizing metaphors, a common mechanism in digital fandoms to establish pseudo-familial bonds. When probed on their rejection of hypersexualized designs, participants framed such avatars as complicit in “commodifying women into objects of male desire” (Onyx) and perpetuating “regressive fantasies of dominance” (Briar). For these fans, hyper-feminine avatars are seen as complicit in the objectification of women in mainstream media, while hypermasculine avatars embody a form of masculinity they find “intimidating” and “emotionally inaccessible” (Orion). Their gravitation toward avatars embodying gender opacity therefore reflects a desire for what Sedgwick (1997, 1999) calls reparative relationality, a mode of engagement that prioritizes care, emotional vulnerability, and connection free from the objectifying or threatening dynamics they associate with cis-heteronormative intimacies.
This collective preference underscores VTubers’ role as sites for reimagining embodiment beyond cis-heteronormative constraints, where gender neutrality operates as both aesthetic principle and ideological resistance.

Shu Yamino under Nijisanji, shared by the participant.

Finn, a member of Afaer under 729 Vocal Studio, a Beijing-based company, screenshot by the authors.
Second, besides the subversion of heteronormative gender performativity, the affective consumption of transgressive sexual fantasies through VTubers’ ASMR services also constitutes a dual mechanism of sociomoral escapism and subversive praxis. It exemplifies a form of paradoxical intimacy, simultaneously satisfying transgressive desires (BDSM roleplay, taboo scenarios) while upholding physical and psychological boundaries. The avatars’ non-human ontology facilitates what Bataille (1986) discusses as the transgression of prohibition: by situating fantasies within fictional universes unbound by human social mores, VTubers create a space where users can explore deviant eroticism without accountability. For instance, the VTuber Vox demonstrates this mechanism: his performative personae (policeman, priest, physician) employ hyper-stylized vocal affect, including breathy whispers, punitive sound effects (slaps, restraints), and narrative scenarios (interrogation, imprisonment), to simulate BDSM dynamics through auditory hyperreality. 4 As Smith and Snider (2019) illustrated, the somatic resonance of ASMR’s “tingling” sensations emerges through intentional engagements with hyper-specific digital objects (e.g. whispering videos, tapping simulations). This phenomenon occupies an interface in which feeling, as a “sense of things” (Anderson, 2006), is simultaneously embodied (internal neural responses) and technologically mediated (external audiovisual stimuli). Thus, ASMR’s affective charge transcends anthropocentric frameworks: it materializes through non-human assemblages (microphone vibrations) and non-co-present encounters (asynchronous viewership), challenging the primacy of human-to-human interaction in affect studies.
As our participant Jade noted, this disembodied eroticism parallels mechanics found in otome games, where the erotic gaze of female players is directed at a sexualized male character (Lai and Liu, 2023). In both cases, users safely project their desires onto fictional characters. The absence of physical reciprocity (“no physical contact,” as Jade emphasized) intensifies engagement by alleviating social vulnerability: users indulge carnal curiosity while maintaining emotional detachment, as our participant Kairo’s distinction between streamer–audience relations and real-world intimacy confirms. This phenomenon exemplifies a digitally mediated liquid affective intimacy, particularly observed among female participants, wherein arousal arises not through corporeal engagement but via the orchestrated dissolution of virtual-physical thresholds. Here, users exercise imaginative sovereignty (fantasy projection onto synthetic avatars) while maintaining emotional distance through nonreciprocal spectatorship, a dynamic that reconfigures desire as a hybrid artifact of technological mediation and psychic agency. The observation that this form of intimacy was particularly pronounced among our female participants is significant. It suggests that the “imaginative sovereignty” afforded by the VTuber–fan dynamic offers a unique appeal to an audience socialized to be cautious about desire and self-expression. The ability to engage with transgressive or erotic content (such as BDSM-themed ASMR) while maintaining emotional distance provides a space to explore fantasy on their own terms, free from the judgment, shame, or non-consensual dynamics often present in mainstream pornography and other male-gaze-oriented media. It is a form of intimacy that centers their agency and safety, a dynamic not always available in corporeal relationality.
As Smith and Snider (2019) noted, affect can emerge in encounters between bodies that are not necessarily human or strictly co-present. ASMR’s efficacy hinges on a technology of distance. The absence of corporeal co-presence allows viewers to immerse themselves in ASMR’s auditory-tactile stimuli without negotiating the performative demands of social reciprocity. Paradoxically, this mediated distance even intensifies affective engagement, enabling a safe deepening of sensory experience precisely because physical presence is abstracted through digital interfaces.
The reconfiguration of both erotic and non-erotic transgressive relationality discussed above resonates with McGlotten’s (2012) discussion of intimacy as both “embodied and carnal sensuality” and a “vast assemblage of ideologies” (1). The participants’ curated engagements with VTubers exemplify how digital intimacy operates as affective immanence: not merely emotional connection but a generative matrix where virtuality enables incipient social worlds (McGlotten, 2012) to materialize. When fans reject hypersexualized designs while cultivating parakin bonds through gender-neutral avatars, or enjoy taboo sexual scenarios through ASMR services, they actualize McGlotten’s contention that intimacy constitutes affect’s own immanence. Here, the virtual space becomes a comfort zone where normative pressures are reconfigured through intra-active practices: avatar design and performance choices dialectically shape and are shaped by users’ desires to transcend biological essentialism.
While this mediated distance provides safety for fans, it also introduces complex ethical questions regarding consent and the commodification of intimacy that warrant critical interrogation (Corren, 2023; Ge and Hu, 2025; Illouz, 2007; Varon and Peña, 2021). First, the safe distance per se moves beyond the notice-and-consent model, which ignores structural power imbalances and reduces consent to a neoliberal mechanism of self-management that is often meaningless in practice. Instead, it recasts digital consent as a retractable process. Second, this “safe distance,” while protective, complicates notions of digital consent and what Hochschild (1983) termed emotional labor, the management of feeling to create a publicly observable facial and bodily display. The parakin reciprocity built on financial transactions, such as super chats, creates an affective economy where engagement is explicitly commodified, transforming intimacy into a product for sale. While fans gain influence, the human performer is placed under immense pressure to perform a specific kind of continuous, monetized intimacy. This dynamic risks creating an exploitative environment, reflecting the precarity common in digital cultural industries where performers’ well-being is secondary to audience demands for constant emotional availability.
Conclusion
This study interrogates how VTubers reconfigure intimacy through the liminal interplay of behind-the-scenes human performance and digital artifice, revealing two interconnected dynamics: the affective liquidity of their performances and the screen’s role as a dialectical interface. Our empirical findings show that fans value the curated sincerity of the digital persona over the human performer’s biological identity, challenging anthropocentric models of intimacy. On the one hand, the concept of affective liquidity advances existing frameworks by demonstrating how authenticity is not tied to a stable human subject but is instead co-constructed by a more-than-human assemblage, where vocal, visual, and technological elements fluidly dissolve traditional binaries of real and virtual, human and posthuman. On the other hand, the screen acts as a dialectical interface, serving as both an immersive portal for fans to explore fantasies and subvert gender norms, and a safe buffer zone. By engaging with VTubers as posthuman subjects, this digitally mediated intimacy challenges anthropocentric models and prioritizes relational belonging over biological essentialism.
While intimacy is always-already scripted through normative ideologies (e.g. heteronormativity, familialism), its virtuality—as both digital abstraction and immanent potentiality—enables interventions in these scripts. Hence, virtuality here signifies intimacy’s capacity to exceed its material instantiation, operating as a site where hegemonic relations might be reworked or radical relationalities cultivated (McGlotten, 2012). By framing VTubers as embodiments of posthuman becoming, this study urges scholars to critically interrogate how human-technological assemblages expand the spectrum of intimacy, destabilizing rigid binaries such as human/non-human and virtual/real. Such reconfigurations challenge conventional boundaries, inviting interdisciplinary dialogue on how synthetic personas and algorithmic mediation might refigure relationality in the postdigital age. However, we must resist a purely techno-utopian reading of these synthetic bonds. Intimacy per se is not inherently emancipatory; it can be ambivalent and even exploitative and it may manifest as non-consensual, violent, or banal, necessitating vigilance against techno-utopian narratives. This duality demands a redefinition of intimacy not as a static ideal of warmth or reciprocity, but as a contested terrain of relational entanglements—negotiations between selves, others, and increasingly collapsed digital-physical environments.
This study focuses exclusively on female fans of VTubers. While this has allowed for a deep analysis of their specific gendered experiences, future research is urgently needed to explore how gender-diverse fans engage with VTuber personas in varied modes of affective engagement. A comparative analysis would be particularly valuable, examining how different gender identities may shape the negotiation of authenticity, the desire for safe distance, and the consumption of transgressive content within these more-than-human intimate assemblages. Moreover, Turkle’s (2021) critique of digital mediation as a threat to empathy and authentic connection, particularly for marginalized groups, underscores the urgency of balancing optimism with critical scrutiny. While VTubers exemplify the potential for technology to enable transgressive or hybridized intimacies, their platformed nature risks commodifying emotional labor or exacerbating isolation. Scholars should investigate how platform governance and algorithmic curation on sites like Bilibili and YouTube actively shape the nature of VTuber intimacy. Future research could also conduct cross-cultural comparisons of VTuber fandoms, examining how different socio-political contexts influence the negotiation of authenticity and relationality. This bifocal lens would both celebrate the creative potential of these hybrid intimacies and rigorously scrutinize the structural and ethical implications of their platformed nature. In sum, this study advocates for a critical posthumanism, one that neither fetishizes nor dismisses digital intimacy, but instead maps its contradictions, potentials, and perils as we navigate an increasingly hybridized social landscape.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Ethical statement
Ethics approval was granted by the Ethics Committee of Xi’an Jiaotong-Liverpool University on 07 December 2022 (No.: ER-HSS-11000072320221202105254). The procedures employed in this study comply with the principles outlined in the Declaration of Helsinki.
Informed consent
All participants were required to thoroughly review the provided information detailing the project’s aim and procedures, and to sign the consent form prior to participating in the interview.
AI usage statement
In the preparation of this article, we utilized generative AI tools to assist with several tasks. Specifically, AI was employed for organizing interview materials, and for comprehensive typo correction and proofreading of the manuscript. It is important to note that generative AI was not used for the development of any ideas, arguments, or analytical insights presented in this research. All conceptual content and original thought originated from the human authors.
