Abstract
The video game Baldur’s Gate 3 adapts the fictional setting and rule systems of the classic tabletop roleplaying game Dungeons & Dragons. For most of its branching three-act story, the music of Baldur’s Gate 3 serves the conventional nondiegetic purpose of supporting narrative immersion and communicating key gameplay information to players. However, Baldur’s Gate 3 also employs a range of metareferential musical devices which foreground the artificiality of the medium with varying impacts on the player experience. This article explores how the use of ‘meta-music’ in Baldur’s Gate 3 contributes to its digital representation of the tabletop roleplaying experience while also navigating the anti-illusionistic risks that this entails for immersive storytelling. I begin by reviewing the established relationship between music and immersive storytelling across video games and tabletop roleplaying games (TTRPGs) alike. I next consider the metacommunicative nature of TTRPGs and how this aligns with the unique potentials of metareferential music within interactive contexts. The article then proceeds with a detailed accounting of the diverse meta-musical techniques encountered in Baldur’s Gate 3, from musical easter eggs and transdiegetic leitmotifs to villains who perform their own nondiegetic theme music. I argue that these strategies evoke the aesthetic experience of playing Dungeons & Dragons by eliciting states of heightened meta-awareness and open creative experimentation that characterise TTRPGs. In doing so, the ‘meta-music’ of Baldur’s Gate 3 bridges the extra- and intratextual worlds to support a specific interactional immersion that succeeds, in most cases, in deepening the storytelling experience. I conclude by arguing for further research into player perceptions of metareferential music and considering what the case study of Baldur’s Gate 3 reveals about the study of video game music at large.
Introduction
Following 6 years of development, the roleplaying game Baldur’s Gate 3 (Larian Studios, 2023a) released to wide critical and cultural acclaim. It attracted the highest score from PC Gamer magazine in 16 years (Savage, 2023), garnered a host of prestigious accolades from the industry awards circuit (Chalk, 2024; Gerken and Cieslak, 2024; Jones, 2023a), and even shaped pop-cultural discourse on topics as diverse as trauma (Hernandez, 2023), romance (Brierley-Beare, 2024), and representation (Carpou, 2023). In one respect, this far-reaching impact can be attributed to the remarkable scope and complexity of its interactive narrative and gameplay systems. Sophistication in these areas is expected of contemporary RPGs (Schules et al., 2024), yet Baldur’s Gate 3 has for many established a new standard in the genre (Jones, 2023b). Also notable, however, is that Baldur’s Gate 3 represents an accessible, high-budget adaptation of Dungeons & Dragons (Wizards of the Coast, 2014), the tabletop roleplaying game (TTRPG) regarded as the progenitor of the genre (Mäyrä, 2017) and which now enjoys renewed cultural relevance following the popularity of TTRPG streamers (e.g. Critical Role, High Rollers) and licensed media such as Dungeons & Dragons: Honor Among Thieves (Daley and Goldstein, 2023).
Players of Baldur’s Gate 3 (BG3 herein) are thrust into the expansive high-fantasy universe of the ‘Forgotten Realms’, a Tolkienesque world of magic, monsters, and mystery considered the most popular fictive setting from Dungeons & Dragons (D&D herein). After creating a custom avatar or assuming the role of one of seven pre-defined protagonists, they chart a personal path through a branching three-act story comprising a rich cast of characters, environments, and an astounding 174 hours of potential cinematic dialogue (Larian Studios, 2023b). Along this journey, players can make or break alliances, fall in love, and explore diverse moral personas as they shape a dynamic narrative which culminates in one of many possible fates for their avatar, companions, and the realm itself. By affording this high degree of narrative agency, BG3 aspires to emulate – at least to the extent possible within a bounded computer game – the freedom of storytelling engendered by the ‘pen and paper’ format of D&D (White et al., 2024), wherein narrative outcomes are constrained only by the collective imagination and willingness of its participants.
The challenges of providing musical accompaniment for such vast works of interactive fiction are well understood in the study of video game music (Collins, 2008; Gibbons and Reale, 2019; Summers, 2016), also known as ludomusicology. Total playtime for story-driven RPGs can exceed 100 hours (HowLongToBeat, 2024), exacerbating the resource constraints already placed on pre-composed audio and risking musical repetitiveness (Worrall, 2024). For each indeterminate moment of this experience, players then import the filmic expectation (Chion, 1994; Gorbman, 1987) that nondiegetic music should seamlessly adapt to support the present emotional context (Van Elferen, 2016), no matter the choices made by the player nor the length of idle deliberation before they make them (Walden, 2023). In Western fantasy RPGs, the prevailing ‘musical literacy’ (Van Elferen, 2016) also includes a challenging stylistic expectation for ‘dynamic symphonies’ (Rauscher, 2013: 99-102) and ‘thunderous orchestral scores’ (Phillips, 2014: 88) replete with ‘full symphonic textures, leitmotif systems, and late-Romantic harmonic language’ (Gibbons, 2017: 415). This tradition is jointly influenced by the classical stylings of Hollywood fantasy film music and a concert repertoire of late-Romantic and Modernist composers (Gibbons, 2017: 415), resulting in a ‘strongly tonal and expectation-based form almost uniquely unsuited to the temporal uncertainty of games’ (Stevens and Raybould, 2014: 149). Composers of RPGs have nonetheless risen to these challenges for over 20 years (see Gibbons, 2017: 423-425), producing emotive cinematic soundtracks capable of adapting to unpredictable player interaction. In contemporary RPGs, this is made possible through the work of sizeable audio teams (Walden, 2023) using advanced audio middleware such as Wwise (Audiokinetic) to implement a range of dynamic music techniques (see Collins, 2008; Phillips, 2014; Stevens and Raybould, 2014).
Borislav Slavov’s award-winning music for Baldur’s Gate 3, in most respects, rests firmly in this tradition. It invokes the aesthetic stylings of prominent fantasy RPGs such as Dragon Age: Origins (Bioware, 2009: composer Inon Zur) and The Elder Scrolls V: Skyrim (Bethesda Game Studios, 2011: composer Jeremy Soule), deploying a wide orchestral palette, driving percussion, and a mixed choir to colour the game’s action, environments, and narrative arc across a collection of recognisable themes. Unlike its predecessors, however, BG3 aspires to imbue its game and narrative design with the idiosyncrasies of the D&D tabletop experience. Players roll virtual dice to resolve key dialogue junctures, engage in turn-based strategic combat per the rules of D&D’s fifth edition (Wizards of the Coast, 2014), and are accompanied by an omnipresent narrator who comments on events and character perceptions in the manner of a ‘dungeon master’ (Zagal and Deterding, 2024: 29). It is here that the music of Baldur’s Gate 3 departs from convention, exhibiting a small but novel assortment of ‘metareferential’ (Wolf, 2009) musical devices ranging from minor musical ‘easter eggs’ (Mago, 2019) to explicit transgressions of the diegesis (Pier, 2014a). In this article, I explore how these meta-musical strategies can contribute to BG3’s authentic representation of a TTRPG experience and the player’s engagement therein.
I begin with a brief overview of the role of music in RPGs and its relation to notions of storytelling, immersion, and the diegetic boundary. I then explore the metacommunicative nature of gameplay and its particular significance to the TTRPG aesthetic that BG3 seeks to emulate. After explicating a conception of ‘meta-music’, I proceed through an autoethnographical account of the meta-musical devices in BG3, highlighting media precedents and discussing the risks and benefits posed to the intended immersive experience. I conclude by evaluating the broader strategy of musical metareference within Baldur’s Gate 3 and discussing what such analyses reveal about contemporary approaches to the study of video game music.
Background
Music and immersion in RPGs and TTRPGs
Examples of terminology used to contrast the narrative and functional dimensions of games. Texts focused on the role of music and sound in video games are marked with an asterisk.
The first of these roles – supporting the fictive setting, narrative, and emotional experience – assumes particular importance in the context of story-driven RPGs. As BG3 epitomises, RPGs strive to establish a rich and coherent fictional world populated by engaging characters, events, and environments so that players can immerse themselves into the lives and journeys of their avatars over multiple extended play sessions. The nondiegetic music of RPGs contributes significantly to upholding this fictional immersion. It anchors the player to an imagined time and place, interprets emotional context, sutures breaks in the narrative experience (Kamp, 2016), and unifies the narrative themes and structure (Gorbman, 1987; Summers, 2016). Nondiegetic music can be utilised for similar functions during TTRPG sessions (Borecky, 2021; Mariucci, 2024). The D&D player’s handbook suggests the use of music or sound effects to ‘set the mood’ (Wizards of the Coast, 2014: 6), which players often oblige by curating personalised playlists (Knox, 2018) or operating web-based tools such as ‘Tabletop Audio’ (Roven, 2015) or ‘BattleBards’ (BattleBards, 2024). As this process can be mechanically demanding for players, researchers have developed intelligent systems for automatically selecting or generating background music aligned with the detected emotional state of real-time player speech (Ferreira et al., 2020; Padovani et al., 2021). Others have even designed physical music-generating interfaces, styled as dodecahedral game dice, to replace anachronistic media devices (e.g. laptops, mobiles) on the tabletop with the aim of better supporting player immersion (Berndt et al., 2017).
Despite being a crucial responsibility of RPG music, the fictional experience offers only a partial accounting of player ‘immersion’. The notion of a player’s engrossment in a fantasy world has been explored as ‘imaginative’ immersion (Ermi and Mäyrä, 2005), ‘narrative’ and ‘affective’ involvement (Calleja, 2011), and immersion into an ‘environment’ or ‘character’ (Bowman, 2018). Immersion, however, is a much broader psychological phenomenon encompassing one’s attention and engagement with an activity itself: the playing of a game, not just the roleplaying of a fiction. Game scholars have described this experience as both a ‘challenge-based immersion’ (Ermi and Mäyrä, 2005), grounded in the desirable alignment of skill and difficulty that characterises the ‘Flow’ state (Csíkszentmihályi, 1990), and as a ‘ludic involvement’ (Calleja, 2011), in which players derive satisfaction from expressing agency within a rule system to achieve their goals.
RPG music supports this ludic sense of immersion through its second role of communicating gameplay information to players. While video games primarily rely on visual cues and informative sound design for this purpose (Ng and Nesbitt, 2013), nondiegetic music still imparts meaningful feedback on the player’s performance and progress within the game. In BG3, for instance, a victorious end to combat is almost always denoted by a triumphant brass stinger which affirms the player’s success. TTRPGs also deploy music for comparably informative functions, though are less inclined to rely on such techniques due to inherent tensions within their format. Tabletop sessions often take place in noisy environments such as conventions or comic book stores (Mariucci, 2024), and must in any case contend with the chattering of players themselves which frustrates the effectiveness of informative musical cues. Further, to operate a sound system more complex than an ambient music playlist – such as a live soundboard or voice modulation (Jensen, 2020) – requires greater preparation, knowledge, and resources than are easily accessible to many. The result is that directly informative uses of music are less frequently employed during TTRPG sessions, leaving fictional immersion as a much clearer point of musical alignment between the two genres.
Given the strong ties between music, narrative, and immersion in both RPGs and TTRPGs, an overview of how this relationship interacts with the diegetic model (Gorbman, 1987; Neumeyer, 2009) is warranted. The conceptual distinction between diegetic and nondiegetic music in film is neither rigid nor stable (Stilwell, 2007), but has survived repeated challenge and revision (Sbravatti, 2016; Stokes, 2019) as an elegant heuristic for interpreting the narrative functions of music. Video games further complicate this model, not only by introducing extra-narrative musical functions, but via the participatory nature of the medium itself (Jørgensen, 2011). Ostensibly nondiegetic music can influence the player’s actions within the diegetic world (e.g. by alerting them to danger) and vice versa, rendering the boundary between the two quite permeable. Despite these nuances, players have developed an acute ‘musical media literacy’ (Van Elferen, 2016) and face no difficulties in situating music within or without the diegesis, should they consciously acknowledge it at all. This is what enables Baldur’s Gate 3 and Dungeons & Dragons to make judicious use of nondiegetic music for discursive and immersive purposes without confusing or interrupting the player experience.
Of course, the music of Baldur’s Gate 3 does not strictly obey this conventional model. Without neglecting its primary responsibility to support the game’s fictive setting, the score for BG3 commits several playful transgressions of the diegetic boundary (explored shortly). These ‘meta-musical’ moments vary in purpose and intensity, but can be unified under a common conceptual strategy: to evoke the uniquely metacommunicative aesthetic of playing Dungeons & Dragons.
Metacommunication and the TTRPG experience
Play is a process of ‘metacommunication’ (Bateson, 1976; Neitzel, 2007; Salen and Zimmerman, 2003). Each act of play communicates meaning (e.g. ‘I’m attacking you!’), but also includes the message that ‘this is merely play’ and so delimited from the rules and rituals of the serious world. As Bateson has observed of animal play-fighting: ‘the playful nip denotes the bite, but it does not denote what would be denoted by the bite’ (1976: 69). Game scholars have drawn on these insights to describe the ‘self-referential’ nature of playing games (Neitzel, 2007; Santaella, 2007), suggesting that players necessarily enter into a ‘double-consciousness’ (Salen and Zimmerman, 2003: 449) where they are at once a character within the gameworld and an external participant aware of its artificiality. Even in single-player games, players inhabit both cognitive frames and engage in an interplay of meaning-making between the two. The music of BG3, like most RPGs, aids these processes by the conventional means outlined prior: enriching the fictional world (i.e. supporting the character), and communicating game information (i.e. supporting the player).
Playing a TTRPG engenders similar metacommunicative and self-referential processes, though with greater complexity and to an arguably more vivid degree. This is due to the many distinct ‘frames’ (Fine, 1983) or ‘levels’ (Mäyrä, 2017) of interaction at which players must operate to uphold the heavily dialogic and imaginative format of TTRPGs. As Mäyrä (2017) outlines, players variably communicate with one another at: (1) the social level, the outermost frame of everyday conversation external to the game; (2) the gameplay level, where players discuss mechanical matters such as when and which dice to roll; (3) the narration level, where players describe the actions or thoughts of their fictional characters; and (4) the character level, where players speak as their fictional characters directly. Player focus can oscillate quite rapidly between these frames in any given moment of a TTRPG session; for a transcribed example, see Mäyrä (2017: 275-276). Exacerbating this further is the tendency for TTRPG sessions to attract humorous interjections, inside jokes (Adams, 2013), and cross-cultural references (Scroll for Initiative, 2021) from their participants, which can originate from any frame of play (e.g. introducing an in-game character that satirises a real-world individual) before rippling amongst the others and redirecting the experience in unpredictable ways.
The activity of playing Dungeons & Dragons is thus one of heightened metacommunicative awareness and interaction across distinct frames of play. This is not to suggest that maximal alternation between communicative levels represents an ‘ideal’ D&D experience. Players have diverse motivations for engaging with TTRPGs (Bartle, 1996; Martinho and Sousa, 2023), and those who prioritise narrative or embodied immersion (Pintar, 2023) would likely prefer to minimise disruptions to the enacted diegesis. It simply highlights that the average D&D experience, when compared to most RPGs, is characterised by a more frequent and explicit acknowledgement of distinct narrative frames and the player’s mobility between them. Recalling that BG3 aims towards a digital adaption of the D&D experience, these metacommunicative tendencies in turn afford a licence for BG3’s music to play beyond the bounds of its usual diegetic and nondiegetic roles.
Musical metareference, metalepsis, and illusion
As I have prefaced, the music of Baldur’s Gate 3 employs ‘metareferential’ (Wolf, 2009) strategies to establish conceptual links with the metacommunicative aesthetic of playing Dungeons & Dragons. This bears some terminological disambiguation, beginning foremost with the notion of ‘meta’ in itself. In his seminal volume on the topic, Wolf (2009) explicates the phenomenon of ‘metareference’ as a ‘transmedial form of self-reference’ which ‘actualises a secondary cognitive frame in the recipient’ (Wolf, 2009: 31). Wolf prescribes this as a broad, media-agnostic category encompassing all ‘meta-phenomena’ in the arts and media, and it is with the same breadth of scope that I invoke the terms ‘metareference’ and ‘meta’ herein. In the context of BG3, I consider a musical strategy to be ‘metareferential’ if it gestures intentionally, by any means, towards the artificiality of the game. Metareferential music need not draw specific attention to its own construction, such as the underlying processes of composition, production, or implementation into the game engine; it need only contribute to eliciting the player’s ‘meta-awareness’ (Wolf, 2009: 31) of any facet of the game experience. Likewise, metareferential music is not confined to overt breaches of the fourth wall. In a game where some villains perform their own nondiegetic boss music (see section ‘Raphael’s final act’), Larian Studios’ inclusion of music from their past franchises (see section ‘Musical easter eggs and intertextuality’) elicits a comparatively gentler intertextual awareness which nonetheless foregrounds BG3’s mediality. Both gestures, though differing notably in degree, represent ‘meta-musical’ devices.
Within this category of metareference, the technique of ‘metalepsis’ (Genette, 1980) bears particular significance to the music of BG3. Metalepsis is a narratological phenomenon describing any transgression between ‘narrative levels’ (Pier, 2014b) or ‘ontologically distinct subworlds’ (Wolf, 2005: 91), usually committed with intention. Metalepsis can occur in either direction (Fludernik, 2003; Wagner, 2002), be it narrators ‘descending’ into the world of their diegesis or fictive characters ‘ascending’ to the ‘extradiegetic’ (Pier, 2014b) level of narration (i.e. the level of nondiegetic music). Notable pop-cultural examples include the film Stranger Than Fiction (Forster, 2006), whose protagonist discovers that he is a character in a novel and eventually meets his author, or the more recent film series Deadpool (Miller, 2016), whose self-aware protagonist frequently quips to the audience about the film’s production. Metaleptic gestures can also be differentiated by the severity of their impact on the narrative hierarchy (Campora, 2014; Ryan, 2006). ‘Rhetorical’ metalepsis represents a ‘quick glance’ (Ryan, 2006: 207) across levels before the diegetic boundaries are reasserted, whereas ‘ontological’ metalepsis ‘opens a passage between levels that results in their mutual contamination’ (Ryan, 2006: 207). To explicate the latter, Campora (2014: 125) cites the film Ringu (Nakata, 1998), in which a character from a film-within-a-film crawls out of a television into the diegetic world of the main characters.
The capacity for music and sound to act as a metaleptic agent has been widely discussed in film (Bouvrie, 2023; Campora, 2009; Heldt, 2013d; Luko, 2021; Wolf, 2019). This is less the case in video games despite a slew of canonised precedents. In one exception, Ivănescu (2024) contrasts two cases, Portal (Valve, 2007) and Doki Doki Literature Club! (Team Salvato, 2017), where the game’s antagonist performs a song for the player during the closing credits. The broadest conceptions of musical metalepsis encompass musical anachronisms, clichés, and other weak allusions to the artificiality of the medium in which the music is situated (Bouvrie, 2023). For clarity, I will situate such devices as more generally ‘metareferential’. In the context of discussing BG3, I reserve the term ‘musical metalepsis’ to denote musical strategies which intentionally cross or confuse the boundary between the diegetic and nondiegetic levels. For instance, when the player discovers a bartender idly humming the main theme from the BG3 soundtrack (see section ‘The lyre, the bard, and the leitmotif’), this constitutes a ‘descending rhetorical metalepsis’ (Heldt, 2013c: 57; Luko, 2021: 248). The nondiegetic score reaches down into the awareness of the diegetic characters, though only as a passing whimsical reference which otherwise maintains the boundaries of the fiction and its normative musical roles.
Such movements are not at all uncommon in the music of visual media. As Stilwell (2007: 196) remarks, diegesis is often ‘sound-permeable’ at its boundaries, and Heldt (2013c) recounts several filmic examples where nondiegetic music osmoses freely into the narrative world. For video games, Jørgensen (2011) suggests that much of game sound is in fact ‘transdiegetic’, a new communicative frame which transcends the diegetic division by merging ‘usability value’ with the ‘represented universe’ (Jørgensen, 2011: 96). She argues further that game sound even plays a crucial role in connecting nondiegetic gameplay information, or the ‘gamespace’, with the diegetic setting, or ‘gameworld’ (Jørgensen, 2011: 91). The transdiegetic relationship becomes clear in sequences where nondiegetic music informs the player about the game state (e.g. combat has begun), which then influences their character’s diegetic actions (e.g. fight or flee), in turn determining future nondiegetic music, and so on. This again highlights that musical notions of diegesis are not always stable, and that the metaleptic language of ‘transgressing boundaries’ is perhaps not always the most appropriate metaphor (Stilwell, 2007: 186). Nonetheless, musical metalepses leave a distinct perceptual mark on the gameplay experience, alerting one’s ‘meta-awareness’ to the fact that something peculiar has occurred within the artifice which runs contrary to one’s medial expectations.
Crucially, this metaleptic effect can be ‘illusion-building’ or ‘illusion-breaking’ (Pier, 2014a: 338) depending on the medium and conditions by which it arises (Ensslin, 2011). By default, metalepsis is presumed to be ‘anti-illusionistic’ (Pier, 2014a: 338), ‘paradoxical’ (Klimek, 2010), and likely to cause ‘laughter, bewilderment, or…surprise’ (Ensslin, 2011: 6) due to its characteristic rupturing of established narrative boundaries. Ensslin (2011) describes this as ‘divergent’ metalepsis, a subversive device with a rich history in the music of comedy films (Bouvrie, 2023) and television (Stokes, 2019) where the aim is to amuse or expose clichés in the form. In interactive contexts, however, metalepsis can conversely strengthen the ‘illusory connection between the extra- and intratextual worlds’ (Ensslin, 2011: 16), contributing to immersion via one’s feeling of ‘experientially participating in a representation’ (Wolf, 2013: 121). Ensslin (2011) instead describes this as ‘convergent’ metalepsis, blurring diegetic boundaries to further the illusion of ‘co-presence, colocation and conversation between the virtual and the actual’ (Ensslin, 2011: 16). It is in this respect that the music of Baldur’s Gate 3 can contribute to the representation of a D&D tabletop experience: by working as a metaleptic agent to unify the metacommunicative aesthetic of TTRPG interaction with the diegetic hierarchy and gameplay systems expected of BG3 as a modern RPG.
Of course, the success of any metaleptic strategy is as much a matter of player perception as authorial intent, particularly when the unintended effects of ‘disruption’ (Wolf, 2005: 103), illusion-breaking, or simply bemusement are so readily associated with the technique. As such, the following account of meta-musical devices in BG3 derives heavily from my own subjective experience with the game over multiple complete playthroughs. Where relevant, I also draw on an informal digital ethnography of BG3 fan communities, primarily the Reddit community ‘r/BaldursGate3’, to capture qualitative player perceptions of each musical strategy and elucidate the risks they entail.
The meta-music of Baldur’s Gate 3
The lyre, the bard, and the leitmotif
The first music heard by players in Baldur’s Gate 3, and which greets them at the main menu with each subsequent play session, is ‘Main Theme, Pt. 1’ (Slavov, 2023e) from the official soundtrack. The cue begins with three low, lamenting vocal chants, interspersed by ominous string glissandi, before introducing material that players will grow to recognise beyond all else in the game’s score: a short, descending leitmotif in the Aeolian mode (see Figure 1), rendered in this instance by emphatic drums, brass, and a mixed choir. To describe this leitmotif as pervading the nondiegetic music of BG3 feels almost an understatement. It arises in varied harmonic, metric, and timbral contexts (see Figures 2–4 for examples), underscoring moments as diverse as the initial character creation, narrative cinematics, combat, idle exploration, dialogue, and sex scenes. The leitmotif also concludes the player’s long journey, sounding the final, triumphant musical gesture before the credits roll. To belabour the point, even the credits themselves commence with a contemporary, lyricised arrangement which takes the leitmotif as its chorus (see Figure 4). In its nondiegetic usage, this leitmotif does not denote a specific character or event, but rather the gameworld, narrative, and the player’s journey itself.
Figure 1. Abstraction of the main leitmotif from Baldur’s Gate 3, as heard in ‘Main Theme, Pt. 1’ [starting 0:19] (Slavov, 2023e).
Figure 2. Melodic passage from ‘Who Are You’ [starting 1:27] (Slavov, 2023l), concluding with the main leitmotif of Baldur’s Gate 3 (bars 7-9).
Figure 3. Melodic passage from ‘Down by the River’ [starting 0:16] (Slavov, 2023a), a metric variation of the passage from Figure 2 which concludes with two repetitions of the main leitmotif (bars 4-7).
Figure 4. Melody for the first chorus of ‘The Power – Credits Song’ [starting 1:06] (Slavov, 2023h), comprised entirely of the main leitmotif for Baldur’s Gate 3.
Of interest here is that the leitmotif passes into the diegesis of BG3 on several occasions. As with film (Heldt, 2013d), this metaleptic manoeuvre is not uncommon in video game music. In The Legend of Zelda: Ocarina of Time (Nintendo, 1998), for instance, players must find, memorise, and perform simple melodies which are both prefaced and elaborated on by the nondiegetic score for each respective game area (Summers et al., 2021). Notable in BG3, however, is the varying purpose and boldness with which these transgressions occur. Some players will encounter this as early as creating their character. One of the many classes available to players is the ‘Bard’, a magical minstrel whose modern fantasy archetype centres on diverse forms of musical roleplaying (Cook, 2019; Johnson, 2024). When selecting the Bard, players can audition up to five starting instruments to hear them perform either the leitmotif itself or its close harmonic accompaniment. Already the player may notice a subtle metalepsis here, as the leitmotif can also be heard throughout the looping playlist of background music used during character creation.
Should the player continue as a Bard, they will soon discover the ability to perform music at will within the diegetic world. Systems for diegetic music performance are a rich tradition in RPGs (Olivetti, 2015). The most complex of these, found in The Lord of the Rings Online (Turbine, 2007), offers both freestyle melodic input and the automated playback of customisable sound files (Cheng, 2012). BG3’s comparatively simple system only enables players to start or stop performances from a small pre-defined repertoire, yet each of these acts is metaleptic. Of the three songs included in BG3’s standard edition, ‘The Power’ takes the leitmotif directly as its chorus, while the remaining pieces (‘Old Time Battles’ and ‘Bard Dance’) are other recognisable cues from the nondiegetic score. Notably, players do not need to engage with the Bard class or BG3’s musical performance system to draw the leitmotif into the diegetic world. In one narrative pathway, players of any class can acquire the ‘Spider’s Lyre’, a special instrument which serves as one solution to an early-game plot obstacle. If players use this lyre in the appropriate location, their character will perform a chordal rendition of ‘The Power’ during a short cutscene, once more invoking the leitmotif by association.
Diegetic expressions of the leitmotif are not only uttered by the player-character themselves, who is already a metaleptic agent by default (Neitzel, 2007). They can be encountered serendipitously throughout the gameworld under conditions that require nothing more of players than their presence to bear witness. In one such instance, players stumble upon a bartender who idly hums the leitmotif in the course of their duties (see Video 1). In another, the player is audience to a funeral in which the attendees perform a sombre, choral rendition of ‘The Power’ (see Video 2), listed on the soundtrack as ‘The Power – Choral Version’ (Slavov, 2023g). In both scenarios, the music is firmly embedded within the diegesis. Not only is the audio appropriately spatialised so as to clarify its source, but clear visual and dialogic cues serve to reinforce the association. The bartender is captioned with the subtitle ‘*humming*’ to narrate their actions, while the funeral performers move their mouths and bodies sympathetically with the music as their leader proclaims: ‘Sing, sisters! Sing in Umberlee’s name!’.
These metalepses are each of a ‘rhetorical’ nature (Ryan, 2006). They do not suggest an open rupture between narrative levels of which the diegetic characters are aware, but a set of contained interstitial references with differing purposes. In the cases of the lyre, the bard, or the funeral song, the leitmotif’s metalepsis is ‘convergent’ (Ensslin, 2011), intending to deepen the interactional immersion by unifying player input, the diegetic world, and the nondiegetic discourse. The bartender’s humming, by contrast, feels whimsically ‘divergent’ (Ensslin, 2011), offering a brief authorial intrusion which metareferentially acknowledges the artifice and recalls the humorous interjections common to D&D sessions. Considered together, one interpretation of these metalepses is that the leitmotif is in fact native to the fictional world and ‘ascends’ (Wagner, 2002) to the nondiegetic score as a discursive device. John Williams’s score for Close Encounters of the Third Kind (Spielberg, 1978) exemplifies this technique by uplifting an otherwise diegetic five-note motif into the nondiegetic orchestration of the film’s closing sequence (Schneller, 2014: 120-121). More relevant than the ontological status of the leitmotif, however, is its cumulative effect on player perception. Though subtle, it establishes an early and continued softening of the diegetic membrane which sets the stage for more daring metareferential strategies further in.
Musical easter eggs and intertextuality
Baldur’s Gate 3 contains a number of ‘easter eggs’ (Mago, 2019), small novelties or references which developers hide throughout their games to reward inquisitive players. Easter eggs are often metareferential and intertextual. For instance, the dialogue in BG3 jests about the difficulty of organising friends for regular D&D sessions, and further inserts references to a promotional YouTube series in which the main voice-acting cast played D&D as their BG3 characters (High Rollers DnD, 2023). BG3 extends this practice to a handful of musical easter eggs. If players interact with the miscellaneous ‘music box’ item, it interrupts the nondiegetic music to produce a spatialised, diegetic rendition of ‘Down by the River’ as yet another metaleptic expression of the main leitmotif (see Video 3). Mago (2019: 52) would describe this as a ‘metatextual’ easter egg, as it alludes self-referentially to the game’s own construction. Depending on the game edition purchased, players can also perform three pieces of music from Larian Studios’ previous title, Divinity: Original Sin II (Larian Studios, 2017). This instead constitutes an intertextual allusion (Genette, 1997a: 2) which likewise elicits one’s meta-awareness of the gaming medium at large.
The most potent musical easter egg is hidden in BG3’s ‘camp’ environments, which players regularly visit as a respite from their adventure. Camps are underscored by a revolving playlist of pensive, instrumental music, frequented by the tracks ‘I Want to Live – Instrumental Version’ (Slavov, 2023d) and ‘I Want to Live – Classical Version’ (Slavov, 2023c). In certain camps, if players stand in a specific location, the nondiegetic music will shift to a lyrical arrangement of ‘I Want to Live’ featuring a vocal performance by composer Borislav Slavov (see Video 4). Should the player stray back and forth from the location, the vocal layers will then fade in and out of the score accordingly, leaving no doubt that their movements are controlling a dynamic remixing of the nondiegetic music. This playful reactivity clarifies for players that the musical interaction is an intentional easter egg, planted in fact by Slavov himself (Dan Allen Gaming, 2023: 54′00″) who was delighted at the first news of its discovery. Players similarly find joy in unearthing and sharing these secrets with the wider fan community (Dirtmooth, 2023; SpookusDookus, 2023), and so their inclusion opens a direct metacommunicative bridge between the player and composer. Beyond this contribution to the game’s broader metareferential aesthetic, the musical easter eggs also preface a further strategy in BG3’s representation of the D&D experience: the use of sound and music to incentivise player exploration.
Harpy songs, silence, and player exploration
In the early hours of Baldur’s Gate 3, players who explore behind the first main settlement can discover a secluded cliffside path leading down to a riverbank. This small area hosts one of the game’s many optional ‘side-quests’ (Onuczko, 2007) and among its most interesting musical encounters. As players first approach the cliffs, the settlement’s nondiegetic music recedes to reveal faint, diegetic singing in the Dorian mode. Should the player indulge their curiosity and continue down the path, the singing will grow louder as their companions begin to remark on the song’s beauty. Upon reaching the water’s edge, the player finds a young child visibly bewitched by the supernatural music and seemingly intent on wading deeper into the river. By now the choral piece has swelled to its full volume and texture, including a harp, lyre, strings, flute, and light digitised percussion. At last, the trap is sprung, and the player is beset by the song’s malicious source: a hostile flock of ‘harpies’ hoping to lure prey to their demise. The lead-up to this encounter and transition into combat is shown in Video 5.
The track ‘Harpy Song’ (Slavov, 2023b) continues to play as combat begins, but now enters an ambiguously transdiegetic (Jørgensen, 2011) state which Stilwell (2007) describes as a ‘fantastical gap’ between diegetic and nondiegetic music. On one hand, the narrative context, visual interface (see Figure 5), and the harpy’s moving mouth all clarify that the diegetic creature remains the source of the music. On the other, the song is no longer spatialised to the harpy’s position (see Video 6) and seems to have assumed the additional function of nondiegetic combat music, particularly with the introduction of anachronistic digital timbres (starting from 1:07 of ‘Harpy Song’). In either case, the song signifies a serious threat to the player’s party. With each turn that the magical singing persists, there is a chance for players to lose control over their characters, rendering them defenceless to the harpies’ lethal attacks and potentially ending their game.
Figure 5. Cropped screenshots showing how the visual interface conveys whether a harpy is currently singing. Left: the condition ‘Singing’ is shown under the harpy’s health bar. Right: the full description of the ‘Singing’ status when inspected by players.
It is here that a set of musically metaleptic interactions becomes available to the player. The game interface first provides the hint that incapacitating the harpy will disrupt its song (see Figure 5: right). If the player succeeds, the track ‘Harpy Song’ abruptly ends and the music returns to a firmly nondiegetic score comprising the driving percussion and brass-heavy orchestral palette that typically accompanies the combat of fantasy RPGs (Gibbons, 2017). Any other harpy can then begin singing, prompting a transition back to the transdiegetic ‘Harpy Song’, which the player can again disrupt, and so on until all harpies are defeated. Even after combat ends, an extended, nondiegetic arrangement of the motifs from ‘Harpy Song’ persists in the area as idle exploration music, retriggering any time the player returns to the riverbank as a sonic memento of the encounter. The harpy song thus acts as a potent agent of convergent metalepsis, weaving together the narrative world, gameplay systems, and the player’s meta-awareness to engender a holistic, interactional immersion into the D&D experience.
As with the musical easter eggs, however, this encounter plays a further role in BG3’s representation of the D&D experience. Perceptive players, or those well-versed in the mechanics of D&D, may realise that there is a more efficient means of disrupting the harpies’ song than trying to incapacitate them (which relies on the luck of one’s dice rolls over potentially multiple turns). This alternative is a unique spell called ‘Silence’, one of 252 spells available in the game (Baldur’s Gate Wiki, 2024), which creates a sound-proof sphere at a chosen location (see Figure 6). If players cast this silencing sphere over a singing harpy, the bewitching effect will end and the ‘Harpy Song’ is replaced by conventional combat music, just as before (see Video 7). Alternatively, players can cast the silencing sphere over their allies, preventing them from ‘hearing’ the song within the diegetic world and thus saving them from its luring effects. In this second case, the harpy has not stopped singing, and so the player themselves continues to hear the transdiegetic ‘Harpy Song’ (see Video 8).
Figure 6. Cropped screenshot showing the in-game description for the ‘Silence’ spell.
Ensuring intuitive and logically consistent outcomes for these niche musical interactions is crucial to BG3’s design philosophy. Through meta-musical feedback, it signals to players that they will be rewarded for wielding their imagination, ingenuity, and knowledge of D&D systems to devise lateral solutions to game challenges. For many players of Dungeons & Dragons, this sense of open creativity and ‘out-of-the-box thinking’ (Bowman, 2010) is core to the identity of TTRPGs and serves as a key motivator for engaging with the genre (Coe, 2017; Katifori et al., 2022). It also recalls the player archetype aptly labelled as ‘explorers’ (Bartle, 1996), who derive great joy from the ‘discovery’ (Arrasvuori et al., 2010; Martinho and Sousa, 2023) of novel interactions, possibilities, and secrets within game systems and fictional worlds. Although BG3 is bounded inexorably by its programmatic nature, it aspires to evoke the same interactional aesthetics of its tabletop predecessor: a sense of open possibility which at every turn incentivises free exploration and experimentation. The harpy encounter and musical easter eggs thus represent two distinct approaches to the musical representation of this facet of the D&D experience. With the harpies, music acts as both a guide and feedback system, directing the player’s exploratory efforts and clarifying the outcomes of their creative experimentation. With the easter eggs, music is itself the metareferential reward for player inquisitiveness, thereby encouraging the exploratory mindset that for so many characterises D&D.
Retrospective prolepsis in the Song of Balduran
Departing somewhat from the strategies discussed thus far, Baldur’s Gate 3 also deploys metareferential music with a more delimited focus on storytelling alone. In the latter half of the game, players encounter a significant twist within the main plotline. The player’s enigmatic mentor figure, who has guided them since the opening moments of the adventure, is revealed to be ‘The Emperor’, a member of an antagonistic alien species known as ‘mind flayers’. The dialogue sequence that follows is underscored by a nondiegetic composition that players have not yet heard, and which otherwise bears no notable distinction from any previous music in the game (see Video 9). As such, this unassuming music is unlikely to garner any critical attention given the more pressing narrative revelation at hand. Putting this aside, players then progress through the story for several more hours until they reach the titular city of ‘Baldur’s Gate’ and gain access to a luxurious tavern room. Each time the player returns to this room to rest and recover, they hear a lyricised, nondiegetic piece of music entitled ‘Song of Balduran’ (Slavov, 2023j). The song’s lyrics recount the tragic fate of Balduran, the eponymous founder of the city who, like ‘The Emperor’, was transformed into a ‘mind flayer’ and enthralled into the service of the game’s arch villains (see Figure 7). The similarity of these two tales may lead some players to correctly predict a further twist: that The Emperor and Balduran are in fact the same individual. However, given how much time has passed since The Emperor’s introduction, even perceptive first-time players are unlikely to recognise that ‘Song of Balduran’ is also the same composition that first underscored his dramatic reveal (see Video 10).
Figure 7. Melody and lyrics for a narratively significant excerpt of ‘Song of Balduran’ [starting 0:51] (Slavov, 2023j).
By situating an instrumental arrangement of ‘Song of Balduran’ alongside The Emperor’s introduction, the music aims to foreshadow the player’s eventual discovery of his hidden past. This is a common discursive strategy in nondiegetic music which Heldt (2013a) likens to ‘prolepsis’, the ‘narrating or evoking in advance of an event that will take place later’ (Genette, 1980: 40). However, Heldt also notes that conventional musical foreshadowing is always ‘recognisable as foreshadowing…even though one may not yet know what exactly is being foreshadowed’ (Heldt, 2013a: 236). This functional awareness is impossible in The Emperor’s case, as first-time listeners are denied any means of inferring the music’s narrative significance until its lyrics are discovered several hours later. It is only retrospectively, once an association can be drawn between the lyrics, the composition, and its original context, that the ‘proleptic’ (Genette, 1980) nature of the music becomes evident. Heldt (2013a) thus describes this technique as ‘retrospective prolepsis’, a form of narrative anticipation that, perhaps ironically, only reveals itself after the fact.
Unlike the other musical moments discussed herein, the retrospective prolepsis of ‘Song of Balduran’ is not what I have called musically metaleptic because it operates entirely at the nondiegetic level of narration. Instead, the technique’s significance is derived from its metareferential impact on the RPG storytelling experience. Baldur’s Gate 3, like many RPGs, is designed to support multiple playthroughs. Most players who complete the game will develop an intimate familiarity with the lyrics and melody from ‘Song of Balduran’ due to its frequent repetition, which in turn equips them to uncover the retrospective prolepsis on subsequent playthroughs. In this way, the musical technique contributes to renewed narrative appreciation across recurrent experiences with the story, a sentiment often reflected in fan discussion regarding The Emperor’s identity (CaptainChalky, 2023; Littleladman, 2023). Notably, the developers have also incorporated a variety of non-musical hints, references, and easter eggs which allude to this plot twist. Even the animated logo for Larian Studios, the first thing players see upon opening the game, arguably depicts the likeness of Balduran transforming into a mind flayer. Together with the ‘Song of Balduran’, these playful inclusions represent a broader strategy of metareferential communication from the developers to the player, exposing the machinery of BG3’s interactive storytelling to support a deeper engagement therein.
Raphael’s final act
The boldest act of musical metalepsis in Baldur’s Gate 3 is tied to perhaps its most memorable side-character. Raphael is a Faustian devil with a flair for the dramatic, imposing on the player at several points throughout the story to impart knowledge and tender wicked bargains. Raphael’s theatrical disposition is foregrounded from the outset. He introduces himself by reciting a lullaby, confesses to rehearsing his speeches, and entraps others with binding contracts delivered as intrusive songs (see Video 11). Suitably, then, the player’s first encounter with Raphael also introduces his own nondiegetic leitmotif (see Figure 8), a stepwise minor melody rendered on a haunting, gothic organ befitting the character’s style. After progressing to the end of the story, players can choose to burgle Raphael’s infernal home, provoking a climactic confrontation with the devil that serves as one of the game’s most difficult ‘boss battles’. As combat commences, a lone feminine voice performs Raphael’s leitmotif, now complete with threatening lyrics and mixed notably louder than all previous battle music in BG3. The familiar organ then restates the motif before Raphael himself, to the player’s shock and awe, joins the music and begins to tauntingly lyricise about the player’s impending doom (see Video 12).
Figure 8. Raphael’s leitmotif as heard during his first encounter with the player. This excerpt can also be heard in ‘Raphael’s Final Act’ [starting 0:25] (Slavov, 2023i).
It is critical to distinguish this metaleptic act from the earlier treatment of the ‘Harpy Song’ (see section ‘Harpy songs, silence, and player exploration’). There is no mistaking that the vocalist is Raphael given that the lyrics are performed by his voice actor, Andrew Wincott. However, if players experiment following the logic of the harpy encounter, they will quickly discover that the music does not seem to be tethered to the diegetic world. Raphael’s performance cannot be interrupted by using the ‘Silence’ spell, nor even by killing him (see Video 13). His mouth does not move as the harpies’ do, the song does not influence combat, and the in-game characters offer no reaction to it. One interpretation of these discrepancies is that Raphael is not performing within the diegesis, but reaching up into the artifice, addressing the player directly from the extradiegetic level of narration. From this elevated position, he might also be viewed as speaking down to the unwitting diegetic protagonists, or perhaps even hiding there beyond the reaches of their worldly wrath. In any instance, the metaleptic effect of Raphael’s song is palpable. All other metareferential gestures in BG3’s music are passing novelties, hidden curiosities, or otherwise subtle components of a multi-faceted storytelling apparatus. By contrast, ‘Raphael’s Final Act’ (Slavov, 2023i) dominates the attentional space of the encounter, foregrounding metalepsis at a time when narrative tension and immersion are otherwise at their peak. Given the noted anti-illusionistic potentials of metalepsis (Pier, 2014a), the boldness of its musical implementation here represents a significant risk to the player experience.
These risks were not lost on Slavov himself. Speaking on the process of implementing Raphael’s song, he recalls thinking ‘I know it’s crazy…this could easily be hated’ (Dan Allen Gaming, 2023: 36′40″). A subset of players across BG3’s community forums agreed, recounting that Raphael’s singing undercut the tension of the fight, broke their immersion, or was too ‘self-aware’ to take seriously (AmazingObserver, 2024; Darthbamf, 2024; Shinrahunter, 2023). One contributor even extended these anti-illusionistic effects to ‘Song of Balduran’, which they framed derisively as another of the game’s ‘meta songs’ (Meggannn, 2024). However, if one were to judge the success of ‘Raphael’s Final Act’ via its reception in online fan discourse, then the risks have certainly paid off. The majority sentiment on ‘r/BaldursGate3’ is that the metaleptic performance befits Raphael’s theatrical and narcissistic disposition, contributing meaningfully to players’ engagement with the encounter and rendering it a memorable narrative and gameplay experience (Sthenial, 2023; The1Death, 2023; HoutaroOreki, 2024). The song’s overt metareferential qualities also seem to have inspired acts of mythmaking within the fan discourse, exaggerating aspects of the song’s reactivity or diegetic representation as a form of praise. It is common, for instance, to read enamoured assertions that Raphael’s mouth moves with his music (IncrediblySneepy, 2023), that ‘Silencing’ him will cut out his vocal layer (Battletoad93, 2023), or that his companion ‘Korilla’ is linked to the feminine vocal performance in the same manner (BlueEyedDragonGal, 2024). Although these claims are demonstrably false (see Video 13), their invention nonetheless speaks to the value that some players ascribe to metareferential musical interactions in this delimited context.
It is clear how Raphael’s metalepsis could disrupt the experience for players that Pintar (2023) would describe as narratively ‘enrolled-attached’ (Pintar, 2023: 11). These players prefer a serious ‘submersion of self’ into stories rather than the ‘self-conscious performance’ of them (Pintar, 2023: 11), and a major villain joining the soundtrack seems certain to invoke the latter. For those with a favourable view of Raphael’s song, though, how might we account for its success? While BG3 is by no means the first video game in which an antagonist sings to the player, most precedents have either played this for comedic effect, as with Conker’s Bad Fur Day (Rare, 2001), are explicitly music-themed games themselves, as with Rhapsody: A Musical Adventure (Nippon Ichi Software, 1998), or have otherwise relegated the villain’s song to the end credits as a departing novelty, as with Portal (Valve, 2007). In part, any warm reception of Raphael’s performance at this otherwise serious narrative juncture can be attributed to his careful characterisation throughout the story. He is an egocentric dramaturg possessed of interplanar power, lending a measure of internal consistency to his self-indulgent and diegetically transgressive musical number. The conceptual associations between BG3 and D&D also make a critical contribution here, legitimising the metareferential moment via its alignment with the expected tone of TTRPG sessions. As one fan articulates:
One of the things I like best about the game is the subtle reminders that all of this is simulating a tabletop game. So to [sic] they are able to pull off over-the-top scenes like this because humor and high drama go hand-in-hand in tabletop games. We laugh at things like this but also embrace them because they are so on-brand…In another situation, it might have been corny, but the ridiculously good music & voice work really sold this. (Khemeher, 2023)
Finally, and working in concert with these elements, is the integral groundwork laid by the many meta-musical strategies that players have encountered in BG3 up until this point. Together, they establish the metacommunicative musical ‘literacy’ (Van Elferen, 2016) by which players can interpret ‘Raphael’s Final Act’, but also serve as vital supportive strands in the aesthetic connection between BG3 and the D&D experience that crucially contextualises the encounter at large.
Discussion
The metareferential music of Baldur’s Gate 3 can be summarised as serving three goals: (1) to evoke the metacommunicative experience of playing Dungeons & Dragons; (2) to facilitate the sense of open exploration and creative experimentation that characterises the TTRPG genre; and (3) to achieve these goals whilst still upholding the conventional responsibility of RPG music to support the player’s narrative and interactional immersion (see section ‘Music and immersion in RPGs and TTRPGs’). On the whole, I view the music of BG3 as succeeding in these domains. The interweaving diegesis of the main leitmotif and the retrospective prolepsis of ‘Song of Balduran’ invoke a subtle metacommunicative awareness of the storytelling and gameplaying activities that serves only to deepen the connection between fiction, action, and artifice. The musical easter eggs and the metaleptic reactivity of the ‘Harpy Song’ act as a sonic reward and compass respectively, incentivising player experimentation while foregrounding the metacommunicative influence of the developers as proxy ‘dungeon masters’ working from above to enrich the interactional experience. All the while, the subtle and ‘rhetorical’ (Ryan, 2006) implementation of these ‘convergent’ (Ensslin, 2011) strategies minimises the risk of disrupting the fictive experience, and the scarce few ‘divergent’ (Ensslin, 2011) techniques – such as the musical easter eggs – are clearly demarcated as playfully hidden metacommunications from developer to player.
The bolder approach taken with Raphael’s metaleptic performance sits in stark contrast, inviting a more explicit risk of anti-illusionistic disruption (Pier, 2014a) as borne out by its divided reception in the online fan discourse. Despite this, it appears that the encounter’s close alignment to the archetypal Dungeons & Dragons experience, its careful musical and narrative preparation, and perhaps its sheer novelty are together enough to bridge the diegetic rupture for a majority of players and cement ‘Raphael’s Final Act’ as perhaps the most memorable musical moment in the game. Still, for those in the apparent minority, the severity of the song’s disruptive consequences must not be overlooked. So distasteful was the overt metalepsis to a select few players that they felt compelled to halt gameplay and mute the music outright (AmazingObserver, 2024) – an experience shared anecdotally by a colleague of mine. How else, then, might ‘Raphael’s Final Act’ have been conceived to assuage these disruptive effects whilst preserving the bold metaleptic qualities that render it so memorable? Would this even be possible for a seriously enrolled, narratively attached player (Pintar, 2023: 11)? Though the outcomes are surely unpredictable, I wonder at how the reception of Raphael’s performance might have differed were it implemented consistently with the transdiegetic manner of the Harpy Song: that is, if Raphael’s mouth did in fact move with his vocal performance, and the player could indeed silence his singing as some fans alleged (Battletoad93, 2023). Would this diegetic grounding support the song’s assimilation into a unified fictive experience, or serve only to widen the discursive fissure? And for those with favourable views of the song, do the myths alleging this more reactive implementation suggest that it would be preferred over the current approach, or do they simply reflect the ease with which novel misinformation spreads in online communities?
The difficulty of answering these questions points to both a gap in ludomusical scholarship and a limitation of this present research: a lack of formal studies evaluating how metareferential music shapes the player’s narrative and gameplay experiences. In particular, there is a vacancy for qualitative user studies comparing meta-musical perceptions across diverse genre and gameplay contexts. The video game Portal (Valve, 2007), like Baldur’s Gate 3, features both a metareferential villain song similarly praised for its memorability (Good Game, 2007) and several metaleptic incursions of this music via diegetic radios placed throughout the game (Ivănescu, 2024). Portal is distinguished from BG3, however, by its contemporary sci-fi setting, its dry and sarcastic comedic tone, its game design as a first-person puzzle-platformer, its lack of a foregrounded intermedial referent (i.e. Dungeons & Dragons), and the confinement of its primary meta-musical gesture to the post-game credits as a supplementary ‘peritext’ (Genette, 1997b). Each of these distinctions seems capable of influencing how players perceive metareferential music in games, and so comparative and qualitative accountings are needed to better chart their interrelationship as it concerns the immersive experience. This can in turn support richer analyses of phenomena such as the divided reception of Raphael’s song while also serving to guide the future implementation of meta-musical strategies in story-driven games.
Vital to any such efforts will be a critical focus on the player’s interaction with metareferential music. Games are ‘ergodic texts’ (Aarseth, 1997) requiring non-trivial effort to traverse and reconfigure, a description which encompasses video games (Aarseth and Calleja, 2015) and TTRPGs (White et al., 2018) alike. There is no meta-musical moment in BG3 that can be passively experienced. Their precise manifestations, and by extension the player’s perception of them, are always the outcome of active participation in the text which dynamically determines their structure and sequence. For this reason, Van Elferen (2016) positions ‘musical interaction’ as one of three key phenomena in her analytical model of game musical immersion, noting aptly that ‘playing games, quite simply, equals interacting with music’ (Van Elferen, 2016: 39). Likewise, Oliva (2019) has proposed ‘ergodic musicking’ as a new paradigm for ludomusicological inquiry that foregrounds musical participation in games over a disproportionate focus on their musical contents.
These frames are critical, as the TTRPG experiences that BG3 seeks to emulate are themselves fundamentally interactional. They emerge from either the heightened metacommunicative awareness that participation in D&D elicits, or from the acts of open creativity and experimentation that its systems incentivise. As such, the meta-musical strategies which support these intermedial associations must be similarly framed if we are to understand their functions and evaluate their success. This is why the present discussion of BG3’s meta-music has addressed not only the musical content of each moment, nor only its implementation by developers, but offered an accounting of the interactional experience as supported by recorded gameplay and ethnographical engagement with fan discourse. My hope is that this case study of BG3 exemplifies the value of these methods, and of Van Elferen (2016) and Oliva’s (2019) critical perspectives, to the ongoing study of video game music.
Before concluding, I would highlight some exclusions made for the purpose of focussing discussion, but which are nonetheless worthy of future examination. Most significant is that I have examined only the single-player experience of Baldur’s Gate 3, when in fact the entirety of the game can be completed by up to four cooperating players. Each of the meta-musical moments explored herein can thus be encountered and reconfigured by multiple players simultaneously. It is unclear how this shared context might influence perceptions of musical metareferences or their effects on players’ immersion, but Cheng’s (2014) ethnographical study of social musical interactions in Massively Multiplayer Online (MMO) games provides at least one model to guide future inquiry. BG3 also features novel approaches to storytelling through non-metareferential music. For instance, if players encourage the grieving bard Alfira to finish composing her song, they are treated to a cinematic performance of ‘Weeping Dawn’ (Slavov, 2023k) in its entirety. The audiovisual relationship is temporarily inverted in what Altman (1987) would describe as ‘supradiegesis’ (Heldt, 2013b), foregrounding the suddenly orchestrated music performance while relegating visuals to the role of supportive cinematography in the manner of a quasi ‘music video’ (see Video 14). Alongside the primary discussion herein, these elements demonstrate the richness of story-driven RPGs as objects of musicological inquiry and suggest a deserving place for Baldur’s Gate 3 within the canon of game music studies.
Conclusion
This article presents one of the first detailed investigations of how metareferential music in video games can support the narrative and gameplay experience. Taking Baldur’s Gate 3 (Larian Studios, 2023a) as a case study, it explores how traditionally anti-illusionistic techniques such as musical metalepsis can strengthen the connection between fictive worlds and gameplay systems to deepen participatory immersion. It also highlights the tangible risks yet posed to the player’s storytelling experience, which are dependent on the specific narrative context and interactive mechanics by which these techniques are applied. Foremost, however, this article elucidates how each of BG3’s diverse meta-musical strategies contributes to its broader design goal of adapting the Dungeons & Dragons (Wizards of the Coast, 2014) experience for the video game medium. Taken together, they elicit the familiar meta-awareness that characterises D&D roleplaying while also employing meta-musical hints and rewards to inspire the core D&D sensation of free experimentation and open possibility.
In these ways, Baldur’s Gate 3 serves as a potent case for how video game music can support adaptations of intermedial source texts beyond simply importing their existing musical content. Likewise, the present analysis of BG3’s music exemplifies the need for critical perspectives which foreground musical interaction over mere musical content or implementation (Oliva, 2019; Van Elferen, 2016). Without such frameworks, the function of meta-musical strategies in games cannot be wholly apprehended, nor their success meaningfully evaluated. With this in mind, the questions left open by this article also reveal a need for formal studies to evaluate how player perceptions of metareferential music differ between contrasting games. Deeper qualitative and comparative accountings would help to chart the roles of genre, setting, gameplay design, audio implementation, and more in the interpretation of meta-musical gestures, and so guide more informed analyses of phenomena such as the divided reception to Raphael’s bold metaleptic performance. For now, at least, Baldur’s Gate 3 casts a light on the underexamined meta-musical potentials of interactive media and offers new understandings of the relationship between music and storytelling.
Supplemental Material
Supplemental Material - Meta-music and the sonic storytelling of Baldur’s Gate 3
Supplemental Material for Meta-music and the sonic storytelling of Baldur’s Gate 3 by Thomas Studley in Convergence.
