Abstract
Although it is frequently used and is highly valued in practice, background music in non-fictional media formats has shown a broad spectrum of ambiguous results in previous empirical research. Scholars have often even advised against the use of music in formats such as television news, news magazines, and documentaries. Discrepancies in the effectiveness of background music have also been found in film and advertising research. In these research areas, the congruence between music and medium has been shown to be especially relevant for predicting music’s effects. In this study, two experiments were conducted to investigate the influence of congruent and incongruent music in non-fictional media formats. The first experiment (N = 92) focused on music’s expressed and induced emotions, recipients’ memory performance, and the perceived credibility and general evaluation of the media format. Experiment 2 (N = 147) concentrated on attitude changes. As expected, carefully selected congruent background music (i.e., music expressing emotions and triggering associations fitting the media format’s topic) positively influenced recipients’ emotionalization, memory performance, and attitude change, as well as the perceived credibility and general evaluation of the media format. All of the measured effects can be considered medium or large (
Keywords
Background music has become an increasingly essential part of non-fictional media formats (e.g., Alencar & Kruikemeier, 2018; Moormann, 2010). In media formats such as television news, news magazines, and documentaries, background music is frequently used (Grabe, Zhou, & Barnett, 2001; Leidenberger, 2015; Rogers, 2015) and is considered to have specific functions. For example, music, as an element of “infotainment” (Alencar & Kruikemeier, 2018; Brants & Neijens, 1998), can underscore or clarify the drama of the pictures, emphasize and illustrate the protagonists’ emotions, and thereby enhance the audience’s entertainment (Corner, 2002; Wegener, 2002). In contrast, from a scientific viewpoint, the use of background music in non-fictional media formats could be considered unnecessary or even problematic. In empirical studies focusing on these formats, background music has been shown to have no effect or very inconsistent effects (for example, regarding recipients’ memory performance) compared with no music (e.g., Boeckmann, Nessmann, Petermandl, & Stückler, 1990; Dillman Carpentier, 2010; Kopiez, Platz, & Wolf, 2013). Scholars have often even advised against the use of background music in non-fictional media formats (Brosius, 1990; Kopiez et al., 2013; Schmidt, 1976), a suggestion that contrasts not only with music’s widespread use in practice but also with its empirically tested positive effects in other audio-visual media contexts such as film and advertising (Cohen, 2010; Lipscomb & Tolchinsky, 2005; Shevy & Hung, 2013). Studies on the use of background music in film and advertising have shown that music must be composed or selected and edited in a reflective and professional manner, considering specific factors influencing its effective use (e.g., Breves, Herget, & Schramm, 2020; Herget, Schramm, & Breves, 2018; Tan, 2017).
The congruence between music and the media format (i.e., music expressing emotions and triggering associations fitting the media format’s topic) is a particularly important factor (e.g., Boltz, 2001; North, Mackenzie, Law, & Hargreaves, 2004). However, to date, there has been no research investigating influencing factors for music in non-fictional media formats (Döben, 1993; Kallinen & Ravaja, 2004; Ronneberger, 1979; Rossmann & Rossmann, 2018). As Dillman Carpentier (2010) argued, “when it comes to the strategic use of music to enhance news, there is little published research to serve as guidance” (p. 64). The aim of this study was to conduct two experiments to test whether carefully selected and edited background music can have positive effects in non-fictional media formats (if certain influencing factors are considered).
The alleged ineffectiveness of music in non-fictional media formats
Especially in early research on the effects of music in instructional films and news magazines, the results of empirical studies were very ambiguous. In fact, background music was often found not to have any effect at all (Boeckmann et al., 1990; Brosius, 1990; Kopiez et al., 2013; Schmidt, 1976; Thayer & Levenson, 1983; Wakshlag, Reitz, & Zillmann, 1982) or to have negative effects on the recall of verbally and visually conveyed information, potential attitude changes, and the film’s evaluation, compared with no music (Boeckmann et al., 1990; Brosius, 1990). In some cases, researchers have also identified effects that could not be explained (Schmidt, 1976; Schwartz, 1970)—for example, negative effects of emotionally positively connoted background music in a neutral documentary (Schmidt, 1976).
Only a few early research attempts have found music to have marginally positive influences (Brosius, 1990; Wakshlag et al., 1982). In a much-quoted study, Brosius (1990) tested the influence of music in two different educational films using two different music versions (the original music and an alternative version with music that was more congruent with the film content). Overall, his results indicated that music has no effect in educational films. However, for one of the two films, he found the significant result that the participants’ interest improved with congruent background music. Unfortunately, simultaneously, their recall of verbally conveyed information declined—a very undesirable effect in a media format that aims to convey information.
Kopiez et al. (2013) replicated Brosius’ (1990) study by investigating the influence of four versions of background music with different emotional connotations in a news magazine report. Because these researchers did not find any influence of the different versions of background music, they came to the following conclusion: “To conclude with a critical remark, at least in the genre of television news magazines, the application of background music is an ineffective use of resources and an unnecessary ingredient with no positive effects on the recipients” (Kopiez et al., 2013, p. 328). Brosius (1990) and Schmidt (1976) had already judged the issue similarly or even more negatively. Considering these early research attempts and their replications, the increasing use and growing importance of music in non-fictional media formats seem misguided.
The importance of differentiated considerations of music factors and media contexts
Can music have positive effects in non-fictional media formats?
In recent studies, background music has been shown to have more consistent positive effects in non-fictional media formats (Arriaga, Esteves, & Feddes, 2014; Dillman Carpentier, 2010; Kallinen & Ravaja, 2004; Nosal, Keenan, Hastings, & Gneezy, 2016; Rossmann & Rossmann, 2018). Studies that report more positive effects for music in non-fictional media formats have two similarities. First, there is a shift away from simple stimulus–response models. Music is considered in all its complexity and specific musical parameters are taken into account (e.g., tempo in Rossmann & Rossmann, 2018, and rising vs. falling melody lines in Kallinen & Ravaja, 2004). Second, interactions between music and the media context in which it is used are carefully considered (e.g., Dillman Carpentier, 2010). A specific form of interaction, the congruence of background music and the media format, seems to play an especially important role in non-fictional media formats (Arriaga et al., 2014; Nosal et al., 2016).
Congruence of music and media context: General implications
The importance of the congruence of background music and the media format is already known from research on music in film and advertising (Cohen, 2010; Lipscomb & Tolchinsky, 2005; Shevy & Hung, 2013). Particularly in advertising research, the concept of musical fit—a congruence of music and commercial—has been established as a relevant factor influencing the effective, strategic use of music (for reviews, see Allan, 2007; Oakes, 2007). Only music selected to match the commercial’s content and message shows positive effects in terms of spot evaluation, participants’ learning performance, and changing attitudes in the form of, for example, purchase intentions (e.g., Galan, 2009; MacInnis & Park, 1991; North et al., 2004).
Film music research provides a systematic categorization of three particularly important dimensions of congruence. Music can establish affective analogies (parallelism, ambivalence, or divergence in emotional content) and associative analogies (e.g., an allusion to a historical time or a distant place) with a film. Structural analogies become apparent through synchronicity or asynchronicity on a temporal or spatial level. Accordingly, the emotional, associative, and structural potential of music is highly dependent on music’s congruence with specific film dimensions (e.g., Bullerjahn, 2001; Cohen, 2005).
Congruence of music and media context in non-fictional media formats
It is plausible to assume that these aspects also play an important role for non-fictional media formats (Arriaga et al., 2014; Nosal et al., 2016). The specific music manipulations and context considerations in the more recent studies on music in non-fictional media formats can be seen as unintentional manipulations of music’s congruence. For example, music’s emotional content is often determined using a combination of specific musical parameters (e.g., Bruner, 1990; Gabrielsson & Lindstroem, 2010), such as the music’s tempo (manipulated in Rossmann & Rossmann, 2018; Wakshlag et al., 1982) and the direction of melody lines (manipulated in Kallinen & Ravaja, 2004). A manipulation of these parameters could have led to a manipulation of the affective analogies of the music and media format—explaining the reported positive effects of specific versions of fitting music.
The idea of the congruence of music is not groundbreaking or new. Some of the studies that judged the use of music in non-fictional media formats to be largely ineffective or negative already considered the congruence of music and content, at least in terms of affective congruence (Boeckmann et al., 1990; Brosius, 1990; Kopiez et al., 2013; Rossmann & Rossmann, 2018; Schmidt, 1976; Schwartz, 1970). These studies have provided valuable insights and have created a good foundation for investigating which elements should be given more attention to ensure an intuitively perceived congruence of music and the media format. Using background music in a media format so that it is actually perceived as intuitively fitting is not easy. Therefore, a careful manipulation check is of high importance. In the experiments described earlier, participants’ perceived fit between music and content was measured as part of a general, unspecific scale for music evaluation (Brosius, 1990; Kopiez et al., 2013; Rossmann & Rossmann, 2018)—resulting in a measurement that was potentially not as precise as necessary. As a first step to make positive effects of background music in non-fictional media formats more likely, in this study, the congruence of music and content was carefully and skillfully manipulated and subsequently assessed.
How can congruent music have positive effects in non-fictional media formats?
Ensuring the emotionalizing potential of music
Music’s congruence is not the only important factor to consider. In the context of non-fictional media formats, music’s potential to emotionalize is often classified as its most important function (Corner, 2002; Leidenberger, 2015; Rotha, 1963). Surprisingly, however, in previous work, the emotional effect of music either was not measured at all (e.g., Boeckmann et al., 1990; Rossmann & Rossmann, 2018) or was measured insufficiently (Kopiez et al., 2013), especially in studies judging music to be ineffective in non-fictional media formats. For example, as background music, Kopiez et al. (2013) used four pieces of classical music with different emotional connotations that had previously been objectively rated on their expressed emotions by Kreutz, Ott, Teichmann, Osawa, and Vaitl (2008). Although there is often a positive relationship between music’s expressed and induced emotions (i.e., a musical stimulus that can be perceived as sad actually makes listeners feel sad; Gabrielsson, 2002; Herget, 2020), previous research has emphasized the importance of measuring both types of emotions (i.e., “locus of emotion,” Eerola & Vuoskoski, 2013; Evans & Schubert, 2008, p. 75). It is possible that Kopiez et al.’s (2013) finding of no effect for music can be attributed to a failure to induce the relevant emotions (their participants may have been irritated by the unusual use of classical music in news magazines) rather than to the general ineffectiveness of music assumed by these scholars (see also Schmidt, 1976; Schwartz, 1970). As a second step to make positive effects of background music in non-fictional media formats more likely, music’s expressed and especially induced emotions should be carefully measured.
Can a positive relationship between music’s expressed and induced emotions be assumed for documentaries? Thayer and Levenson (1983) showed a positive relation between expressed and induced emotions for congruent, stressful music in a stressful instructional film about industrial safety, which evoked higher levels of skin conductance than did calming music. Grabe, Zhou, Lang, and Bolls (2000) investigated music as one feature of tabloid news magazines and found a positive effect of music on participants’ arousal. Based on this previous research, we hypothesized that background music that is congruent with the film content in non-fictional media formats communicates stronger expressed emotions (H1) and elicits stronger induced emotions (H2) that are congruent with the film content, compared with no music or incongruent music.
Ensuring a positive effect of congruent, emotionalizing music on credibility
For media formats that convey information, credibility is particularly important as an image dimension and impact filter (Appelman & Sundar, 2016; Carter & Greenberg, 1965; Gaziano & McGrath, 1986). A stimulus or stimulus message perceived as realistic and credible encourages participants to process the information more deeply, which, in turn, leads to better memory performance (e.g., Atkins, 1983; Huston et al., 1995; Pouliot & Cowen, 2007; Slater & Rouner, 1996). When it comes to background music, the deliberate use of an emotionalizing, and thus subjective, feature in media formats that are meant to be realistic and objective representations of “real” events and persons, places, or topics (Have, 2010; Moormann, 2010; Rogers, 2015) might lead to the format being perceived as less credible (e.g., Grabe et al., 2000; Schultheiss & Jenzowsky, 2000). When specifically asked about the use of background music in documentaries, the dominant attitude of regular viewers has been found to be that music is manipulative and “coloring reality” (Have, 2010, p. 51).
However, because background music often goes unnoticed and recipients are, therefore, normally not consciously aware of its presence and effects (Strobin, Hunt, Spencer, & Hunt, 2015; Thompson, Russo, & Sinclair, 1994), it is possible to increase a media format’s credibility and its message credibility using a specific type of music. Psychological research on emotions has demonstrated that intense emotional experiences signal to individuals that something is “real, psychologically real, or of real importance” (Konijn, Walma van der Molen, & van Nes, 2009, p. 334; see also Mayne, 1999; Oatley, 1999), which Ellis and Simons (2005) supported in their study on the emotionalizing effects of music in films. In some of the previously described studies reporting negative effects of music, the ill-considered use of music that was incongruent with the non-fictional media format may explain the finding that music reduced perceptions of the credibility of the media format’s message (e.g., Boeckmann et al., 1990; Schmidt, 1976). As a third step to increase the likelihood that background music in non-fictional media formats will have positive effects, congruent music that does not impair the media format’s (carefully assessed) perceived credibility should be used.
When not only the simple presence of emotionalizing music, but also its congruence with the context, is considered, advertising research has shown that positive effects of music can be found regarding, for example, the credibility of radio spokespersons (Martín-Santana et al., 2015). Therefore, we hypothesized that non-fictional media formats with emotionalizing, congruent music are perceived as more credible, compared with how these formats are perceived with no music or with incongruent music (H3).
Interactions of music and the topic of the non-fictional media format
As a last step to make positive effects of background music in non-fictional media formats more likely, some characteristics of the media format’s topic should be considered. In one of the first studies on this theme, Schmidt (1976) set three preconditions: For a topic to be influenced especially strongly by background music, it should (a) be an ideal projection surface for high emotionalization, and it should (b) challenge viewers to develop a specific attitude while simultaneously (c) being considered ambivalently enough in public opinion that, theoretically, both positive and negative attitudes on the issue could be established (see also Lipscomb & Tolchinsky, 2005; Tan, Spackman, & Bezdek, 2007).
The influence of emotionalizing, congruent music on memory performance, film evaluation, and attitude change
For as long as music in non-fictional media has been investigated, studies have focused on its influences on the recipients’ film evaluation, memory performance, and potential attitude change. These studies have usually found negative effects (e.g., Boeckmann et al., 1990; Brosius, 1990; Kopiez et al., 2013). How does emotionalizing, congruent music that does not impair media credibility influence these variables?
Memory performance and film evaluation
In general, higher levels of emotion can increase participants’ interest in a media format, as well as their memory performance (e.g., Grabe et al., 2000; LaMarre & Landreville, 2009; Nabi, 2003; Vettehen, Beentjes, Nuijten, & Peeters, 2011). Film music research has shown that, through the affective, associative, or structural congruence of music and the media format, the audience’s attention can be directed or focused on specific elements (Boltz, 2001). Schemata activated by the semantic potential of music (i.e., supraindividual associations triggered by music) can lead to improved processing and better memory of this music-congruent information (Boltz, 2004; Boltz, Schulkind, & Kantra, 1991). This better memory performance may also be explained by the concept of fluency (i.e., the ease with which information can be processed and stored). High fluency often results in a generally positive feeling, which can lead to a positive evaluation of a media format (e.g., Liebers, Breves, Schallhorn, & Schramm, 2019; Reber, Schwarz, & Winkielman, 2004), an idea frequently used to explain positive commercial evaluations as a result of using congruent music in advertising (e.g., Galan, 2009; Lavack, Thakor, & Bottausci, 2008; North et al., 2004; Shen & Chen, 2006). Combining these findings from research on film music and advertising, we hypothesized that non-fictional media formats with emotionalizing, congruent music enhance participants’ memory performance (H4) and that non-fictional media formats are evaluated more positively (H5), compared with those without music or with incongruent music.
Attitude change
In the context of persuasive communication, affect can also shape the valence and importance of perceived information and can, therefore, influence evaluations or changes in attitude (e.g., Petty, DeSteno, & Rucker, 2001; Storbeck & Clore, 2008). In line with the affect-as-information framework, Shevy (2007) argued that in the context of film, music’s affective content plays an important role in shaping a film audience’s perceptions. When the music’s expressed emotions are projected onto the film’s topic or protagonist, this projection triggers, depending on the music’s positive or negative valence, a positive or negative change in the audience’s attitudes (e.g., Brosius & Kepplinger, 1991; Tan, Spackman, & Wakefield, 2017). In an ambiguous short film about the care of older relatives, Costabile and Terman (2013) showed that both positive and negative music could influence whether this responsibility was perceived as a positive function or as a burden. Nosal et al. (2016) set a silent film about sharks to both ominous and uplifting background music. Compared with the effect of the uplifting music, the ominous music resulted in worse attitudes toward sharks and lower motivation to support nature conservation projects protecting sharks. Therefore, we hypothesized that music expressing positive or negative emotions in non-fictional media formats influences participants’ attitudes positively or negatively by reinforcing (congruent music) or diminishing (incongruent music) the media format’s message (H6). However, the potential of background music to influence a recipient affectively or cognitively should not be overestimated: If a recipient has already established a firm position on a topic, this attitude cannot be manipulated through the use of a specific type of music in a media format (H7; Bullerjahn, 2006; Have, 2010).
Experiment 1
Study design and participants
In a between-subjects design, we carefully manipulated the congruence (congruent vs. incongruent, control group without music) of the background music used in a documentary excerpt. Following the most recent study in this context, conducted by Rossmann and Rossmann (2018), we used an effect size of f = 0.37 as the basis of our sample size calculation (Platz, Kopiez, & Lehmann, 2012). G*Power (Faul, Erdfelder, Lang, & Buchner, 2007) indicated a required sample size of at least N = 75 (ANOVA: Fixed effects, omnibus, one-way, α = .05, 1−β = .80, number of groups: 3). In total, 92 participants (70% female, age: M = 21.26, SD = 2.02) were recruited for a laboratory experiment and randomly assigned to either a control group without music or one of the two music conditions. Half of the participants were students who received course credits for taking part in the study; the other half were people randomly encountered on campus and asked to participate in the study. To make the study’s true purpose less obvious, the participants were told that the study focused on the general effects of documentaries.
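The a priori calculation described above can be approximated outside G*Power. The following is a minimal sketch and not the authors' procedure. It assumes the standard noncentral F formulation for a one-way fixed-effects ANOVA with noncentrality λ = f²N, uses only the Python standard library, and steps the total sample size in multiples of the number of groups (an assumption made here to keep the groups balanced). All function names are hypothetical, and small deviations from G*Power's output are possible.

```python
import math


def _beta_cf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Each iteration applies the even and then the odd term of the fraction.
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def noncentral_f_cdf(x, df1, df2, lam):
    """Noncentral F CDF as a Poisson-weighted mixture of beta CDFs."""
    y = df1 * x / (df1 * x + df2)
    weight = math.exp(-lam / 2.0)  # Poisson probability for j = 0
    total = 0.0
    for j in range(1000):
        total += weight * reg_inc_beta(df1 / 2.0 + j, df2 / 2.0, y)
        weight *= (lam / 2.0) / (j + 1)
        if j > lam / 2.0 and weight < 1e-15:
            break
    return total


def critical_f(alpha, df1, df2):
    """Upper-alpha critical value of the central F distribution (bisection)."""
    lo, hi = 0.0, 1e4
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if noncentral_f_cdf(mid, df1, df2, 0.0) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def anova_power(f, n_total, groups, alpha=0.05):
    """Power of the omnibus F test for Cohen's f and a total sample size."""
    df1, df2 = groups - 1, n_total - groups
    lam = f ** 2 * n_total  # noncentrality parameter
    return 1.0 - noncentral_f_cdf(critical_f(alpha, df1, df2), df1, df2, lam)


def required_n(f, groups, alpha=0.05, target_power=0.80):
    """Smallest balanced total N (multiple of `groups`) reaching the target power."""
    n = 2 * groups
    while anova_power(f, n, groups, alpha) < target_power:
        n += groups
    return n
```

Calling `required_n(0.37, 3)` mirrors the settings reported for Experiment 1 (three groups, α = .05, 1−β = .80); the result can be compared against the N = 75 indicated by G*Power.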
Selection and construction of materials
Non-fictional media stimulus
The first experiment’s media stimulus was a 7-min excerpt from a documentary by a public broadcaster about Chernobyl today. The central theme of this documentary is the necessity of building a new protective sarcophagus for the deteriorating buildings in Chernobyl to prevent re-contamination with nuclear radiation. Because of flashbacks to the nuclear disaster in 1986 and the threat of new radioactive contamination, the stimulus is depressing and causes discomfort and anxiety. Thus, the excerpt met Schmidt’s (1976) first precondition of being an ideal projection surface for high music-induced emotionalization. However, the goal of the excerpt was more information transfer than ambitious opinion-forming (Corner, 2008), making it well suited for investigating the effects of music on credibility, evaluation, and memory performance, but not on attitude change. Therefore, we decided to test music’s potential to change participants’ attitudes (H6, H7) in a subsequent experiment. Because music was used in some sequences of the original documentary, we deleted the original audio track in these sequences completely, had the spoken off-screen commentary re-recorded by a professional news anchor, and then added new background sounds for a realistic sound atmosphere. This was especially critical for the stimulus condition without music because periods of total silence are unpleasant for audiences and unrealistic for audio-visual stimuli (Chion, 1994; Herget, 2021).
Music stimuli
We used online music libraries to select instrumental music stimuli that were affectively and associatively congruent or incongruent with the documentary excerpt. A music version expressing fear and tension and triggering typical “human touch” associations (genre: film music) was identified as congruent with the central theme of the documentary excerpt. The choice of the incongruent music carried a risk: When facing violations of external realism or credibility, viewers tend to evaluate a media format’s realism critically and negatively and to reject subsequent engagement (Busselle & Bilandzic, 2008; Dillman Carpentier, 2010). For music that was incongruent but not unrealistic, we chose music communicating high arousal, power, and anger, which was associatively well suited to the visually displayed massive undertaking of building a new sarcophagus on the large construction site (genre: rock or dubstep). Thus, we created a fleeting congruence with a secondary aspect of the media stimulus but not with the actual focus of the documentary. To further strengthen the impression of realism, both music versions were structurally congruent and professionally edited onto the stimulus condition without music. All stimulus versions are available from the first author upon request.
Measures
As a manipulation check, the congruence of the music and the documentary content was evaluated using four items (e.g., “Regardless of how much I liked or disliked the music, it did seem appropriate for this documentary”; based on Kellaris, Cox, & Cox, 1993; M = 3.49, SD = 1.01, α = .85). The expressed and induced emotions were measured on four emotional dimensions, each of which was measured with three items of the M-DAS (a modified version of the Differential Affect Scale for capturing emotions in media usage; Renaud & Unz, 2006; see Table 1).
Table 1. Inter-Item Reliability of the Four M-DAS (Modified Version of the Differential Affect Scale) Dimensions Used.
Note: The participants indicated music’s expressed and induced emotions on five-point Likert-type scales (1 = do not agree, 5 = completely agree). α = Cronbach’s alpha.
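For readers unfamiliar with the reliability coefficient reported here, the following is a minimal sketch of how Cronbach's alpha is commonly computed from item-level ratings. It is not the authors' analysis script. The function name and the input layout (one list of participant ratings per item) are assumptions made for illustration, and the example data in the usage note are invented.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: one list of participant ratings per item, all of equal length
    (hypothetical layout chosen for this sketch).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    n = len(item_scores[0])
    if any(len(item) != n for item in item_scores):
        raise ValueError("every item needs one rating per participant")
    # Sum score of each participant across all items.
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    # Sum of the item variances, compared with the variance of the sum score.
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))
```

With two invented items rated by four participants, `cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])` evaluates the formula k/(k−1) · (1 − Σ item variances / variance of the sum score).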
Participants’ memory performance was estimated by asking them to judge whether or not 11 specific statements had been mentioned in the documentary (i.e., aided recall, in line with Kopiez et al., 2013). For example, the participants were asked to indicate whether the following statement was part of the documentary: “Every day, more than 5000 workers commute to what is probably Europe’s most dangerous construction site.” Correct responses were added and incorrect responses subtracted, resulting in a score ranging from a minimum of –11 to a maximum of +11 (M = 2.39, SD = 4.31). Also, as suggested by Kopiez et al. (2013), the general evaluation of the documentary excerpt was measured using three items (e.g., “exciting”; M = 4.06, SD = 0.85, α = .88), and its message credibility was measured using five items (as in Schweiger, 1999, for example, “trustworthy”; M = 4.05, SD = 0.72, α = .89). Participants responded to these items on five-point Likert-type scales (1 = do not agree, 5 = completely agree).
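The scoring rule for aided recall described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' scoring script: it assumes that every participant judged each statement as either mentioned or not mentioned (no skipped items), and the function name and the boolean encoding are hypothetical.

```python
def recall_score(responses, answer_key):
    """Aided-recall score: +1 per correct judgment, -1 per incorrect judgment.

    responses and answer_key are equal-length sequences of booleans
    (True = "was mentioned in the documentary"); hypothetical encoding.
    """
    if len(responses) != len(answer_key):
        raise ValueError("one response per statement is required")
    return sum(1 if given == truth else -1
               for given, truth in zip(responses, answer_key))
```

With 11 statements, the score runs from –11 (all judgments wrong) to +11 (all judgments correct); a participant with seven correct and four incorrect judgments would receive 7 − 4 = 3.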
Results and discussion
Manipulation check
Depending on the background music’s congruence, participants perceived the music and documentary as significantly more fitting in the congruent condition than the incongruent condition, incongruent: M = 2.68, SD = 0.76; congruent: M = 4.25, SD = 0.49; F(1, 60) = 95.45, p < .001,

Figure 1. Perceived Congruence of the Music and the Documentary in Experiment 1.
Expressed emotions
The music’s expressed emotions (H1) were also in alignment with our expectations (see Figure 2). The most obvious difference between the music versions was their potential to communicate fear or anger. Congruent with the frightening nature of the documentary, as expected, the congruent music condition expressed significantly higher levels of fear, F(1, 60) = 38.70, p < .001,

Figure 2. Expressed and Induced Emotions in Experiment 1.
Induced emotions
Based on the positive connection between expressed and induced emotions, we expected that fitting music would induce emotions congruent with the documentary more strongly than would the conditions of no music or incongruent music (H2). As illustrated on the right side of Figure 2, this hypothesis is not rejected. Exposure to congruent music made the participants experience the highest levels of fear and of being moved, along with low levels of both anger and satisfaction. Although the experimental conditions differed significantly, the effect sizes were distinctly smaller for induced emotions than for expressed emotions. In accordance with this finding, in repeated planned contrasts, the ideal result of significant differences between incongruent music and no music and between no music and congruent music could only be observed for the feeling of being moved (see Table 2). Previous studies have found a generally weaker effect in the emotions induced by music, compared with the emotions expressed by music, so this finding can be considered consistent with past research (e.g., Herget, 2020; Schubert, 2007). In addition, especially because of the high values for the feeling of being moved, it could be concluded that the documentary stimulus was an ideal projection surface for emotions, meeting Schmidt’s (1976) first precondition.
Table 2. Induced Emotions Under All Three Stimulus Conditions.
Note: Results of ANOVAs (analysis of variance) and repeated planned contrasts assessing the effects of the stimulus conditions on induced emotions. N = 92. Bold type indicates significance at p < .05.
Credibility, memory performance, and evaluation
As long as the music used in the documentary was well edited and congruent with the content, the documentary excerpt was perceived as significantly more credible than were the excerpts without music or with incongruent music (H3; see the first row of Table 3). Although the stimulus conditions differed significantly in their credibility, none of the conditions was perceived as not credible. This indicated that our manipulation did not negatively influence the perceived realism of the stimulus, which, as described earlier, was of high importance (e.g., Dillman Carpentier, 2010).
Table 3. Credibility, Recall, and Evaluation Under All Three Stimulus Conditions.
Note: Results of ANOVAs (analysis of variance) assessing the effects of the stimulus conditions on credibility, recall, and evaluation. N = 92. Values in square brackets indicate the 95% confidence interval (CI) for each mean. LL and UL indicate the lower and the upper limits of the confidence interval, respectively. The participants indicated the media format’s credibility and evaluation on five-point Likert-type scales (1 = negative, 5 = positive). Memory performance scores ranged from –11 to +11. Bold type indicates significance at p < .05.
When asked whether certain statements were part of the previously viewed documentary, participants who had seen the documentary with congruent music showed significantly better recall than did those who watched the stimuli without music or with incongruent music (H4; see the second row of Table 3). Repeated planned contrasts indicated that the stimulus with congruent music had the most positive effect on participants’ memory performance, followed by the stimulus without music and then the stimulus with incongruent music (see Table S3 in the online supplemental material section). As expected, the generally positive influence of fitting, emotionalizing music also had a positive effect on the evaluation of the documentary (H5; see the third row of Table 3). Although the participants rated all of the stimulus conditions as exciting, interesting, and well made, the highest ratings were given to the version with congruent music (incongruent music < no music < congruent music; see Table S3). Not only did the experimental conditions differ as expected in terms of perceived credibility, evaluation, and memory performance, but all associated effect sizes can also be rated as large. Therefore, hypotheses 3, 4, and 5 can be accepted.
Experiment 2
The purpose of many non-fictional media formats is not solely to inform audiences (Grabe et al., 2001) but also to convey a message intended to have a direct or indirect influence on the formation of public opinion (Corner, 2008; Döben, 1993; LaMarre & Landreville, 2009; Rogers, 2015). Experiment 2 extended the scope of the first experiment by investigating the influence of congruent and incongruent background music on the audience’s potential change of attitude, an element that was lacking in Experiment 1.
Design and participants
As in Experiment 1, a non-fictional media stimulus was set to congruent and incongruent music, with a control stimulus without music completing the design. Most empirical studies investigating the effects of music take place in a laboratory (e.g., Herget, 2021) to control the setting and increase internal validity. However, Eerola and Vuoskoski (2013) considered more realistic designs possible and especially promising for studying emotions induced by music. Therefore, following the examples of Kopiez et al. (2013) and Nosal et al. (2016), we conducted an online experiment using the online survey software UNIPARK. Because the external conditions of participation are difficult to control in an online experiment, and on the basis of positive experiences with this approach, we asked the subjects to complete the experiment on their own computers, with speakers or headphones, in a quiet environment. Because no recent studies on the influence of music in non-fictional media formats have focused on attitude change, there was no effect size reference for a power analysis. To make the findings relevant for applications in practice, a medium effect size of f ⩾ 0.25 (Cohen, 2013) was used as the basis of an a priori sample size calculation. A required sample size of N = 159 was indicated (using G*Power, α = .05, 1−β = 0.80, number of groups: 3). In the between-subjects online experiment, 147 subjects (75% female, age: M = 25.25, SD = 7.98; mostly recruited through online platforms such as Facebook) were randomly assigned to the three experimental conditions.
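To relate the medium effect size used in the sample size calculation to the proportion of explained variance, Cohen's f can be converted to eta squared. The following is a minimal sketch of that standard conversion and of the average group sizes implied by the planned and obtained samples; it does not reproduce the G*Power calculation itself. The helper names are hypothetical, and equal allocation across the three groups is an assumption (random assignment does not guarantee it).

```python
import math


def cohens_f_to_eta_squared(f):
    """Convert Cohen's f to eta squared: eta^2 = f^2 / (1 + f^2)."""
    return f ** 2 / (1 + f ** 2)


def eta_squared_to_cohens_f(eta_squared):
    """Convert eta squared back to Cohen's f: f = sqrt(eta^2 / (1 - eta^2))."""
    return math.sqrt(eta_squared / (1 - eta_squared))


medium_f = 0.25                      # medium effect size used for the calculation
eta_sq = cohens_f_to_eta_squared(medium_f)

n_groups = 3
planned_n, obtained_n = 159, 147     # values reported in the text
# Average participants per condition, assuming equal allocation (an assumption)
planned_per_group = planned_n / n_groups
obtained_per_group = obtained_n / n_groups

print(f"f = {medium_f} corresponds to eta squared = {eta_sq:.3f}")
print(f"planned per group: {planned_per_group:.0f}, obtained: {obtained_per_group:.0f}")
```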
Selection and construction of materials
Non-fictional media stimulus
Our stimulus was based on a 3-min television magazine report from a public broadcaster that addressed the problem of an unhealthy diet in general and especially for children and criticized unhealthy food advertising for specifically targeting children. In addition to providing information, television magazine reports typically also express a clear opinion, with the aim of changing recipients’ attitudes, and their brevity makes them well suited to online experiments. Unlike the documentary used in Experiment 1, this stimulus fulfilled all three preconditions established by Schmidt (1976). In its treatment of (a) the consequences of unhealthy nutrition among children and (b) the unfair methods the advertising industry uses to target children, the stimulus used in Experiment 2 is highly emotionalizing, meeting the first precondition. It also requires recipients to take a position, and more than one specific opinion is possible regarding the advertising topic (the second and third preconditions). The magazine report underwent the same editing steps as described in Experiment 1.
Music stimuli
The media stimulus was professionally set (see the description in Experiment 1) to two versions of music: a congruent version conveying negative emotions and associations and an incongruent (but not unrealistic) version conveying positive ones. Whereas the congruent music expressed negative emotions and thus underlined the serious message of the report (genre: film music), the incongruent music matched the cuteness of the interviewed children and the colorful children’s products (a secondary aspect of the report) through positive emotions and triggered childhood schemata (genre: pop). Accordingly, although we planned the positively connoted music to be perceived as incongruent with the report’s main message, this music was not planned to be perceived as unrealistic because of its fit with a secondary aspect of the report’s content. All stimulus versions are available from the first author upon request.
Measures
Because the second experiment was conducted to complement the first and because this was an online experiment, we measured only the expressed and induced emotions and the attitude change of participants. The music’s expressed emotions were measured using the Geneva Emotional Music Scale (GEMS; Zentner, Grandjean, & Scherer, 2008). As an improvement to Experiment 1, we used an instrument developed specifically to measure emotions induced (Eerola & Vuoskoski, 2013; Zentner & Eerola, 2010) and expressed (Torres-Eliard et al., 2012) by music. The emotions induced by the media stimulus were again measured with the M-DAS (Renaud & Unz, 2006). Because the emotions fitting and not fitting this media stimulus were generally negative and positive, in contrast to the specific emotions in Experiment 1, we aggregated the specific emotion dimensions of the measurement instruments into indices (expressed emotions: 12 items assessed on five-point Likert-type scales, Cronbach’s α = .90; induced emotions: 12 items assessed on five-point Likert-type scales, Cronbach’s α = .72; for more details, see Table S4 in the online supplemental material section). Following Costabile and Terman (2013), the questionnaire on participants’ (cognitive) attitude change included specific statements assessing the participants’ agreement with the magazine report’s message. We distinguished statements regarding the importance of healthy nutrition for children (four items assessed on five-point Likert-type scales, M = 4.19, SD = 0.499, α = .70; for example, “A healthy diet is important for children.”) and statements on commercials targeting children (three items assessed on five-point Likert-type scales, M = 3.49, SD = 0.80, α = .70; for example, “Advertising for children is unethical.”).
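The aggregation of Likert-type items into an index and the reliability check via Cronbach’s α can be sketched as follows. This is an illustrative sketch only: the responses are hypothetical (three items, five participants) rather than study data, and the function names `cronbach_alpha` and `index_scores` are assumptions introduced for this example.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants' item responses.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of sum scores)
    """
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def index_scores(rows):
    """Aggregate each participant's items into a mean index."""
    return [mean(row) for row in rows]


# Hypothetical responses on five-point Likert-type scales (rows = participants)
responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
indices = index_scores(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Averaging rather than summing the items keeps the index on the original five-point metric, which makes index means directly comparable with the item-level scale anchors.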
Results and discussion
In line with the emotional connotation of the television magazine report, the music version selected as congruent expressed significantly more negative emotions than did the incongruent music version, F(1, 98) = 104.30, p < .001,

Expressed and Induced Emotions and Change of Attitude in Experiment 2.
We expected the participants to already have established firm positions regarding the importance of a healthy diet for children; therefore, following Bullerjahn (2006) and Have (2010), we did not expect to see a music effect for this attitude. Indeed, the participants were in complete agreement that a healthy diet is important for children, and, as expected, their attitudes were not influenced by the stimulus conditions (H7), F(2, 144) = 1.19, p = .307,
General discussion and implications
Music is an essential part of non-fictional media formats. This study’s results indicated that carefully selected music can be used effectively to influence recipients’ emotions, memory performance, potential attitude changes, and evaluations of the media format and its perceived credibility, with medium or large effect sizes. Four steps were identified to make this positive influence more likely. The congruence of music and medium (e.g., Cohen, 2005) has proven to be an especially relevant influencing factor. Whether music’s expressed emotions actually result in distinct emotionalization of the recipients (Evans & Schubert, 2008; Gabrielsson, 2002; Schubert, 2007) should also be examined, as should whether the use of music impairs media credibility (Grabe et al., 2000; Have, 2010; Schultheiss & Jenzowsky, 2000) and the media format’s suitability for being influenced by background music (e.g., Bullerjahn, 2006; Schmidt, 1976). The long-assumed ineffectiveness or even negative influence of background music in non-fictional media formats (Boeckmann et al., 1990; Brosius, 1990; Kopiez et al., 2013; Schmidt, 1976; Wakshlag et al., 1982) may be explained by previous research disregarding these factors.
Study limitations and further research
Only the congruence of music with the non-fictional media format was experimentally manipulated in this study. Other relevant influencing factors were taken into account in the stimulus creation or data collection and analysis but were not manipulated. For this reason, drawing final conclusions about the effects of these other factors requires further research.
This study attempted to make generalizable statements about the effects of music in non-fictional media formats by testing hypotheses using a documentary excerpt and a television news magazine report. Of course, generalizable statements based on the findings of a small number of experiments—especially those focusing on media formats as diverse as television news, news magazines, and documentaries—must always be critically evaluated. The effects of audio-visual media formats on recipients’ emotions, cognitions, and actions are determined in part by the expectations these individuals have developed during previous viewing experiences with similar types of media formats (e.g., Collins, 1981; Collins & Wiens, 1983; Pouliot & Cowen, 2007). Alencar and Kruikemeier (2018) found that infotainment elements including music are used with varying frequencies in public and commercial broadcasting, in different non-fictional media formats, and in different topics. It seems reasonable to suppose that background music in different formats from diverse sources and with various topics is perceived and processed differently, which may result in varying effects. The effects indicated here should be tested using other diverse non-fictional media stimuli from different sources and with various topics (Boeckmann et al., 1990; Brosius, 2013). In addition, especially in the case of documentaries, it makes sense to use the full length of this media format in experiments because longer periods of exposure to music may enable the detection of stronger music effects (e.g., Pouliot & Cowen, 2007).
This study’s results may be biased because student samples were used in the experiments. Children and young adults have grown up in a highly arousing media environment, surrounded by omnipresent music. In contrast, older recipients have experienced an increase in arousing characteristics in media messages over the years (e.g., Kleemans, Vettehen, Beentjes, & Eisinga, 2017; Tapscott, 2009). As Have (2010) suggested, older audience members may be more critical about and more aware of the appearance of music in non-fictional media formats. Future studies should use samples that are more balanced in terms of participants’ ages and genders.
This study represents an initial overview of the influencing factors that mediate the effect of music in non-fictional media formats and should not be considered conclusive. For example, there is a need for further investigation of the factor of narrative engagement or psychological transportation, which was introduced in this context by Rossmann and Rossmann (2018) and has already been frequently explored as a mediator of music’s effects in fictional films (Cohen, 2009; Costabile & Terman, 2013).
Implications for practice
In a recent survey, Moormann (2010) confirmed an initial tendency toward professionalization in the use of music in non-fictional media formats. However, in the same year, music consultant Schmidt-Banse (2010) expressed dissatisfaction with the quality of the music selection in most of these formats. It would be useful to consider the factors described in this study as potentially relevant not only for scholarly research but also for practice. Decades ago, Seidman (1981) already expressed disagreement with the “hit-or-miss” approach to selecting music for non-fictional media formats and instead advocated for a balanced approach based on principles from research and practical experience.
To be fair, rather than plain incompetence or ignorance, it is more often factors such as time pressure or administrative constraints that lead to the use of ineffective music in practice. In accordance with previous research, Experiment 2 indicated that it is sometimes more reasonable to omit background music in non-fictional media formats instead of risking the use of incongruent music with potentially negative effects regarding emotion induction and attitude change.
A general increase in the effectiveness of background music in non-fictional media formats should also be accompanied by critical reflection:
Music may be just as important as words or pictures in communicating how we should value, despise, admire, fear, love or hate other people or ourselves, other cultures or our own. The only difference is that while education may help us, if we are lucky, to reflect upon and criticize messages that come to us through visual and verbal media, we are not trained to reflect upon or criticize musically mediated ideology. We are in this sense more open to manipulation through music than through most other channels of mediating meaning. (Tagg, 2006, p. 169)
Further research and educational work on the preconditions for the effective use of background music and on this type of music’s specific effects are, therefore, of particular importance (Have, 2010; Rossmann & Rossmann, 2018).
Supplemental Material
Supplemental material, sj-pdf-1-pom-10.1177_0305735621999091 for Soundtrack for reality? How to use music effectively in non-fictional media formats by Ann-Kristin Herget and Jessica Albrecht in Psychology of Music
Footnotes
Acknowledgements
The authors thank Alexander Frank, Fabian Henning, Emely Krimm, and Stefanie Schröter for their help with Experiment 1. They also thank Elmar Bartel, a professional German voice actor and news anchor, and Carolin Kaiser, a former journalist at the University of Wuerzburg campus radio station, for contributing their voices for the off-screen commentaries in the media stimuli.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.