
Abstract

Research on the effect of background music (BgM) on cognitive task performance is marked by inconsistent methods and inconclusive findings. To bring clarity to this area, we performed a systematic review of the impact of BgM on performance in a variety of tasks whilst considering the contributions of various task, music, and population characteristics. Following the PRISMA and SWiM protocols, we identified 95 articles (154 experiments) comprising cognitive tasks across six cognitive domains: memory; language; thinking, reasoning, and problem-solving; inhibition; attention; and processing speed. Extracted data were synthesized using vote counting based (solely) on the direction of effects and analyzed using a sign test. Overall, our results demonstrate a general detrimental effect of BgM on memory and language-related tasks, and a tendency for BgM with lyrics to be more detrimental than instrumental BgM. Only one positive effect (of instrumental BgM) was found, and in most cases we did not find any effect of BgM on task performance. We also identified a general detrimental impact of BgM on difficult (but not easy) tasks and on introverts (but not extraverts). Taken together, our results show that task-, music-, and population-specific analyses are all necessary when studying the effects of BgM on cognitive task performance. They also call attention to the need to control for task difficulty as well as individual differences (especially level of extraversion) in empirical studies. Finally, our results demonstrate that many areas remain understudied, and much more work is needed to gain a comprehensive understanding of how BgM impacts cognitive task performance.

Keywords

Background music, cognitive task performance, effects of music, individual differences, systematic review

Introduction

With the growth in the accessibility, exposure, and consumption of music in everyday life, people engage with music listening in a wide variety of situations and contexts (Bull, 2006; North et al., 2004). Interestingly, amongst these music listening behaviors, research shows that on most occasions people listen to music when they are engaged with other tasks like studying or working, exercising or doing housework, shopping or traveling, amongst many others. Some of the key reasons for listening to music in these situations are fighting boredom, passing the time, general entertainment, and out of habit (Greasley & Lamont, 2011; Juslin et al., 2008; Lonsdale & North, 2011; North et al., 2004; Randall & Rickard, 2017; Rentfrow, 2012; Sloboda et al., 2001; Stratton & Zalanowski, 2003).¹

Amongst these activities, some of the most common involve mental work that requires intensive cognitive functioning. For instance, Calderwood et al. (2014) examined which other activities students engage in whilst studying and found that, in a 3-hr study session, students spent more than one-third of the time (73 min) listening to music. Similarly, a survey of 295 office employees in the UK showed that employees reported spending an average of 36% of their working week listening to music (Haake, 2011). In fact, listening to music often appears on lists of tips and hacks for achieving better work productivity and cognitive performance (D’Angelo, 2022; Robinson, 2020; Spherion Staffing & Recruiting, 2022). Interestingly, when students and employees were asked about their reasons for listening to music whilst working or studying and about the perceived impact music has on them, the answers tended to be mood-related (e.g., improves mood, helps relaxation, alleviates boredom) rather than concerned with enhancing cognitive performance or work quality (Haake, 2011; Kotsopoulou & Hallam, 2010). Other studies, however, suggest that improving ‘efficiency’ is also a key reason (Kononova & Yuan, 2017).

Whether or not music elevates mood and increases motivation whilst people engage in mental work, human cognitive capacity is limited (the brain can only attend to and process a limited amount of information at one time; Eysenck & Keane, 2020), and it is plausible to ask whether (or to what extent) background music (BgM) listening hinders cognitive performance. At the same time, BgM listening can help sustain attention and prevent mind-wandering in low-demand cognitive tasks (Kiss & Linnell, 2020) and can improve performance by mitigating task-related cognitive interference (e.g., inhibiting instinctive responses in a color Stroop task; Masataka & Perlovsky, 2013). Therefore, it is also plausible to ask whether, to what extent, and in what circumstances BgM improves cognitive performance.

Music During the Execution of Cognitive Tasks: Good or Bad?

It should come as no surprise that discerning the effects of BgM listening on cognitive performance has become a very popular research area. Indeed, BgM may (consciously or unconsciously, positively or negatively) interfere with a variety of cognitive processes (e.g., Haake, 2011), and the ubiquity of this habit (Calderwood et al., 2014; David et al., 2015; Haake, 2011; Kononova & Yuan, 2017) demands that more attention be paid to its implications. Unfortunately, research in this area has been marked by inconclusive findings, with many studies showing that BgM can have beneficial (Crust et al., 2004; Mammarella et al., 2007; Miller & Schyb, 1989; Proverbio & De Benedetto, 2018), detrimental (Alley & Greene, 2008; Avila et al., 2012; Deng & Wu, 2020; Liu et al., 2017; Perham & Currie, 2014; Perham & Vizard, 2011; Xiao et al., 2020), or no effects (Burkhard et al., 2018; Ferreri et al., 2015; Kou et al., 2018; Liu et al., 2012; Reynolds et al., 2014) on performance in a wide range of cognitive tasks.

In order to foster an understanding of these findings, researchers have conducted one systematic review (De La Mora Velasco & Hirumi, 2020) and two meta-analyses (Kämpfe et al., 2010; Vasilev et al., 2018)² of studies published in this area, but there are some inconsistencies amongst their reports:

Kämpfe et al. (2010) concluded that BgM hinders performance on memory-related tasks and reading comprehension.

Vasilev et al. (2018) concluded that BgM hinders reading comprehension, that BgM with lyrics (L-BgM) is more detrimental than instrumental BgM (I-BgM) for reading comprehension, and that BgM also hinders reading speed (i.e., slows down reading).

De La Mora Velasco and Hirumi (2020) did not find any definitive effects of BgM on cognitive performance.

Arguably, the different conclusions reached by these reviews are the result of methodological differences and limitations. For instance, the inclusion/exclusion criteria (and the resulting lists of included articles) differ considerably across these reviews, which naturally has direct implications for the findings. One of these criteria is the targeted population: Kämpfe et al. (2010) reviewed only studies of adult participants, whereas Vasilev et al. (2018) and De La Mora Velasco and Hirumi (2020) reviewed studies of both adults and children (whose cognitive control capacity differs from that of adults; Cowan et al., 2006). Another criterion concerns the types of outcome measures being assessed: De La Mora Velasco and Hirumi (2020) only vaguely described the eligibility criteria for their outcome measures (“an explicit learning outcome”), whereas Kämpfe et al. (2010) did not outline any inclusion/exclusion criteria for outcome measures. Another methodological difference amongst these reviews concerns the publication timeline (and correspondingly the sample sizes of the reviews). Kämpfe et al. (2010) included all articles published before 2008, and Vasilev et al. (2018) included all articles published before 2017 (the starting year was not mentioned in either review). However, De La Mora Velasco and Hirumi (2020) only reviewed articles published between 2008 and 2018. As a result, De La Mora Velasco and Hirumi (2020) sampled only 30 articles, whereas Kämpfe et al. (2010) and Vasilev et al. (2018) sampled 97 and 65 articles, respectively. Given that the findings of De La Mora Velasco and Hirumi (2020) differed from those of the other two reviews, the 10-year coverage of their systematic review may not be sufficient to obtain a sizable and representative sample of studies that allows detection of a true effect. The last methodological difference we identified is the analytical approach. Unlike the other two reviews, De La Mora Velasco and Hirumi (2020) performed vote counting based on the statistical significance reported in their sample. However, they did not conduct any inferential analysis (e.g., binomial test, sign test) and therefore their results are merely descriptive. This substantially reduces the review's power to identify meaningful outcomes (Chaimani et al., 2021; Deeks et al., 2021), which, again, could conceal any true effects of BgM.
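To illustrate what an inferential step adds to vote counting, the following minimal sketch applies a two-sided sign test (an exact binomial test with a success probability of 0.5) to direction-of-effect votes. The tally is hypothetical and chosen purely for illustration, and `sign_test` is a name introduced here rather than a function taken from any of the reviews discussed.

```python
from math import comb


def sign_test(n_negative: int, n_positive: int) -> float:
    """Two-sided exact sign test p-value.

    Experiments with no direction of effect (ties) are assumed to have
    been excluded by the caller before counting.
    """
    n = n_negative + n_positive
    if n == 0:
        return 1.0
    k = min(n_negative, n_positive)
    # Probability of a split at least this lopsided under H0: P(direction) = 0.5
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical tally: 12 experiments favour a detrimental effect, 3 a beneficial one.
p_value = sign_test(12, 3)
print(f"p = {p_value:.4f}")  # p = 0.0352
```

With the same 12-to-3 split, a purely descriptive vote count would only report that most experiments point in one direction; the test additionally indicates how surprising such a split would be if both directions were equally likely.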

Aside from the methodological differences between previous attempts to synthesize evidence on the impact of BgM on cognitive performance, there is also a general lack of consideration of the multiplicative interactions between task-specific factors (e.g., task difficulty, cognitive domain), music-specific factors (e.g., presence of lyrics, arousal level), and population-specific factors (e.g., personality traits, music education). For example, considering the impacts of different types of BgM, performance in verbal processing tasks such as reading comprehension tends to be hindered by lyrical but not instrumental music (Perham & Currie, 2014). Masataka and Perlovsky (2013) also observed that cognitive interference (e.g., in a color Stroop task) was mitigated by consonant music but exacerbated by dissonant music. In terms of task complexity, BgM tends to hinder performance in complex (but not simple) tasks (Gonzalez & Aiello, 2019). Finally, the direction of these effects could further vary amongst different populations (e.g., extraverts vs. introverts, Furnham & Strbac, 2002; musicians vs. nonmusicians, Herath, 2018; individuals with higher vs. lower working memory capacity, Christopher & Shelton, 2017). Simply put, in order to obtain a more comprehensive understanding of the phenomenon, the evaluation of BgM's effect on cognitive task performance should encompass task-, music-, and population-specific analyses.

Another relevant aspect is that none of the three reviews conducted a quality assessment (i.e., of methodological quality and reporting quality) of their sampled articles. This is an important limitation because quality assessments enable reviewers and readers alike to weigh a review's findings against the methodological quality of the articles that contributed to them, which in turn helps them to better appraise the overall quality of the evidence from the review (Wells & Littell, 2009). This is particularly important in an area of research marked by inconsistent findings and a lack of standard procedures that account for the heterogeneity of variables and situational factors that can interfere with the findings.

Revisiting the Evidence

Due to the conflicting evidence regarding the impact of BgM on cognitive performance and the limitations of previous attempts to synthesize evidence in this area, we propose to revisit the evidence from a different perspective. Through a systematic review of all the empirical studies published so far on this topic, our goals are to quantitatively evaluate the effect of BgM on cognitive task performance separately for different cognitive domains and tasks and to consider the contribution of different music and population characteristics. Specifically, our research questions are as follows:

Research Question 1:

How does BgM affect performance in different types of cognitive domains and tasks?

Research Question 2:

Are there any music characteristics (e.g., presence of lyrics, volume, tempo, genre, tonality, etc.) that contribute to the effect of BgM on cognitive task performance?

Research Question 3:

Are there any population characteristics (e.g., personality traits, musical background, gender, etc.) that contribute to the effect of BgM on cognitive task performance for specific listeners?

Methods

Protocol and Registration

The protocol of the review was registered on PROSPERO (Cheah et al., 2020). The review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Page et al., 2021). The PRISMA 2020 Checklist can be found in Appendix A.

Eligibility Criteria

Studies were included if they met the following criteria:

primary research written in English and published in an academic journal from January 1, 1960, to July 2, 2020;

the sample population consisted of individuals aged 16 years or older³;

at least one of the interventions consisted of performing a cognitive task whilst listening to BgM;

at least one of the outcomes was a quantitative measure of performance in a cognitive task that is typically required in office desk jobs or studying;

the study had a control condition consisting of silence, no music, or natural sounds expected to occur in the environment in which the study was conducted (e.g., café noises, office noises).

Studies were excluded if they met any of the following criteria:

the sample consisted of a special population with particular cognitive or health conditions or disabilities that would systematically affect cognitive performance (e.g., people with dementia, Parkinson's, Autism Spectrum Disorder, any form of learning disabilities, etc.);

music was not listened to during the execution of a cognitive task (e.g., music was listened to before the cognitive task, or the task involved music-making);

the study did not include an adequate control condition (see inclusion criteria above);

the outcome measure consisted of a cognitive performance that is not typical of a regular desk job or studying, which includes:

motoric behaviors (e.g., driving, exercising, surgical procedures, etc.);

moral reasoning/solving moral dilemmas;

communication/social/interpersonal skills (e.g., attention to the body language of others, interpreting the facial expression of others, interpersonal/group bonding or controlling agitated behaviors, etc.);

musical tasks (e.g., pitch identification, tempo recognition, etc.);

sensorimotor adaptations/skills (e.g., pain tolerance, odor discrimination, time perception, etc.);

other types of activities that rely on cognitive skills but are not typical of a regular desk job or studying (e.g., gambling, remembering film scenes/events, remembering advertisements, autobiographical memory, purchasing behaviors, affective impressions, sleep quality, psychological wellbeing, eating behaviors, and alcohol consumption, etc.).
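The inclusion and exclusion criteria above can be read as a single screening predicate. The sketch below is a simplified illustration under stated assumptions: the `StudyRecord` class and all of its field names are hypothetical, each criterion is reduced to one boolean or value, and nuances such as the footnoted age criterion are not modeled.

```python
from dataclasses import dataclass
from datetime import date

SEARCH_START = date(1960, 1, 1)
SEARCH_END = date(2020, 7, 2)


@dataclass
class StudyRecord:
    # Hypothetical fields, one per criterion listed above.
    language: str
    published: date
    min_participant_age: int
    music_during_task: bool
    has_eligible_control: bool
    special_population: bool
    desk_job_or_study_task: bool


def is_eligible(study: StudyRecord) -> bool:
    """True only if every inclusion criterion holds and no exclusion applies."""
    return (
        study.language == "English"
        and SEARCH_START <= study.published <= SEARCH_END
        and study.min_participant_age >= 16
        and study.music_during_task
        and study.has_eligible_control
        and not study.special_population
        and study.desk_job_or_study_task
    )


# Hypothetical record: reading comprehension with BgM versus silence, adult sample.
example = StudyRecord("English", date(2014, 5, 1), 18, True, True, False, True)
print(is_eligible(example))  # True
```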

Information Sources and Search Strategy

Given the varied terminology used in studies focusing on music listening and cognitive performance, we devised a multi-step procedure to optimize our search strategy. First, we conducted a preliminary (manual) search to gather a list of relevant articles to serve as a gold standard for our searches and help optimize our search strategy. The articles were sourced from the reference lists of two relevant articles (Kämpfe et al., 2010; Schellenberg & Weiss, 2013), as well as from a blind search through Google Scholar (using the key words ‘background music’ and ‘cognitive performance’).⁴ In this process we identified 67 relevant articles (see Appendix B, Table B1). This list was then used as the gold standard to evaluate different search strategies and identify the optimal one, i.e., the one that yielded the highest number of pre-selected articles and the lowest number of total hits (i.e., the total number of sources retrieved by each search strategy).
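The selection rule described above can be sketched in a few lines. This is an illustration only: the candidate queries and their counts are hypothetical, and the ranking (most gold-standard articles first, fewest total hits as a tie-break) is an assumption about how the two goals might be combined rather than the procedure actually followed.

```python
from typing import NamedTuple


class StrategyResult(NamedTuple):
    query: str
    gold_found: int  # gold-standard articles retrieved by the query
    total_hits: int  # all records retrieved by the query


def pick_strategy(results):
    # Assumed ranking: maximize gold-standard coverage, then minimize total hits.
    return max(results, key=lambda r: (r.gold_found, -r.total_hits))


# Hypothetical candidates evaluated against a 67-article gold standard.
candidates = [
    StrategyResult("query A", 40, 9_000),
    StrategyResult("query B", 48, 30_000),
    StrategyResult("query C", 48, 14_000),
]
best = pick_strategy(candidates)
print(best.query, f"{best.gold_found / 67:.0%} of gold standard")
```

Other weightings are possible (for example, accepting slightly lower coverage for a much smaller screening load), which is why the ranking is flagged as an assumption.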

Our searches were conducted on five electronic databases: PubMed, PsycINFO, Scopus, Web of Science and Google Scholar (the first 500 search results). The optimal search strategy identified via the method described above was ‘music AND cogniti* OR task’ (see also Appendix B, Table B2 for the filters applied to the searches in each database), which yielded 15,407 hits that included 51 of the 67 articles from our list of pre-selected articles. After removing 6,568 duplicates, we added to the pool of articles the 16 pre-selected articles that were not retrieved by the search strategy, and 12 more articles identified in another review that was published in the meantime (De La Mora Velasco & Hirumi, 2020). The final list of articles for screening included 8,867 unique sources.
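As a quick consistency check, the record counts reported in this paragraph can be reconciled with simple arithmetic. The variable names below are introduced here for illustration; all numbers are taken from the text.

```python
hits = 15_407            # records returned by the optimal search strategy
duplicates = 6_568       # duplicates removed
gold_total = 67          # pre-selected (gold-standard) articles
gold_retrieved = 51      # gold-standard articles found by the search
from_other_review = 12   # added from De La Mora Velasco & Hirumi (2020)

missed_gold = gold_total - gold_retrieved  # 16 articles added manually
unique_for_screening = hits - duplicates + missed_gold + from_other_review
gold_recall = gold_retrieved / gold_total

print(unique_for_screening)   # 8867
print(f"{gold_recall:.1%}")   # 76.1%
```

The total matches the 8,867 unique sources reported, and the chosen strategy retrieved roughly three-quarters of the gold-standard list.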

Screening Process

The screening and selection process was conducted and recorded on the platform Rayyan (Ouzzani et al., 2016) and consisted of two stages: (1) title and abstract screening; (2) full-text screening. First, the titles and abstracts of 8,867 articles were reviewed by the authors YC and EC, and seven independent researchers. Each article was screened by at least two researchers, and inconsistencies were resolved between two of the review authors (YC, EC). At this stage, 185 articles were identified as relevant and were subsequently retrieved for full-text screening. The full-text screening was conducted by two of the review authors (YC and EC), and discrepancies were discussed by the same authors until consensus was reached. The final list of sources included in this review consists of 95 articles (reporting 154 relevant empirical experiments)⁵ that meet all the pre-defined inclusion criteria in our protocol. The PRISMA 2020 flow diagram depicting the study selection process is presented in Figure 1.
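The proportion of records retained at each screening stage follows directly from the counts reported above. The short sketch below computes them; `stage_retention` is a helper name introduced here for illustration.

```python
def stage_retention(counts):
    """Fraction of records retained at each stage relative to the previous one."""
    return [after / before for before, after in zip(counts, counts[1:])]


# Counts reported in the text: screened, retrieved for full text, included.
counts = [8_867, 185, 95]
labels = ["title/abstract -> full text", "full text -> included"]
for label, share in zip(labels, stage_retention(counts)):
    print(f"{label}: {share:.1%}")
```

Roughly 2% of screened records proceeded to full-text screening, and about half of those were included.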

Figure 1.

PRISMA 2020 flow diagram for searches of databases and other sources.

Data Extraction Process

The data were extracted by one review author (YC), and double extraction was performed on 25 articles (∼26%) by HKW and three independent researchers. All inconsistencies were resolved by EC. The data extracted included the following information:

(1) basic sample characteristics (sample size, age, gender);

(2) study design information (repeated measures, between-subjects, mixed);

(3) experimental conditions (i.e., the independent variables);

(4) characteristics of the music used in each relevant experimental condition (e.g., with lyrics or instrumental);

(5) type(s) of cognitive task(s) (e.g., arithmetic, abstract reasoning, verbal reasoning, etc.);

(6) (if available) level of difficulty of the cognitive tasks;

(7) (if available) population characteristics (e.g., personality traits, working memory capacity, level of music education);

(8) outcome measures (e.g., mean, median, percentage or total scores) for all BgM and control conditions (as defined in the inclusion and exclusion criteria).

Data items 1–7 can be found in Table 1 under the ‘Results’ section. Data item 8 is available as Supplementary Material because it is too large to include in the main article.
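One way to picture the double-extraction step is as a field-by-field comparison of two extractors' records, with any mismatching fields passed on for resolution. The sketch below is hypothetical: the `ExtractionRecord` class, its reduced set of fields, and the `disagreements` helper are introduced here for illustration and do not describe the tooling actually used. The first record reuses the values listed for Alley and Greene (2008) in Table 1; the second is an invented conflicting extraction.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ExtractionRecord:
    # Hypothetical subset of the extracted data items.
    reference: str
    n: int
    design: str        # "RM", "IDP", or "MIXED"
    conditions: tuple
    tasks: tuple


def disagreements(a: ExtractionRecord, b: ExtractionRecord) -> list:
    """Names of the fields on which two independent extractions differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


first = ExtractionRecord("Alley and Greene (2008)", 60, "RM",
                         ("L-BgM", "I-BgM", "Control"), ("MEM/SR",))
# Invented second extraction that codes the design differently.
second = ExtractionRecord("Alley and Greene (2008)", 60, "IDP",
                          ("L-BgM", "I-BgM", "Control"), ("MEM/SR",))

print(disagreements(first, second))  # ['design']
```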

Table 1.

Master list of extracted data: Population characteristics (sample size, age, gender, other population characteristics), study design, experimental conditions, types of cognitive tasks, and other moderating variables (e.g., task difficulty, personality traits, music training).

No.	Reference	N	Male	Female	Others	Age	Design	Experimental conditions	Other variables	Cognitive domain/tasks
1	Alley and Greene (2008)	60	10	50	n/a	M = 18.6	RM	1. L-BgM 2. I-BgM 3. Control	—	MEM/SR
2	Amezcua et al. (2005)	12	nr	nr	n/a	M = 27.92 (±2.50)	RM	1. Slow BgM 2. Fast BgM 3. Control	—	ATT/NV
3	Angel et al. (2010)	56	28	28	n/a	undergraduates	MIXED	1. I-BgM 2. Control	TF	LANG/LG; THINK/NVR
4	Avila et al. (2011)	58	22	36	nr	M = 16.78	RM	1. L-BgM 2. I-BgM 3. Control	EXT	LANG/RC; THINK/MATH; THINK/NVR
5	Beauchene et al. (2016)	28	16	12	n/a	M = 27.6	RM	1. Classical I-BgM 2. Pure tone 3. Control	—	MEM/REC-NV
6	Begum et al. (2019)	280	155	125	n/a	Range: 18–25	IDP	1. LA-P L-BgM 2. HA-P L-BgM 3. HA-N L-BgM 4. Control	GENDER	LANG/LG; ATT/V
7	Bonin and Smilek (2016): experiment 1	48	16	32	n/a	M = 19.51 SD = 1.82	RM	1. Harmonic I-BgM 2. Inharmonic I-BgM 3. Control	—	MEM/WM
8	Bonin and Smilek (2016): experiment 2	48	13	35	n/a	M = 19.01 SD = 1.56	RM	1. Harmonic I-BgM 2. Inharmonic I-BgM 3. Control	—	MEM/WM
9	Bottiroli et al. (2014)	65	14	51	n/a	M = 69.03 SD = 5.79	RM	1. P I-BgM 2. N I-BgM 3. Control	—	MEM/I-FR-V; LANG/LG; PS
10	Boyle and Coltheart (1996): experiment 1	40	nr	nr	n/a	Range: 18–54	RM	1. I-BgM 2. L-BgM 3. Control	TF	LANG/LG
11	Boyle and Coltheart (1996): experiment 2	35	nr	nr	n/a	Range: 18–52	RM	1. I-BgM 2. L-BgM 3. Control	TF	MEM/SR
12	Burkhard et al. (2018)	25	8	17	n/a	M = 23 SD = 2.87	RM	1. HA-P I-BgM 2. LA-P I-BgM 3. Control	—	INH/G-NG
13	Burton (1986)	64	32	32	n/a	Range: 18–21	IDP	1. I-BgM 2. Control	—	THINK/NVR
14	Cassidy and MacDonald (2007)	40	10	30	Introverts = 12 Extraverts = 28	Introverts: M = 21 Extraverts: M = 21	IDP	1. HA-N L-BgM 2. LA-P L-BgM 3. Control	EXT	MEM/I-FR-V; MEM/D-FR-V; INH/STRP
15	Cauchard et al. (2012)	30	9	21	n/a	M = 21.7 SD = 2.6	RM	1. I-BgM 2. Speech 3. Control	—	LANG/RC; LANG/RS
16	Chew et al. (2016)	165	71	94	n/a	M = 21.87 SD = 2.36	IDP	1. FAM English L-BgM 2. Italian version of (1) 3. UNFAM Italian L-BgM 4. English version of (3) 5. Control	GENDER	MEM/REC-V; LANG/RC; THINK/MATH
17	Cho (2015)	28	2	26	n/a	M = 24 SD = 1.62	RM	1. L-BgM 2. Control	—	LANG/WQ; LANG/WF
18	Chou (2010)	133	nr	nr	n/a	M = 31.8; Range: early 20s to mid 50s	IDP	1. Classical I-BgM 2. Hip-hop L-BgM 3. Control	—	LANG/RC
19	Christopher and Shelton (2017)	138	nr	nr	nr	M = 19.03	RM	1. L-BgM 2. Control	WMC	LANG/RC; THINK/MATH
20	Cockerton et al. (1997)	30	17	13	n/a	M = 24; Range: 18–32	RM	1. I-BgM 2. Control	—	THINK/IQ
21	Crawford and Strapp (1994)	61	29	32	nr	Range: 18–21	IDP	1. L-BgM 2. I-BgM 3. Control	EXT	MEM/I-PAL-NV; THINK/NVR; THINK/VR
22	Crust et al. (2004)	57	38	19	n/a	M = 21.6 SD = 3.0	RM	1. L-BgM 2. I-BgM 3. Control	GENDER	ATT/V
23	Daoussis and McKelvie (1986)	48	nr	nr	Introverts = 24 Extraverts = 24	Undergraduates	IDP	1. L-BgM 2. Control	EXT	LANG/RC
24	Darrow et al. (2006)	87	26	61	Musicians = 43; Nonmusicians = 44	Undergraduates / graduates	RM	1. BgM 2. Control	MT	ATT/V
25	Davenport (1972)	48	24	24	n/a	Undergraduates	RM	1. I-BgM 2. Control	GENDER	ATT/NV
26	De Groot and Smedinga (2014)	41	nr	nr	n/a	University students	RM	1. UNFAM LANG L-BgM 2. FAM LANG L-BgM 3. Control	EXT; TF	MEM/I-PAL-V; MEM/D-PAL-V
27	De Groot (2006)	36	nr	nr	n/a	University students	IDP	1. I-BgM 2. Control	TF	MEM/I-PAL-V; MEM/D-PAL-V
28	Deng and Wu (2020)	30	18	12	Introverts = 15 Extraverts = 15	M = 21.26 SD = 1.60	RM	1. L-BgM 2. Control	EXT	ATT/NV
29	Doyle and Furnham (2012)	56	13	43	n/a	M = 27 SD = 12	RM	1. L-BgM 2. Control	—	LANG/RC
30	Echaide et al. (2019): experiment 1	60	15	45	n/a	Range: 18–35	IDP	1. I-BgM 2. Control	—	MEM/I-FR-V; MEM/D-FR-V
31	Echaide et al. (2019): experiment 2	60	15	45	n/a	Range: 18–35	IDP	1. I-BgM 2. Control	—	MEM/I-FR-V; MEM/D-FR-V
32	Evered et al. (2018)	34	9	25	n/a	Males: M = 19–47 Females: M = 18–37	RM	1. PREF L-BgM 2. NON-PREF L-BgM 3. Control	—	THINK/CELL
33	Feizpour et al. (2020)	70	40	30	n/a	Range: 19–28	RM	1. L-BgM 2. Control	GENDER; TF	THINK/NVR
34	Fernandez et al. (2019)	46	nr	nr	Young = 18 Old = 28	Young: M = 26; SD = 4.2 Old: M = 68; SD = 6.0	RM	1. HA-P I-BgM 2. HA-N I-BgM 3. LA-N I-BgM 4. LA-P I-BgM 5. Silent	—	ATT/NV
35	Ferreri et al. (2015)	19	6	13	n/a	M = 21.65 SD = 3.2	RM	1. I-BgM 2. Control	—	MEM/I-FR-V
36	Ferreri et al. (2013)	22	11	11	n/a	M = 23.5 SD = 4.3	RM	1. I-BgM 2. Control	—	MEM/REC-V
37	Ferreri et al. (2014)	16	6	10	n/a	M = 64.5 SD = 2.5	RM	1. I-BgM 2. Control	—	MEM/REC-V
38	Fontaine and Schwalm (1979)	35	27	8	n/a	Undergraduates	IDP	1. FAM Rock BgM 2. FAM Easy-listening BgM 3. UNFAM Rock BgM 4. UNFAM Easy-listening BgM 5. Control	—	ATT/NV
39	Furnham and Allass (1999)	48	14	34	Introverts = 24 Extraverts = 24	Introverts: M = 22.32 Extraverts: M = 21.41	RM	1. Complex L-BgM 2. Simple L-BgM 3. Control	EXT	I-FR6; D-FR4; RC8; NVR6
40	Furnham and Bradley (1997)	20	9	11	Introverts = 10 Extraverts = 10	Introverts: M = 20.4 Extraverts: M = 23.3	RM	1. L-BgM 2. Control	EXT	MEM/I-FR-NV; MEM/D-FR-NV; LANG/RC
41	Furnham and Strbac (2002)	76	33	43	Introverts = 38 Extraverts = 38	Introverts: M = 17.39 Extraverts: M = 16.75	RM	1. L-BgM 2. Control	EXT	MEM/SR; LANG/RC; THINK/MATH
42	Furnham et al. (1999)	142	111	31	Introverts = 71 Extraverts = 71	M = 16.91; Range: 16–18	RM	1. L-BgM 2. I-BgM 3. Control	EXT	LANG/RC; THINK/VR; PS
43	Geethanjali et al. (2016a)	20	12	8	n/a	M = 20 SD = 0.4	RM	1. PREF/NON-PREF I-BgM 2. Control	—	INH/G-NG
44	Geethanjali et al. (2016b)	10	7	3	n/a	M = 20 SD = 0.4	RM	1. LA-P BgM 2. HA-N BgM 3. Control	—	INH/G-NG
45	Gonzalez and Aiello (2019)	150	38	112	n/a	M = 21.23 SD = 3.42	IDP	1. Simple; Soft I-BgM 2. Simple; Loud I-BgM 3. Complex; Soft I-BgM 4. Complex; Loud I-BgM 5. Control	—	MEM/I-PAL-V; ATT/V
46	Herath (2018)	20	10	10	Musicians = 10 Nonmusicians = 10	Range: 18–23	RM	1. Classical I-BgM 2. Jazz I-BgM 3. Pop L-BgM 4. Control	MT	LANG/RC
47	Ho et al. (2006)	34	13	21	n/a	M = 20; Range: 18–23	RM	1. Classical I-BgM 2. Classical I-BgM (reversed) 3. Control	—	ATT/V
48	Huang and Shih (2011)	89	37	52	n/a	M = 24; Range: 19–28	IDP	1. Pop L-BgM 2. Classical Western I-BgM 3. Classical Chinese I-BgM 4. Control	—	ATT/NV
49	Iwanaga and Ito (2002)	47	21	26	n/a	M = 19; Range: 18–23	IDP	1. L-BgM 2. I-BgM 3. Control	—	MEM/REC-V; MEM/REC-NV
50	Jäncke et al. (2014)	199	66	133	n/a	Control: M = 25.6; SD = 5.9 Vocal: M = 25.23; SD = 5.43 Instrumental: M = 26.65; SD = 6.76	IDP	1. HA L-BgM 2. LA L-BgM 3. HA I-BgM 4. LA I-BgM 5. Control	—	MEM/I-FR-V; MEM/D-FR-V
51	Jaušovec and Habe (2004)	20	5	15	n/a	M = 20.2 SD = 0.6	RM	1. I-BgM 2. Control	—	ATT/NV
52	Johansson et al. (2012)	24	12	12	nr	M = 27.9 SD = 7.7	RM	1. PREF BgM 2. NON-PREF BgM 3. Control	EXT	LANG/RC; LANG/RS
53	Kang and Williamson (2014)	32	12	20	Learn Arabic = 16 Learn Chinese = 16	Arabic-musicians: M = 23.38; SD = 4.50 Arabic-Nonmusicians: M = 22.50; SD = 2.78 Chinese-musicians: M = 25.50; SD = 6.16 Chi-Nonmusicians M = 28.75; SD = 11.95	IDP	1. I-BgM 2. Control	—	LANG/LL
54	Kou et al. (2018)	92	33	59	nr	M = 25.6; SD = 3.9	RM	1. Pop L-BgM 2. Control	EXT	LANG/RC; THINK/MATH; THINK/NVR
55	Küssner et al. (2016): experiment 1	31	13	18	Introverts = 16 Extraverts = 15	M = 21.06; SD = 3.53	RM	1. I-BgM 2. Control	EXT	MEM/I-PAL-V; MEM/D-PAL-V
56	Küssner et al. (2016): experiment 2	38	13	25	Introverts = 20 Extraverts = 18	M = 20.45; SD = 4.19	RM	1. I-BgM 2. Control	EXT	MEM/I-PAL-V; MEM/D-PAL-V
57	Lehmann and Seufert (2017)	81	15	66	High WMC: (scores 4 & 5) = 42 Low WMC: (scores 2 & 3) = 39	M = 21.46; SD = 4.30	IDP	1. I-BgM 2. Control	WMC	LANG/RC
58	Liu et al. (2012)	22	11	11	n/a	M = 22.3; SD = 1.5	RM	1. I-BgM 2. Control	—	ATT/V
59	Mammarella et al. (2007)	24	nr	nr	n/a	M = 81; SD = 4.5	RM	1. Classical I-BgM 2. Control	—	MEM/SR; LANG/LG
60	Mansouri et al. (2017)	73	36	37	n/a	Range: 18–32	IDP	1. Fast L-BgM 2. Slow L-BgM 3. Control	—	INH/G-NG
61	Mansouri et al. (2016)	39	19	20	n/a	Males: M = 21.2; SD = 0.3 Females: M = 20.7; SD = 0.3	RM	1. L-BgM 2. Control	GENDER	INH/G-NG
62	Manthei and Kelly (1999)	72	nr	nr	n/a	Range: 18–22	RM	1. Classical I-BgM 2. Pop L-BgM 3. Control	—	THINK/MATH
63	Martin et al. (1988): experiment 1	36	nr	nr	n/a	Undergraduates	RM	1. I-BgM 2. Control	—	LANG/RC
64	Martin et al. (1988): experiment 2	36	nr	nr	n/a	Undergraduates	RM	1. I-BgM 2. L-BgM 3. Control	—	LANG/RC
65	Masataka and Perlovsky (2013)	25	nr	nr	Children = 25 (data not extracted) Older adults = 25	Range: 65–75	RM	1. Consonant I-BgM 2. Dissonant I-BgM 3. Control	TF	INH/STRP
66	Mayfield and Moss (1989): experiment 1	44	nr	nr	n/a	Undergraduates	IDP	1. Fast I-BgM 2. Slow I-BgM 3. Control	—	THINK/MATH
67	Mayfield and Moss (1989): experiment 2	70	nr	nr	n/a	Undergraduates	IDP	1. Fast I-BgM 2. Slow I-BgM 3. Control	GENDER	THINK/MATH
68	Miller and Schyb (1989)	198	nr	nr	n/a	Freshmen and sophomores (college)	IDP	1. Classical I-BgM 2. Pop L-BgM 3. Pop I-BgM 4. Control	GENDER	LANG/LG; LANG/RC; THINK/MATH; THINK/NVR
69	Nguyen and Grahn (2017): experiment 1	30	15	15	n/a	M = 20.00; SD = 2.32	RM	1. HA-P I-BgM 2. HA-N I-BgM 3. LA-P I-BgM 4. LA-N I-BgM 5. Control	—	MEM/I-FR-V
70	Nguyen and Grahn (2017): experiment 2	30	15	15	n/a	M = 21.07; SD = 2.89	RM	1. HA-P I-BgM 2. HA-N I-BgM 3. LA-P I-BgM 4. LA-N I-BgM 5. Control	—	MEM/REC-V
71	Nguyen and Grahn (2017): experiment 3	30	17	16	n/a	M = 19.27; SD = 2.81	RM	1. HA-P I-BgM 2. HA-N I-BgM 3. LA-P I-BgM 4. LA-N I-BgM 5. Control	—	MEM/I-PAL-V
72	Nittono (1997)	24	9	15	n/a	M = 22.13; SD = 2.22	RM	1. I-BgM 2. Control	—	MEM/SR
73	Parente (1976)	42	nr	nr	n/a	university freshman	IDP	1. PREF BgM 2. NON-PREF BgM 3. Control	—	INH/STRP
74	Patston and Tippett (2011)	72	27	45	Musicians = 36 Nonmusicians = 36	Musicians: M = 23.47; SD = 4.91 Nonmusicians: M = 24.14; SD = 7.10	RM	1. Correct I-BgM 2. Control	MT	LANG/RC; ATT/NV
75	Pavlygina et al. (2010)	19	nr	nr	n/a	Range: 18–47	RM	1. Classical, soft I-BgM 2. Classical, loud I-BgM 3. Rock, soft L-BgM 4. Rock, loud L-BgM 5. Control	—	ATT/V
76	Pavlyugina et al. (2012)	15	5	10	n/a	M = 30	RM	1. Classical, soft I-BgM 2. Classical, loud I-BgM 3. Rock, loud L-BgM 4. Rock, very loud L-BgM 5. Control	—	THINK/MATH
77	Perham and Sykora (2012)	25	nr	nr	n/a	Range: 18–30	RM	1. PREF L-BgM 2. NON-PREF L-BgM 3. Control	—	MEM/SR
78	Perham and Vizard (2011)	25	nr	nr	n/a	Range: 18–30	RM	1. PREF L-BgM 2. NON-PREF L-BgM 3. Control	—	MEM/SR
79	Perham and Currie (2014)	30	15	15	n/a	Range: 19–65	RM	1. PREF L-BgM 2. NON-PREF L-BgM 3. I-BgM 4. Control	—	LANG/RC
80	Proverbio and De Benedetto (2018)	15	7	8	n/a	M = 25.4; SD = 6.7 accuracy might be off, 3 participants removed from 18	RM	1. Classical I-BgM 2. Control	—	MEM/REC-NV
81	Proverbio et al. (2018b)	50	25	25	Introverts = 25 Extraverts = 25	M = 22.94; SD = 2.49	RM	1. HA-N, tonal L-BgM 2. HA-N, atonal I-BgM 3. HA-P, tonal I-BgM 4. HA-P, atonal I-BgM 5. HA-N, tonal I-BgM 6. HA-N, atonal I-BgM 7. Control	EXT; TF	THINK/MATH
82	Proverbio et al. (2015)	54	27	27	n/a	M = 22.277; Range: 18–28	RM	1. HA-P I-BgM (tonal & atonal) 2. HA-N I-BgM (tonal & atonal) 3. Control	GENDER	MEM/REC-NV
83	Ransdell and Gilroy (2001)	45	7	38	nr	undergraduates	RM	1. I-BgM 2. L-BgM 3. Control	MT	LANG/WQ; LANG/WF
84	Reynolds et al. (2014)	70	26	44	n/a	M = 16.71; SD = 0.68	RM	1. Pop L-BgM 2. Control	—	LANG/LG; THINK/IQ; THINK/MATH; THINK/NVR
85	Riby (2013)	17	8	9	n/a	M = 21.1; SD = 4.2	RM	1. I-BgM 1 2. I-BgM 2 3. I-BgM 3 4. I-BgM 4 5. Control	—	ATT/NV
86	Ritter and Ferguson (2017)	155	34	121	n/a	M = 22.30; SD = 5.24	IDP	1. HA-P I-BgM 2. LA-N I-BgM 3. LA-P I-BgM 4. HA-N I-BgM 5. Control	—	THINK/CRT
87	Röer et al. (2014): experiment 1	113	25	88	n/a	M = 23	RM	1. I-BgM 2. Control	—	MEM/SR
88	Röer et al. (2014): experiment 2	123	30	93	n/a	M = 23	RM	1. I-BgM 2. Control	—	MEM/SR
89	Röer et al. (2014): experiment 3	204	50	154	n/a	M = 23	RM	1. I-BgM 2. Control	—	MEM/SR
90	Salamé and Baddeley (1989): experiment 1	24	2	22	n/a	M = 19	RM	1. L-BgM 2. I-BgM 3. Control	—	MEM/SR
91	Salamé and Baddeley (1989): experiment 2	24	24	0	n/a	Range: 25–40	RM	1. L-BgM 2. I-BgM 3. Control	—	MEM/SR
92	Salamé and Baddeley (1989): experiment 3	24	0	24	n/a	Range: 19–22	RM	1. I-BgM 2. Control	—	MEM/SR
93	Daud and Sudirman (2017)	60	20	40	n/a	M = 23; SD = 1.57	RM	1. Classical I-BgM 2. Control	TF	MEM/I-PAL-NV
94	Shih et al. (2009)	32	14	18	n/a	Range: 20–27	IDP	1. I-BgM during task 2. I-BgM before task 3. Control	—	ATT/NV
95	Sogin (1988)	96	nr	nr	n/a	undergraduates	IDP	1. Classical I-BgM 2. Jazz I-BgM 3. Pop L-BgM 4. Silent	—	PS
96	Taylor and Rowe (2012)	128	103	25	n/a	undergraduates	IDP	1. I-BgM 2. Control	—	THINK/MATH
97	Thaut and De l’Etoile (1993)	50	nr	nr	n/a	undergraduates	IDP	1. I-BgM during encoding 2. I-BgM during recall 3. I-BgM during encoding and recall 4. I-BgM before encoding; music before recall 5. Control	—	MEM/D-FR-V
98	Thompson et al. (2012)	16	6	10	n/a	M = 23.9; Range: 19–48	N/A	Control	—	LANG/RC
		25	9	16	Musicians = 12 Nonmusicians = 13	M = 19.7; Range: 17–26	RM	1. Slow; Soft I-BgM 2. Slow; Loud I-BgM 3. Fast; Soft I-BgM 4. Fast; Loud I-BgM	MT	LANG/RC
99	Threadgold et al. (2019): experiment 1	30	15	15	n/a	M = 22; SD = 2.78	RM	1. L-BgM in foreign language 2. Control	TF	THINK/CRT
100	Threadgold et al. (2019): experiment 2	18	6	12	n/a	M = 25; SD = 9.31	RM	1. I-BgM 2. Control	TF	THINK/CRT
101	Threadgold et al. (2019): experiment 3	36	13	23	n/a	M = 24; SD = 8.36	RM	1. L-BgM in familiar language 2. Control	TF	THINK/CRT
102	Verga et al. (2015)	80	40	40	n/a	M = 24.86; SD = 2.62	IDP	1. I-BgM 2. Control	TF	THINK/VR
103	Woo and Kanachi (2006)	21	11	10	n/a	Range: 18–25	RM	1. Classical, loud I-BgM 2. Classical, soft I-BgM 3. FAM, loud BgM 4. FAM, soft BgM 5. Rock, loud L-BgM 6. Rock, soft L-BgM 7. Control	—	MEM/I-FR-V
104	Wolf and Weiner (1972)	15	0	15	n/a	college students	RM	1. BgM 2. Control	—	THINK/MATH
105	Wolfe (1983)	200	110	90	n/a	M = 20.85	IDP	1. Soft I-BgM 2. Moderately loud I-BgM 3. Loud I-BgM 4. Control	—	THINK/MATH
106	Wu and Shih (2019)	103	24	79	Musicians = 56 Nonmusicians = 47	Musicians: M = 22.90; SD = 2.21 Nonmusicians: M = 23; SD = 2.39	RM	1. BgM 2. Control	MT	ATT/NV
107	Xiao et al. (2020)	26	16	10	n/a	M = 19.5; SD = 1.4	RM	1. Slow I-BgM 2. Medium pace I-BgM 3. Fast I-BgM 4. Control	—	INH/G-NG
108	Zhu et al. (2009)	13	6	7	n/a	M = 22; SD = 1.60	RM	1. Guqin I-BgM 2. Piano I-BgM 3. Control	—	ATT/NV
109	Zhu et al. (2008)	16	8	8	n/a	M = 22.8; SD = 1.34	RM	1. I-BgM 2. Control	—	ATT/NV
Total population size		6246	2009 +	2949 +

Codes [population characteristics]. M = mean; SD = standard deviation

Codes [design]. RM = repeated measures; IDP = independent design (i.e., between-subjects); MIXED = combination of repeated measures and independent design

Codes [experimental conditions]. BgM = background music; I-BgM = instrumental background music; L-BgM = background music with lyrics; HA = high arousal music; LA = low arousal music; P = positive valence music; N = negative valence music; FAM = familiar music; UNFAM = unfamiliar music; LANG = language of lyrics; PREF = preferred music; NON-PREF = non-preferred music

Codes [other variables]. EXT = extraversion level; WMC = working memory capacity; MT = music training; TF = task difficulty

Codes [cognitive domain/tasks]. MEM = memory; LANG = language; THINK = thinking, reasoning, and problem-solving; ATT = attention; INH = inhibition; PS = processing speed; SR = serial recall; I-/D-FR = immediate/delayed free recall; I-/D-PAL = immediate/delayed paired-associates learning; REC = recognition; WM = working memory; RC = reading comprehension; RS = reading speed; LG = linguistic; LL = language learning; WQ = writing quality; WF = writing fluency; MATH = mathematics/arithmetics; VR = verbal reasoning; NVR = non-verbal reasoning; IQ = general IQ tests; CRT = creativity tests; CELL = cell interpretation task; G-NG = Go-NoGo task; STRP = colour Stroop task; V = verbal; NV = non-verbal

Data Classification and Synthesis

Classification

In order to answer our research questions, we have further classified the extracted data by types of cognitive task (and, when this information was available, the level of difficulty of the task), BgM characteristics (e.g., music with lyrics, instrumental music, music conveying positive moods, etc.), and relevant population characteristics (e.g., level of extraversion, participants’ level of music education, etc.).

In relation to the types of cognitive tasks, we first created a two-level taxonomy of cognitive domains and the corresponding tasks within each domain. At the first level, we classified the outcome measures reported by each experiment into one of six cognitive domains: (1) memory; (2) language; (3) thinking, reasoning, and problem-solving; (4) inhibition; (5) attention; and (6) processing speed. Then, within each of these domains, we created a second, more specific classification level that indicates the specific type of cognitive task (e.g., the language domain includes tasks such as reading, writing, language learning, etc.). The two-level taxonomy is depicted in Figure 2 and a detailed description of each cognitive task is included in Appendix D (Table D1). Note as well that, for some experiments, outcome measures were also classified (and the data synthesized) according to their difficulty levels (as reported by the authors in each experiment).

Figure 2.

Taxonomy of the six cognitive domains and the cognitive tasks within each domain.
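
To illustrate how the two-level taxonomy maps onto the task codes used in Table 1 (e.g., THINK/MATH), the following is a minimal sketch that splits a code into its first-level domain and second-level task label. It is not part of the review's own procedure: the function names and the dictionary are hypothetical, and only the domain abbreviations are taken from the table codes.

```python
# Hypothetical helper (not from the review): split a Table 1 task code
# into its first-level domain and second-level task label.
DOMAINS = {
    "MEM": "memory",
    "LANG": "language",
    "THINK": "thinking, reasoning, and problem-solving",
    "INH": "inhibition",
    "ATT": "attention",
    "PS": "processing speed",
}


def split_task_code(code):
    """Return (domain name, task label or None) for a code such as 'THINK/MATH'."""
    domain_abbr, _, task = code.strip().partition("/")
    if domain_abbr not in DOMAINS:
        raise ValueError(f"unknown domain abbreviation: {domain_abbr!r}")
    return DOMAINS[domain_abbr], (task or None)


def split_cell(cell):
    """A table cell may list several codes separated by semicolons."""
    return [split_task_code(part) for part in cell.split(";") if part.strip()]
```

For example, `split_cell("LANG/WQ; LANG/WF")` yields two language-domain entries, one for writing quality and one for writing fluency.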

Concerning the experimental conditions, in addition to synthesizing and contrasting the results of the control condition and any BgM interventions irrespective of music characteristics (i.e., only considering the presence or absence of BgM), where possible, we also grouped and contrasted the outcomes of different experiments based on intervention characteristics related to certain musical features. In most cases, this meant summarizing and contrasting the outcomes of BgM interventions using L-BgM and I-BgM, but we also considered other music characteristics such as tempo, genre, complexity, or emotional character. Furthermore, given the varied ways in which the emotional characteristics of the music were described across studies (e.g., happy, sad, energetic, relaxed, etc.), we standardized this terminology by mapping discrete emotion terms to their respective quadrant in Thayer's arousal-valence emotion plane (Thayer, 1990). For instance, ‘happy’ music would constitute ‘high arousal, positive valence’ (HA-P) music, and ‘sad’ music would be ‘low arousal, negative valence’ (LA-N) music. Note that, as a consequence of considering these music characteristics and conducting multiple comparisons, the results of the same experiment could be used in multiple analyses (e.g., a comparison of BgM vs. silence might simultaneously be tested as L-BgM vs. silence and HA-P vs. silence). In all these cases, appropriate procedures were adopted to adjust the significance of statistical tests for multiple comparisons (as described later).
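
The standardization of emotion terms can be sketched as a simple lookup into the four arousal-valence quadrants. In the sketch below, only the ‘happy’ and ‘sad’ mappings come from the text; the entries for ‘energetic’ and ‘relaxed’, as well as all function and variable names, are illustrative assumptions rather than the mapping actually used in the review.

```python
# Quadrant codes follow the review's abbreviations:
# HA/LA = high/low arousal, P/N = positive/negative valence.
def quadrant_code(high_arousal, positive_valence):
    """Build a quadrant code such as 'HA-P' from two boolean flags."""
    arousal = "HA" if high_arousal else "LA"
    valence = "P" if positive_valence else "N"
    return f"{arousal}-{valence}"


# Only 'happy' and 'sad' are mapped in the text; the other two entries are
# assumptions about where such terms would plausibly fall.
TERM_TO_QUADRANT = {
    "happy": quadrant_code(True, True),      # HA-P (from the text)
    "sad": quadrant_code(False, False),      # LA-N (from the text)
    "energetic": quadrant_code(True, True),  # assumption
    "relaxed": quadrant_code(False, True),   # assumption
}


def standardize(term):
    """Return the quadrant code for an emotion term, or None if it is unmapped."""
    return TERM_TO_QUADRANT.get(term.strip().lower())
```

Returning `None` for unmapped terms keeps unknown descriptors visible for manual review instead of silently assigning them to a quadrant.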

Finally, because some studies included in this review evaluated the moderating effects of various population characteristics and reported outcome measures separately for each subgroup, we also synthesized and analyzed these results separately. In particular, we considered both the mean difference between a BgM and the control condition for each subgroup (e.g., introverts: BgM vs. silence; extraverts: BgM vs. silence) as well as the mean difference between comparable subgroups for each BgM intervention and the control condition (e.g., BgM: introverts vs. extraverts; silence: introverts vs. extraverts). In situations where population characteristic(s) were measured but subgroup outcomes were not available, we did not consider those subgroups in our analysis and focused only on the impact of the experimental conditions as a whole.

All the data extracted and classified using the strategies described above can be found in the Supplementary Material.⁶

Synthesis

Due to the heterogeneity of the interventions, outcomes and study designs, and the lack of necessary data in some articles, we were not able to perform a meta-analysis. Summarizing the effect estimates or combining the p-values was also not feasible, and narrative synthesis is undesirable as it can make it difficult to determine the validity of the findings (Campbell et al., 2020). As such, we employed the Synthesis Without Meta-analysis (SWiM) reporting method (Campbell et al., 2020; McKenzie & Brennan, 2021). Specifically, we synthesized the extracted data using vote counting based (solely) on the direction of effects (i.e., the mean difference between two conditions, irrespective of the p-values, effect estimates and participant sample size) and used a sign test analysis (a form of binomial test in which the probability of a positive rather than a negative direction of effect is set to p = 0.5 under the null hypothesis) to determine whether there is any evidence of a specific effect across comparable studies (see Bushman & Wang, 2009; McKenzie & Brennan, 2021). It is necessary to note that data analysis using vote counting is exploratory in nature and comes with its limitations; therefore, the results derived from this method need to be interpreted with some caution. We discuss this in more detail later in the Limitations section.

To perform vote counting, we first calculated (for each outcome of each experiment) the mean difference between the measured outcomes for any two experimental conditions, population and/or task-related groups (e.g., BgM vs. silence, L-BgM vs. I-BgM, BgM: introverts vs. extraverts, BgM: easy vs. difficult task). Then, based on the sign of this difference (i.e., negative or positive), the directions of the effects were determined and recorded using a standardized binary metric of 0 (negative) and 1 (positive), where 0/1 means decreased/increased performance in the target condition (‘n/a’ was used when the mean difference was 0). For example, when comparing the effect of BgM (the target) to silence (the control), i.e., BgM vs. Silence, if the performance on a particular cognitive task was better in the BgM condition it would be coded as 1; otherwise, it would be coded as 0 (if worse) or ‘n/a’ (if there was no difference). When the measured outcomes were not available in numerical form (e.g., exact mean values), we determined the mean difference based on available graphical data or, when this was not available, we requested the original data from the authors. Directions of effects that could not be identified or obtained through any of the above means were recorded as not available (‘n/a’).
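
The coding rule described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the function names are hypothetical, the `higher_is_better` flag is an assumption added to handle outcomes such as error counts or reaction times, and excluding ‘n/a’ entries from the number of tests is likewise an assumption.

```python
def code_direction(target_mean, comparison_mean, higher_is_better=True):
    """Code the direction of effect for one comparison.

    Returns 1 if performance is better in the target condition, 0 if it is
    worse, and None (i.e., 'n/a') if there is no difference or a mean is missing.
    """
    if target_mean is None or comparison_mean is None:
        return None  # direction could not be identified
    diff = target_mean - comparison_mean
    if diff == 0:
        return None  # a mean difference of 0 is recorded as 'n/a'
    if not higher_is_better:
        diff = -diff  # assumption: lower scores mean better performance
    return 1 if diff > 0 else 0


def count_votes(codes):
    """Return (number of positive signs, number of usable tests).

    Assumption: 'n/a' entries (None) are left out of the test count.
    """
    usable = [c for c in codes if c is not None]
    return sum(usable), len(usable)
```

For instance, a reaction-time outcome of 450 ms under BgM versus 500 ms in silence would be coded as 1 when `higher_is_better=False`, since the faster response reflects better performance in the target condition.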

Once the above process was completed for all experiments and comparisons, we added up the number of positive signs (i.e., all 1s) for all identical comparisons (e.g., all L-BgM vs. Silence comparisons) for each outcome measure (e.g., reading comprehension task), and performed a two-tailed sign test at a significance level of α = 0.05. To control the false discovery rate under multiple testing, the p-values were also adjusted using the Benjamini-Yekutieli False Discovery Rate procedure (B-Y FDR; Benjamini & Yekutieli, 2001).⁷ The confidence intervals (CI) for each intervention effect were calculated using the (SWiM recommended) Wilson's CI formula (Brown et al., 2001; McKenzie & Brennan, 2021). Note that, due to the small number of matched pairs (hereinafter referred to as ‘tests’) used for some of the comparisons (e.g., only 3 tests compared the effects of L-BgM vs. silence on creative task performance), it would not be possible to reach statistical significance in the sign test irrespective of the estimated proportion of positive (or negative) effects. Indeed, as shown in Table 2, the smallest number of tests needed to achieve statistical significance (at p < .05) is 6 (comparisons with fewer than 6 tests will always be inconclusive even if all tests are positive or negative). For this reason, we decided to analyze only samples with at least 9 tests, which, at this minimum sample of 9 tests, requires at most 11.11% (or at least 88.89%) of the tests to have positive signs to reach statistical significance (see Table 2).

Table 2.

A sampling of the number of successes required to achieve statistical significance at α<.05 in a two-tailed sign test for test samples 1 to 20 and 140 to 150.

No. tests	No. successes (≤ or ≥)
	≤		≥
	n	%	n	%
1–5	—	—	—	—
6	0	0.00	6	100.00
7	0	0.00	7	100.00
8	0	0.00	8	100.00
9	1	11.11	8	88.89
10	1	10.00	9	90.00
11	1	9.09	10	90.91
12	2	16.67	10	83.33
13	2	15.38	11	84.62
14	2	14.29	12	85.71
15	3	20.00	12	80.00
16	3	18.75	13	81.25
17	4	23.53	13	76.47
18	4	22.22	14	77.78
19	4	21.05	15	78.95
20	5	25.00	15	75.00
*		…
140	57	40.71	83	59.29
141	58	41.13	83	58.87
142	58	40.85	84	59.15
143	59	41.26	84	58.74
144	59	40.97	85	59.03
145	60	41.38	85	58.62
146	60	41.10	86	58.90
147	61	41.50	86	58.50
148	61	41.22	87	58.78
149	62	41.61	87	58.39
150	62	41.33	88	58.67
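
The thresholds in Table 2 follow directly from the exact two-sided binomial test with p = 0.5. The sketch below is a plain-Python illustration of that calculation (function names are our own, chosen for illustration); it searches for the largest number of successes at which the two-sided p-value is still below α.

```python
from math import comb


def sign_test_p(successes, n):
    """Two-sided exact sign test p-value under H0: P(positive sign) = 0.5."""
    # With p = 0.5 the binomial distribution is symmetric, so the two-sided
    # p-value is twice the smaller tail probability, capped at 1.
    k = min(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def significance_thresholds(n, alpha=0.05):
    """Return (lower, upper) success counts that reach p < alpha, or None.

    A result is significant when successes <= lower or successes >= upper.
    """
    lower = None
    for k in range(n // 2 + 1):
        if sign_test_p(k, n) < alpha:
            lower = k
        else:
            break  # p-values only grow as k approaches n / 2
    return None if lower is None else (lower, n - lower)
```

Under these definitions, `significance_thresholds(5)` returns `None`, `significance_thresholds(6)` returns `(0, 6)`, and `significance_thresholds(9)` returns `(1, 8)`, in line with the corresponding rows of Table 2.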

All the analyses were performed using RStudio version 4.1.0 (RStudio Team, 2021). The sign test analysis was performed using the function “binom.test” (p = 0.5; alternative = “two.sided”). Wilson's confidence interval was computed using the function “scoreci” (conf.level = 0.95) from the package “PropCIs” (Scherer, 2018). Lastly, the B-Y FDR adjustment was performed using the “p.adjust” function (method = “BY”).
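
For readers who wish to check individual table entries without R, the Wilson score interval and the Benjamini-Yekutieli adjustment can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the authors' code: the function names are hypothetical, the interval is the Wilson score interval without continuity correction, and the adjustment follows the step-up form used by R's `p.adjust(method = "BY")`.

```python
from math import sqrt
from statistics import NormalDist


def wilson_ci(successes, n, conf_level=0.95):
    """Wilson score interval (no continuity correction) for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    p_hat = successes / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def by_adjust(p_values):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    n = len(p_values)
    if n == 0:
        return []
    q = sum(1 / i for i in range(1, n + 1))  # harmonic correction factor
    # Walk from the largest p-value to the smallest, keeping a running minimum
    # so that adjusted values stay monotone and never exceed 1.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        running_min = min(running_min, q * n / rank * p_values[idx])
        adjusted[idx] = running_min
    return adjusted
```

As a check, `wilson_ci(12, 59)` gives an interval of roughly 12.04% to 32.27%, which agrees with the interval reported for 12 successes out of 59 tests in Table 7.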

Assessment of Quality

To the best of our knowledge, there is no specific tool to evaluate the quality of studies measuring cognitive performance in general or the impact of music interventions on cognition in particular. Therefore, we decided to develop a new tool, the Music and Cognitive Performance Appraisal Tool (MCPAT; see Appendix C). Following the general recommendations in Whiting et al. (2017), we first adapted a list of potential assessment criteria (one that caters for research in music and cognitive performance) from existing risk of bias assessment tools, namely Cochrane RoB 2.0 (Sterne et al., 2019), ROBINS-I (Sterne, Hernán et al., 2016; Sterne, Higgins et al., 2016), the Mixed Methods Appraisal Tool (MMAT) version 2018 (Hong, Fàbregues et al., 2018; Hong, Pluye et al., 2018), and the checklist for reporting music-based interventions (Robb et al., 2011), and refined them through a series of selection and consensus meetings among our team (the authors). With the first draft of the MCPAT, we then piloted the tool using a subset of the articles sampled in this review, followed by further refinements of the criteria. Any selection conflicts during this process were supervised by a third person.

As a result, the MCPAT offers a quality assessment reference for five domains (each with a variety of assessment criteria): (1) experimental design, (2) music selection and characteristics (i.e., the interventions), (3) cognitive tasks and outcome measures, (4) experimental procedures, and (5) results and data analysis. Each assessment criterion is equipped with a signalling question for reviewers to determine (using the responses ‘Yes’ (Y), ‘No’ (N), ‘Can't tell’ (CT) or ‘Not applicable’ (NA)) whether it was fulfilled by the respective experiment. The complete MCPAT can be found in Appendix C, together with the signalling questions associated with each assessment criterion.

The quality assessment was conducted by the review author (YC), and double-extraction was conducted on 25 (∼26%) of the articles by another author (HKW). Any discrepancies were then resolved by the last author (EC). We then calculated the total number (and the corresponding percentage) of experiments that fulfilled each assessment criterion (i.e., [Y / (Y + N + CT)] * 100%).⁸
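
The percentage formula above is simple arithmetic, sketched here for clarity. The function name is hypothetical, and combining the ‘No’ and ‘Can't tell’ counts into a single argument is an assumption made for convenience; ‘Not applicable’ responses are left out of the denominator.

```python
def percent_fulfilled(yes, no_or_cant_tell):
    """Percentage of experiments fulfilling a criterion: Y / (Y + N + CT) * 100.

    'Not applicable' responses are excluded; returns None when no experiment
    gave a Y, N, or CT response.
    """
    total = yes + no_or_cant_tell
    if total == 0:
        return None
    return 100 * yes / total
```

For example, a criterion with 100 ‘Yes’ responses and 9 ‘No’ or ‘Can't tell’ responses gives `percent_fulfilled(100, 9)`, approximately 91.7, which rounds to 92%.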

Results

In what follows, we provide a summary of the relevant characteristics of our review sample, the quality assessment of individual experiments, the sign test results for each cognitive domain and task (according to music-specific characteristics), and the sign test results for specific population groups.

Study Characteristics

The studies included in this review (mostly published after 2010; cf. Figure 3) include data from 6,246 participants (females: 2,949; males: 2,009; not reported: 1,288), most of whom are young adults (see Table 1, column ‘Age’).

Figure 3.

Trend of publication of included articles.

In relation to cognitive tasks, as can be seen in Figure 4, a variety of tasks have been studied, with the most common relating to the memory, language, and thinking domains. The most frequently studied tasks were reading comprehension (n = 21 experiments), mathematics/arithmetics tasks (n = 15), and serial recall and non-verbal attention (n = 13 each). Notably (cf. Figure 4), only 17 experiments (out of 154) considered the role of task difficulty when investigating the impact of BgM on task performance, and most of them were related to thinking (n = 7).

Figure 4.

Tree-map of the number of experiments sampled for each of the six cognitive domains as well as for task difficulty, and the corresponding cognitive tasks.

Regarding population distributions by cognitive task (see Table 3), there are clear differences in the number of participants tested in different cognitive tasks. Many participants were tested on reading comprehension (1544), mathematics/arithmetics (1391), serial recall (781), non-verbal reasoning (717), and verbal attention (649), but very few were tested on tasks such as immediate and delayed free recall of non-verbal materials (68 each), writing fluency and quality (73 each), and language learning (32).

Table 3.

Population size for each individual cognitive task (i.e., level 2 classification), organized by their respective cognitive domains and then by verbal or non-verbal properties.

Cognitive tasks	Total population	Males^a	Females^a
Memory-related tasks
(verbal)
Immediate free recall	494	152	342
Delayed free recall	409	106 +	253 +
Immediate paired-associates learning	326	81 +	171 +
Delayed paired-associates learning	146	26 +	439
Recognition	280	124	156
Serial recall	781	183 +	489 +
Working memory	96	29	67
(non-verbal)
Immediate free recall	68	23	45
Delayed free recall	68	23	45
Immediate paired-associates learning	121	49	72
Delayed paired-associates learning	—	—	—
Recognition	144	71	73
Language-related tasks
Language learning	32	12	20
Linguistic	733	223 +	248 +
Reading comprehension	1544	409 +	546 +
Reading speed	54	21	33
Writing fluency	73	9	64
Writing quality	73	9	64
Thinking
(verbal)
Verbal reasoning	283	180	103
(non-verbal)
Cell interpretation	34	9	25
Mathematics/arithmetics	1391	428 +	441 +
Non-verbal reasoning	717	224 +	295 +
(others)
Creativity	239	68	171
General IQ test	100	43	57
Attention
Non-verbal	533	198 +	277 +
Verbal	649	281 +	349 +
Inhibition
Colour Stroop	107	10 +	30 +
Go-NoGo	193	98	95
Processing speed
Eye-hand coordination	303	125 +	82 +
Task difficulty	577	187 +	213 +

Note. As some studies evaluated multiple types of cognitive tasks (e.g., abstract reasoning, reading comprehension and mathematics) on the same group of participants, the population and sample size listed above do not add up to the total population (n = 6246, males = 2009; females = 2949; not reported = 1288) of the review.

^a Plus signs ( + ) indicate the presence of missing data on gender distribution.

Regarding experiments that specifically evaluated the interactions between individual characteristics and BgM (see Table 4), the personality trait level of extraversion (i.e., introverts, extraverts) was the variable tested in the largest number of experiments (n = 31), followed by gender (n = 15), music training (n = 7) and working memory capacity (n = 3). In relation to music characteristics (see Table 5), by far, the largest sample obtainable was that of the absence (I-BgM; n = 103) or presence of lyrics (L-BgM; n = 77). Others included arousal and valence (n = 18), genre (n = 13), tempo (n = 8) and volume (n = 5), complexity of the music (n = 6) and listeners’ preference for the music (n = 7), to name a few.

Table 4.

Population and sample sizes (total and by respective cognitive tasks) for each subgroup: personality traits, gender, music education, working memory capacity, and task difficulty.

Subgroup	Total population	Population size by subgroups^a	No. experiments	No. experiments for each cognitive task
Extraversion	799	Introverts: 255 +; Extraverts: 268 +	31	Reading comprehension = 8; Math = 4; Non-verbal reasoning = 3; Imm. free recall (NV) = 2; Imm. free recall (V) = 2; Del. free recall (NV) = 2; Language learning = 2; Del. free recall (V) = 1; Serial recall = 1; Imm. PAL (V) = 1; Imm. PAL (NV) = 1; Verbal reasoning = 1; Non-verbal attention = 1; Colour Stroop = 1; Eye-hand coordination = 1
Gender^b	981	Males: 374 +; Females: 339 +	15	Math = 3; Non-verbal reasoning = 2; Linguistics = 2; Reading comprehension = 2; Verbal attention = 2; Recognition (V) = 1; Recognition (NV) = 1; Non-verbal attention = 1; Go/NoGo = 1
Music training	368	Musicians: 157 +; Nonmusicians: 150 +	7	Reading comprehension = 3; Non-verbal attention = 2; Writing quality = 1; Verbal attention = 1
Working memory capacity	219	High WMC: 42 +; Low WMC: 39 +	3	Reading comprehension = 2; Math = 1

Code. Imm. = immediate; Del. = delayed; V = verbal; NV = non-verbal; PAL = paired-associates learning

^a Plus signs ( + ) indicate the presence of missing data in the respective population distributions.

^b Population characteristics in relation to gender were extracted only from studies that reported summary statistics in relation to gender.

Table 5.

Sample of each type of BgM condition, organized by the respective cognitive task.

Interventions / Cognitive tasks	BgM	I-BgM	L-BgM	A&V	GEN	PREF	TEM	VOL	LAN	TONAL	COMP	FAM	INST	CONT
Memory
(verbal)
Immediate free recall	8	7	3	4	1	—	—	1	—	—	—	—	—	—
Delayed free recall	5	4	2	2	—	—	—	—	—	—	—	—	—	1
Immediate paired-associates learning	6	5	1	1	—	—	1	—	1	—	1	—	—	—
Delayed paired-associates learning	4	3	1	—	—	—	—	—	1	—	—	—	—	—
Recognition	5	4	2	1	—	—	—	—	1	—	—	1	—	—
Serial recall	13	10	7	—	—	2	—	—	—	—	—	—	—	—
Working memory (auditory)	1	1	—	—	—	—	—	—	—	1	—	—	—	—
Working memory (visual)	1	1	—	—	—	—	—	—	—	1	—	—	—	—
(non-verbal)
Immediate free recall	2	—	2	—	—	—	—	—	—	—	1	—	—	—
Delayed free recall	2	—	2	—	—	—	—	—	—	—	1	—	—	—
Immediate paired-associates learning	2	2	1	—	—	—	—	—	—	—	—	—	—	—
Delayed paired-associates learning	0	—	—	—	—	—	—	—	—	—	—	—	—	—
Recognition	4	4	1	1	—	—	—	—	—	—	—	—	—	—
Language
Language learning	1	1	—	—	—	—	—	—	—	—	—	—	—	—
Linguistics	7	5	4	2	1	—	—	—	—	—	—	—	—	—
Reading comprehension	21	12	15	—	3	2	1	1	1	—	1	1	—	—
Reading speed	2	1	—	—	—	—	—	—	—	—	—	—	—	—
Writing fluency	2	1	2	—	—	—	—	—	—	—	—	—	—	—
Writing quality	2	1	2	—	—	—	—	—	—	—	—	—	—	—
Thinking
(verbal)
Verbal reasoning	3	3	2	—	—	—	—	—	—	—	—	—	—	—
(non-verbal)
Cell interpretation	1	—	—	—	—	1	—	—	—	—	—	—	—	—
Mathematics/arithmetics	15	9	10	1	3	—	2	2	1	—	—	1	—	—
Non-verbal reasoning	9	5	7	—	1	—	—	—	—	—	1	—	—	—
(others)
Creativity	4	2	2	—	—	—	—	—	2	—	—	—	—	—
IQ tests	2	1	1	—	—	—	—	—	—	—	—	—	—	—
Attention
Non-verbal	13	9	2	1	2	—	1	—	—	—	—	1	1	—
Verbal	7	5	3	1	1	—	1	1	—	—	1	—	—	—
Inhibition
Colour Stroop	3	1	1	1	—	1	—	—	—	1	—	—	—	—
Go/NoGo	6	3	2	2	—	1	2	—	—	—	—	—	—	—
Processing speed	3	3	2	1	1	—	—	—	—	—	—	—	—	—
Total experiments	154	103	77	18	13	7	8	5	7	3	6	4	1	1

Codes. L-BgM = background music with lyrics; I-BgM = instrumental background music; A&V = arousal and valence; GEN = genre; PREF = preference; TEM = tempo; VOL = volume; LAN = language of lyrics; TONAL = tonality/harmonicity; COMP = complexity; FAM = familiarity; INST = type of musical instrument; CONT = context

Quality Assessment

Quality assessment was conducted for each empirical study⁹ rather than for each article. An overview is shown in Table 6.

Table 6.

Quality assessment criteria rating for each individual experiment.

No.	Authors (Year)	1.1	1.2	1.3	1.4	1.5	2.1	2.2	3.1	3.2	3.3	4.1	4.2	4.3	5.1	5.2	5.3	5.4
1	Alley and Greene (2008)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	CT	Y	N	N
2	Amezcua et al. (2005)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
3	Angel et al. (2010)	Y	Y	CT	CT	Y	Y	Y	Y	Y	NA	Y	CT	CT	Y	N	N	N
4	Avila et al. (2011)	Y	Y	Y	NA	NA	Y	Y	CT	Y	NA	Y	N	CT	Y	Y	N	N
5	Beauchene et al. (2016)	Y	Y	Y	NA	NA	CT	CT	Y	Y	NA	Y	Y	Y	Y	Y	N	N
6	Begum et al. (2019)	Y	Y	CT	CT	Y	CT	CT	Y	Y	NA	Y	N	N	Y	Y	N	N
7	Bonin and Smilek (2016): study 1	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	N	CT	CT	Y	N	N
8	Bonin and Smilek (2016): study 2	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	N	CT	CT	Y	N	N
9	Bottiroli et al. (2014)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	CT	Y	Y	N	N
10	Boyle and Coltheart (1996): study 1	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
11	Boyle and Coltheart (1996): study 2	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
12	Burkhard et al. (2018)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
13	Burton (1986)	Y	Y	CT	N	Y	Y	Y	N	Y	NA	N	N	N	Y	Y	N	N
14	Cassidy and MacDonald (2007)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	CT	N
15	Cauchard et al. (2012)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
16	Chew et al. (2016)	Y	Y	CT	CT	Y	Y	Y	Y	Y	NA	Y	Y	Y	Y	N	N	N
17	Cho (2015)	Y	Y	N	N	CT	Y	Y	Y	Y	CT	Y	CT	CT	N	Y	Y	N
18	Chou (2010)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	Y	Y	Y	N
19	Christopher and Shelton (2017)	Y	Y	Y	NA	NA	Y	Y	CT	Y	NA	Y	N	CT	Y	Y	Y	Y
20	Cockerton et al. (1997)	Y	Y	Y	NA	NA	Y	Y	N	Y	NA	CT	CT	CT	CT	Y	N	N
21	Crawford and Strapp (1994)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	CT	Y	N	N
22	Crust et al. (2004)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
23	Daoussis and McKelvie (1986)	Y	Y	CT	N	Y	Y	Y	Y	Y	NA	Y	CT	CT	Y	Y	N	N
24	Darrow et al. (2006)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	Y	Y	N	N
25	Davenport (1972)	CT	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	Y	Y	N	N
26	De Groot and Smedinga (2014)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
27	De Groot (2006)	Y	Y	CT	Y	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	Y
28	Deng and Wu (2020)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	CT	Y	Y	N	N
29	Doyle and Furnham (2012)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	CT	Y	Y	N	N
30	Echaide et al. (2019): study 1	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
31	Echaide et al. (2019): study 2	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
32	Evered et al. (2018)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
33	Feizpour et al. (2020)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
34	Fernandez et al. (2019)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	N	N	N
35	Ferreri et al. (2015)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
36	Ferreri et al. (2013)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
37	Ferreri et al. (2014)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	Y	N
38	Fontaine and Schwalm (1979)	CT	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	N	CT	Y	N	N	N
39	Furnham and Allass (1999)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
40	Furnham and Bradley (1997)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
41	Furnham and Strbac (2002)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
42	Furnham et al. (1999)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	Y	Y	N	N
43	Geethanjali et al. (2016a)	Y	Y	N	Y	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	Y	N
44	Geethanjali et al. (2016b)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	Y	Y	N	N
45	Gonzalez and Aiello (2019)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	N	N	N
46	Herath (2018)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	N	N	N	N
47	Ho et al. (2006)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	N	N	N
48	Huang and Shih (2011)	Y	Y	Y	NA	NA	CT	N	Y	Y	NA	Y	CT	N	CT	Y	N	N
49	Iwanaga and Ito (2002)	Y	Y	CT	CT	CT	Y	Y	Y	Y	NA	Y	Y	CT	Y	Y	N	N
50	Jäncke et al. (2014)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	CT	N	N
51	Jaušovec and Habe (2004)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	N	N	N
52	Johansson et al. (2012)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
53	Kang and Williamson (2014)	Y	Y	Y	NA	NA	Y	Y	Y	Y	CT	Y	Y	Y	Y	Y	N	N
54	Kou et al. (2018)	Y	Y	Y	NA	NA	Y	Y	N	Y	NA	Y	CT	CT	Y	Y	Y	N
55	Küssner et al. (2016): study 1	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	Y	N
56	Küssner et al. (2016): study 2	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	N	Y	N
57	Lehmann and Seufert (2017)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	Y	CT	Y	N	Y
58	Liu et al. (2012)	Y	Y	CT	CT	CT	Y	Y	Y	Y	NA	Y	Y	Y	Y	CT	N	N
59	Mammarella et al. (2007)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	Y	N
60	Mansour et al. (2017)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	CT	Y	Y	N	N
61	Mansouri et al. (2016)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
62	Manthei and Kelly (1999)	Y	Y	N	CT	CT	Y	Y	N	Y	NA	Y	Y	Y	Y	Y	Y	N
63	Martin et al. (1988): study 1	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	CT	Y	Y	N	N
64	Martin et al. (1988): study 2	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	CT	Y	Y	N	N
65	Masataka and Perlovsky (2013)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	Y	N
66	Mayfield and Moss (1989): study 1	CT	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	Y	N	N	N
67	Mayfield and Moss (1989): study 2	CT	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	CT	Y	N	N
68	Miller and Schyb (1989)	CT	Y	CT	Y	NA	Y	Y	CT	Y	NA	Y	Y	CT	Y	Y	N	N
69	Nguyen and Grahn (2017): study 1	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	N	N	N
70	Nguyen and Grahn (2017): study 2	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
71	Nguyen and Grahn (2017): study 3	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	N	N	N
72	Nittono (1997)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
73	Parente (1976)	CT	Y	Y	NA	NA	N	Y	N	Y	NA	Y	N	N	Y	Y	N	N
74	Patston and Tippett (2011)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	Y	Y	N	N
75	Pavlygina et al. (2010)	Y	Y	CT	CT	CT	Y	Y	Y	Y	NA	Y	CT	CT	Y	CT	N	N
76	Pavlyugina et al. (2012)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	Y	Y	N	N
77	Perham and Sykora (2012)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	N	N	Y	Y	N	N
78	Perham and Vizard (2011)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	CT	Y	CT	N	N
79	Perham and Currie (2014)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	N	N	Y	Y	N	N
80	Proverbio and De Benedetto (2018)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
81	Proverbio et al. (2018b)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	Y	N	N	N
82	Proverbio et al. (2015)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
83	Ransdell and Gilroy (2001)	CT	Y	N	CT	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	N	N	N
84	Reynolds et al. (2014)	Y	Y	Y	NA	NA	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	N	N
85	Riby (2013)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	CT	N	N	N
86	Ritter and Ferguson (2017)	Y	Y	Y	NA	NA	Y	Y	Y	Y	Y	Y	Y	Y	Y	N	N	N
87	Röer et al. (2014): study 1	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
88	Röer et al. (2014): study 2	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
89	Röer et al. (2014): study 3	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
90	Salamé and Baddeley (1989): study 1	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
91	Salamé and Baddeley (1989): study 2	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
92	Salamé and Baddeley (1989): study 3	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
93	Daud and Sudirman (2017)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	CT	CT	Y	N	N
94	Shih et al. (2009)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	CT	Y	N	N
95	Sogin (1988)	CT	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
96	Taylor and Rowe (2012)	Y	Y	CT	Y	NA	Y	Y	Y	Y	NA	Y	CT	CT	Y	Y	Y	N
97	Thaut and De l’Etoile (1993)	CT	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
98	Thompson et al. (2012)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
99	Threadgold et al. (2019): study 1	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	CT	Y	Y	Y	Y
100	Threadgold et al. (2019): study 2	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	CT	Y	Y	Y	Y
101	Threadgold et al. (2019): study 3	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	CT	Y	Y	Y	Y
102	Verga et al. (2015)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
103	Woo and Kanachi (2006)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	CT	Y	N	N
104	Wolf and Weiner (1972)	Y	Y	Y	NA	NA	CT	CT	Y	Y	NA	Y	CT	CT	Y	Y	N	N
105	Wolfe (1983)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	CT	Y	N	N
106	Wu and Shih (2019)	Y	Y	Y	NA	NA	N	N	Y	Y	NA	Y	CT	CT	CT	Y	Y	N
107	Xiao et al. (2020)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	N	N
108	Zhu et al. (2009)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	CT	Y	Y	Y	N	N
109	Zhu et al. (2008)	Y	Y	Y	NA	NA	Y	Y	Y	Y	NA	Y	Y	Y	Y	Y	Y	N
	Y (n)	100	109	94	10^b		103	104	101	109	3	107	75	65	93	89	17	6
	N/CT (n)	9	0	15	5^c		6	5	8	0	2	2	34	44	16	20	92	103
	NA (n)	—	—	—	94^d		—	—	—	—	104	—	—	—	—	—	—	—
	Y (%) ^a	92	100	86	67		94	95	93	100	60	98	69	60	85	82	16	6

Codes. Y = yes; N = no; CT = can't tell; NA = not applicable

1.1 = sampling criteria; 1.2 = control condition; 1.3 = extraneous variables; 1.4 = baseline comparability; 1.5 = internal validity; 2.1 = music selection rationale; 2.2 = compliance to hypothesis; 3.1 = task description; 3.2 = outcome measures; 3.3 = objectivity of assessors; 4.1 = timing of delivery; 4.2 = intervention strategies; 4.3 = replicability; 5.1 = analyses appropriateness; 5.2 = reporting (direction of effect); 5.3 = reporting (statistical significance); 5.4 = reporting (effect magnitude).

Note. The ‘Y’ response is a sign of good quality.

^a Y (%) = [n_Y / (n_Y + n_N/CT)] * 100% (i.e., NA responses were discarded from the calculation).

^b Outcome (n) if either criterion 1.4 or criterion 1.5 is ‘Y’.

^c Outcome (n) if criteria 1.4 and 1.5 are both either ‘N’ or ‘CT’.

^d Outcome (n) if and only if criteria 1.4 and 1.5 are both ‘NA’.

As can be seen, the majority of the studies (94%) did not report effect sizes, and only 16% reported exact significance values (i.e., p-values) for all the significant and non-significant results.

At the other extreme, all studies included a control condition (naturally, as this was an inclusion criterion in this review) and employed adequate outcome measures to quantify performance on the various cognitive tasks reported. Additionally, most studies contained a clear description of the characteristics of their sampled population (92%) and had a clear rationale for the music selection process (94%) that was aligned with the hypothesis of the experiment (95%). Generally, there were also clear descriptions regarding the timing of delivery of the BgM interventions (98%), how the length of the BgM was accommodated to the length of the cognitive task (69%), clear descriptions of the cognitive tasks being studied and how they were executed (93%), and clear justifications for the chosen statistical analyses (85%). Most studies (86%) also had adequate measures in place (e.g., random allocation, counterbalancing, etc.) to control for extraneous variables. Of the 14% that did not perform or report such measures, 67% had adequate alternative measures in place to ensure homogeneity among participants (e.g., measures of working memory capacity at baseline).

Various other criteria were met less consistently (or were not clearly described) across studies, namely the existence of an independent outcome assessor for outcome measures that require subjective rating (60%), and the inclusion of sufficient information about the experiments to allow a full replication (60%).

Impact of BgM on Different Cognitive Domains

The sign test analyses showed significant effects of BgM (of particular features) on cognitive performance for two cognitive domains—memory (Table 7) and language (Table 8). There are also significant effects associated with task difficulty (Table 9). The full vote counting and sign test analyses for the other cognitive domains are included in Appendix D (Tables D2-D5). Note that, as mentioned before, some tests were not conducted due to small sample sizes (and therefore comparisons are not included in the tables).

Table 7.

Sign test results of BgM and memory-related tasks, performed on RStudio version 4.1.0, using the function ‘binom.test’ (p = 0.5, alternative=‘two.sided’).

Tests	Comparisons	No. of successes^a (n)	Total tests (n)	Proportion (%)	p	Adjusted p^b	95% CI^c lower (%)	95% CI^c upper (%)
All memory tasks
M1	BgM vs. S	62	152	40.79	0.028	0.210	33.30	48.74
M2	**L-BgM vs. S	12	59	20.34	0.000	0.000	12.04	32.27
M3	I-BgM vs. S	50	91	54.95	0.402	1.000	44.73	64.76
M4	*L-BgM vs. I-BgM	0	12	0.00	0.000	0.009	0.00	24.25
M5	HA-N vs. S	4	14	28.57	0.180	1.000	3.01	56.35
M6	LA-P vs. S	5	12	41.67	0.774	1.000	19.33	68.05
M7	HA vs. S	3	16	18.75	0.021	0.198	6.59	43.01
M8	LA vs. S	7	17	41.18	0.629	1.000	21.61	63.99
M9	HA vs. LA	3	17	17.65	0.013	0.158	6.19	41.03
M10	P vs. S	7	19	36.84	0.359	1.000	19.15	58.96
M11	N vs. S	7	20	35.00	0.263	1.000	18.12	56.71
M12	P vs. N	11	21	52.38	1.000	1.000	32.37	71.66
Serial recall
M13	**BgM vs. S	1	22	4.55	0.000	6.03E-05	0.80	21.80
M14	*L-BgM vs. S	0	11	0.00	0.001	0.003	0.00	25.90
M15	*I-BgM vs. S	1	11	9.09	0.012	0.021	1.60	37.70
Immediate free recall (verbal materials)
M16	BgM vs. S	8	25	32.00	0.108	0.296	17.21	51.59
M17	L-BgM vs. S	1	11	9.09	0.012	0.064	1.60	37.70
M18	I-BgM vs. S	7	12	58.33	0.774	1.000	31.95	80.67
Delayed free recall (verbal materials)
M19	BgM vs. S	8	13	61.54	0.581	—	35.52	82.29
Immediate paired-associates learning (verbal-verbal)
M20	*BgM vs. S	22	29	75.86	0.008	0.012	57.90	87.80
M21	*I-BgM vs. S	21	25	84.00	0.001	0.003	65.40	93.60
Delayed paired-associates learning (verbal-verbal)
M22	BgM vs. S	8	12	66.67	0.388	0.582	39.06	86.19
M23	I-BgM vs. S	8	10	80.00	0.109	0.328	55.20	94.33
Recognition (verbal materials)
M24	*BgM vs. S	2	20	10.00	0.000	0.002	2.80	30.10
M25	*L-BgM vs. S	0	9	0.00	0.004	0.011	0.00	29.90
M26	I-BgM vs. S	2	11	18.18	0.065	0.120	5.10	47.70

Codes. BgM = background music; L-BgM = background music with lyrics; I-BgM = instrumental background music; S = silence; HA = high arousing music; LA = low arousing music; P = positive valence music; N = negative valence music

* p < .05 ** p < .001

^a Successes in favour of the first listed condition. For example, the number of successes for a comparison L-BgM vs. I-BgM would correspond to the successes in favour of L-BgM.

^b p-values corrected for multiple comparisons, based on the Benjamini-Yekutieli False Discovery Rate.

^c Confidence intervals are calculated based on Wilson's CI.
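
As an illustration of how a single row of Table 7 can be recomputed, the following is a minimal sketch of an exact two-sided sign test (against a null proportion of 0.5) and a Wilson score interval. The analyses reported here were run with the R function ‘binom.test’; the sketch is an independent re-implementation using only the Python standard library, and the function names (sign_test_p, wilson_ci) are hypothetical. It assumes a Wilson interval without continuity correction, which is an assumption rather than something stated in the table notes.

```python
from math import comb, sqrt


def sign_test_p(successes, total):
    """Exact two-sided binomial test against p = 0.5.

    Because the null distribution is symmetric, the two-sided p-value
    is twice the smaller tail probability, capped at 1.
    """
    k = min(successes, total - successes)
    tail = sum(comb(total, i) for i in range(k + 1)) / 2 ** total
    return min(1.0, 2 * tail)


def wilson_ci(successes, total, z=1.959964):
    """Wilson score interval (no continuity correction), as proportions."""
    p_hat = successes / total
    denom = 1 + z * z / total
    centre = (p_hat + z * z / (2 * total)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / total
                    + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Test M7 in Table 7 (HA vs. S): 3 successes out of 16 tests.
p_m7 = sign_test_p(3, 16)
lo_m7, hi_m7 = wilson_ci(3, 16)
print(f"p = {p_m7:.3f}, 95% CI = {100 * lo_m7:.2f}% to {100 * hi_m7:.2f}%")
```

Under these assumptions, the values for Test M7 agree with the tabulated unadjusted p-value and confidence limits to the reported precision; the adjusted p-values additionally depend on which tests are corrected together.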

Table 8.

Sign test results of BgM and language-related tasks, performed on RStudio version 4.1.0, using the function ‘binom.test’ (p = 0.5, alternative = ‘two.sided’).

Tests	Comparisons	No. of successes^a (n)	Total tests (n)	Proportion (%)	p	Adjusted p^b	95% CI^c lower (%)	95% CI^c upper (%)
All language tasks
L1	BgM vs. S	43	106	40.57	0.064	0.474	31.71	50.08
L2	L-BgM vs. S	17	50	34.00	0.033	0.474	22.44	47.85
L3	I-BgM vs. S	26	54	48.15	0.892	1.000	35.39	61.15
L4	L-BgM vs. I-BgM	9	21	42.86	0.664	1.000	24.47	63.45
L5	Hip-hop/Pop vs. S	6	13	46.15	1.000	1.000	23.21	70.86
L6	Classical vs. Hip-hop/Pop	4	10	40.00	0.754	1.000	16.82	68.73
Reading comprehension
L7	*BgM vs. S	22	70	31.43	0.003	0.021	21.80	43.00
L8	*L-BgM vs. S	9	34	26.47	0.009	0.038	14.60	43.10
L9	I-BgM vs. S	13	34	38.24	0.230	0.638	23.90	55.00
L10	L-BgM vs. I-BgM	5	11	45.45	1.000	1.000	21.30	72.00
Reading speed
L11	*BgM vs. S	2	13	15.38	0.022	—	4.30	42.20
Linguistic
L12	BgM vs. S	14	29	48.28	1.000	1.000	31.40	65.60
L13	L-BgM vs. S	7	15	46.67	1.000	1.000	24.80	69.90
L14	I-BgM vs. S	7	14	50.00	1.000	1.000	26.80	73.20
L15	L-BgM vs. I-BgM	4	10	40.00	0.754	1.000	16.80	68.70

Codes. BgM = background music; L-BgM = background music with lyrics; I-BgM = instrumental background music; S = silence

* p < .05

^a Successes in favor of the first listed condition. For example, the number of successes for a comparison L-BgM vs. I-BgM would correspond to the successes in favor of L-BgM.

^b p-values corrected for multiple comparisons, based on the Benjamini-Yekutieli False Discovery Rate.

^c Confidence intervals are calculated based on Wilson's CI.
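
To make the multiple-comparison correction concrete, the sketch below applies a Benjamini-Yekutieli adjustment to the four reading comprehension tests in Table 8 (L7 to L10), starting from their success counts. It is a hedged illustration, not the authors' code: the function names are hypothetical, the exact sign test is re-implemented with the Python standard library, and treating these four tests as a single correction family is an assumption made for the example.

```python
from math import comb


def sign_test_p(successes, total):
    """Exact two-sided binomial test against p = 0.5."""
    k = min(successes, total - successes)
    tail = sum(comb(total, i) for i in range(k + 1)) / 2 ** total
    return min(1.0, 2 * tail)


def benjamini_yekutieli(p_values):
    """Return Benjamini-Yekutieli adjusted p-values in the input order."""
    m = len(p_values)
    c_m = sum(1 / i for i in range(1, m + 1))  # harmonic correction factor
    # Walk from the largest p-value to the smallest, keeping a running
    # minimum so that adjusted values never decrease with rank.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


# (successes, total tests) for Tests L7-L10 in Table 8.
# Assumption: these four tests form one correction family.
counts = {"L7": (22, 70), "L8": (9, 34), "L9": (13, 34), "L10": (5, 11)}
raw = [sign_test_p(s, n) for s, n in counts.values()]
adj = benjamini_yekutieli(raw)
for name, p, q in zip(counts, raw, adj):
    print(f"{name}: p = {p:.3f}, adjusted p = {q:.3f}")
```

With this grouping, the adjusted values should agree with the Adjusted p column for L7 to L10 to within rounding. Because each raw p-value is multiplied by a factor that grows with the number of tests in the family, the same raw p-value can fall on either side of the .05 threshold depending on how many comparisons are corrected together.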

Table 9.

Sign test results of BgM and cognitive performances by task difficulty, performed on RStudio version 4.1.0, using the function ‘binom.test’ (p = 0.5, alternative = ‘two.sided’).

Tests	Comparisons	No. of successes^a (n)	Total tests (n)	Proportion (%)	p	Adjusted p^b	95% CI^c lower (%)	95% CI^c upper (%)
Difficult
TF1	BgM vs. S	11	24	45.83	0.839	1.000	27.90	64.90
TF2	I-BgM vs. S	10	18	55.56	0.815	1.000	33.70	75.40
Easy
TF3	BgM vs. S	11	24	45.83	0.839	1.000	27.90	64.90
TF4	I-BgM vs. S	8	17	47.06	1.000	1.000	26.20	69.00
BgM (difficult–easy) ^c
TF5	*BgM (difficult - easy)	6	24	25.00	0.023	0.046	12.00	44.90
TF6	*I-BgM (difficult - easy)	4	18	22.22	0.031	0.046	9.00	45.20
Silence (difficult–easy) ^c
TF7	S (difficult - easy)	4	14	28.57	0.180	—	11.70	54.70

Codes. BgM = background music; I-BgM = instrumental background music; S = silence

* p < .05

Successes in favour of the first listed condition. For example, the number of successes for a comparison L-BgM vs. I-BgM would correspond to the successes in favour of L-BgM. In the case of subgroup comparisons, the number of successes will be successes in favour of the first listed subgroup, e.g., in a comparison of BgM (males - females), it would be successes in favour of males in the BgM condition (when compared to females in BgM condition).

p-values corrected for multiple comparisons, based on the Benjamini-Yekutieli False Discovery Rate.

Confidence intervals are calculated based on Wilson's CI.

Memory

Across the 152 tests comparing BgM with silence on memory tasks, we found no evidence, after correction for multiple comparisons, that BgM (irrespective of its characteristics) either hinders or benefits performance in the memory domain (adjusted p = .210; cf. Table 7, Test M1). Nonetheless, when the music had lyrics, performance was significantly worse than in silence (p < .001; cf. Table 7, Test M2) and worse than with instrumental music (p = .009; cf. Table 7, Test M4). Performance with instrumental music did not differ from that in silence (cf. Table 7, Test M3).

When looking at individual memory tasks, we found some task-specific effects. Compared to silence, there was a strong detrimental effect of BgM (p < .001), L-BgM (p = .003) and I-BgM (p = .021) on serial recall task performance (cf. Table 7, Tests M13-M15). There was also a significant detrimental effect of L-BgM (compared to silence) on recognition memory for verbal materials (p = .011; cf. Table 7, Test M25). The only positive effect we found concerns immediate paired-associates learning of verbal materials, which benefited from I-BgM compared to silence (p = .003; cf. Table 7, Test M21).

No significant results were found for any other types of memory tasks, either because the sign tests were non-significant (sometimes only after correction for multiple comparisons) or because the analyses were not conducted due to the low number of available tests.

Language

Across the 106 tests comparing BgM with silence on language-related tasks, we found no evidence that BgM either hinders or benefits performance irrespective of task (cf. Table 8, Test L1). Nonetheless, the analyses of individual tasks revealed two significant effects. First, BgM (compared to silence) hindered reading comprehension (p = .021; cf. Table 8, Test L7), which seems to be associated with the detrimental effect of L-BgM (p = .038; cf. Table 8, Test L8). Second, BgM (compared to silence) slowed down reading speed (p = .022; cf. Table 8, Test L11).

No significant results were found for any other language-related tasks, either because the sign tests were non-significant or because the analyses were not conducted due to the low number of available tests.

Task Difficulty

Across the 24 tests that compared the impact of BgM on tasks varying in difficulty level (irrespective of the cognitive domain and types of tasks), performance on difficult tasks did not differ with or without BgM (cf. Table 9, Tests TF1-TF2), and the same was true for easy tasks (cf. Table 9, Tests TF3-TF4). However, when we directly compared performance between difficult and easy tasks, we found no significant difference in silence (cf. Table 9, Test TF7), whereas in the presence of BgM, performance in difficult tasks was significantly poorer than performance in easy tasks (p = .023, adjusted p = .046; cf. Table 9, Test TF5). We found a similar effect when we tested only instrumental music (p = .031, adjusted p = .046; cf. Table 9, Test TF6). The analysis of L-BgM was not conducted due to the low number of available tests.

The Contributions of Individual Characteristics

Regarding the contribution of individual listeners’ characteristics to the effect of BgM on cognitive task performance, only the level of extraversion yielded statistically significant effects. As can be seen in Table 10, compared to silence, introverts’ performance across all types of cognitive tasks was significantly poorer in the presence of BgM in general (p = .004; cf. Table 10, Test EXT3) and L-BgM in particular (p = .004; cf. Table 10, Test EXT4), whereas the performance of extraverts was not affected by the presence of music (cf. Table 10, Tests EXT1-EXT2). We also directly compared the performance of introverts and extraverts in both BgM and silent conditions. Interestingly, our results show that, compared to extraverts, introverts performed significantly better in the silent condition (p = .004; cf. Table 10, Test EXT7), but this advantage disappeared in the presence of BgM (cf. Table 10, Tests EXT5-EXT6).

Table 10.

Sign test results of BgM and cognitive performances by level of extraversion, performed on RStudio version 4.1.0, using the function ‘binom.test’ (p = 0.5, alternative = ‘two.sided’).

Tests	Comparisons	No. of successes^a (n)	Total tests (n)	Proportion (%)	p	Adjusted p^b	95% CI^c lower (%)	95% CI^c upper (%)
Extraverts
EXT1	BgM vs. S	18	33	54.55	0.728	1.000	38.00	70.20
EXT2	L-BgM vs. S	13	26	50.00	1.000	1.000	32.10	67.90
Introverts
EXT3	*BgM vs. S	8	34	23.53	0.003	0.004	12.40	40.00
EXT4	*L-BgM vs. S	5	27	18.52	0.002	0.004	8.20	36.70
BgM (Introverts–Extraverts) ^c
EXT5	BgM (introverts–extraverts)	10	33	30.30	0.035	0.053	17.40	47.30
EXT6	L-BgM (introverts–extraverts)	7	26	26.92	0.029	0.053	13.70	46.10
Silence (Introverts–Extraverts) ^c
EXT7	*S (introverts–extraverts)	18	22	81.82	0.004	—	61.50	92.70

Codes. BgM = background music; L-BgM = background music with lyrics; I-BgM = instrumental background music; S = silence; EXT = extraverts; INT = introverts

* p < .05

Successes in favour of the first listed condition. For example, the number of successes for a comparison L-BgM vs. I-BgM would correspond to the successes in favour of L-BgM. In the case of subgroup comparisons, the number of successes will be successes in favour of the first listed subgroup, e.g., in a comparison of BgM (males–females), it would be successes in favour of males in the BgM condition (when compared to females in BgM condition).

p-values corrected for multiple comparisons, based on the Benjamini-Yekutieli False Discovery Rate.

Confidence intervals are calculated based on Wilson's CI.

No significant effects were found for gender or music training (all sign tests were non-significant), and no analysis was conducted for working memory capacity due to the low number of available tests. The full results of the vote counting and sign test analyses for these factors are included in Appendix D (Tables D6-D7).

Discussion

The aim of this systematic review was to evaluate the effect of BgM on cognitive task performance and to provide clarity to the findings in this field. Building upon the findings and limitations of previous reviews, we adopted a task-specific approach towards evaluating BgM's effect on specific cognitive domains and tasks. To this end, we devised a taxonomy to classify the cognitive tasks reported in 95 empirical articles (154 experiments) into one of six cognitive domains: (1) memory; (2) language; (3) thinking, reasoning, and problem-solving; (4) inhibition; (5) attention; and (6) processing speed. Within each domain, we then performed task-specific analyses on each type of cognitive task (e.g., within the memory domain: serial recall, free recall, recognition, etc.) and, where the data were available, analyses according to the difficulty levels of the cognitive tasks (as reported in each experiment). We also adopted a music-specific approach to the task-relevant analyses, and identified 13 music characteristics: the presence/absence of music (i.e., BgM vs. silence) and, where the information was available, the presence (L-BgM) and absence (I-BgM) of lyrics in the music, arousal and valence, genre, tempo, volume, language of lyrics, tonality/harmonicity, complexity, listeners’ preference, listeners’ familiarity with the music, types of musical instrument used, and/or the context in which BgM was played (e.g., during encoding only, during both encoding and recall, etc.). Further subgroup analyses based on individual differences (extraversion level, gender, and music training) were also performed.

Before discussing our findings, the first important observation to make is that, for the majority of the comparisons analyzed (e.g., L-BgM vs. Silence), cognitive performance with BgM did not differ from performance on the same tasks in silence (see results summary in Table 11). Nonetheless, it is important to interpret this carefully: many comparisons had a very low number of tests, which means that reaching statistical significance is very difficult even if the vast majority of them report positive or negative effects. Indeed, amongst the comparisons that we analyzed (those with 9 or more tests), approximately 40% (38 out of a total of 94 comparisons) had a sample size of 15 tests or fewer, which means that, unless fewer than 20% (or more than 80%) of the tests achieved positive signs, it is not possible to achieve statistical significance (see critical values in Table 2). Furthermore, we could not statistically analyze BgM's effect on a variety of cognitive tasks (see Table 11) because too few studies were available (more studies are needed for these tasks).
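
The constraint that small numbers of tests place on the sign test can be illustrated with a short sketch. It searches, for a given number of tests, for the largest number of successes that still yields an unadjusted two-sided p < .05 against a null proportion of 0.5. This is a minimal illustration using only the Python standard library; the function names are hypothetical, and the thresholds it prints are computed here rather than taken from Table 2.

```python
from math import comb


def sign_test_p(successes, total):
    """Exact two-sided binomial test against p = 0.5."""
    k = min(successes, total - successes)
    tail = sum(comb(total, i) for i in range(k + 1)) / 2 ** total
    return min(1.0, 2 * tail)


def max_significant_successes(total, alpha=0.05):
    """Largest count in the lower half that gives p < alpha.

    Returns None when no count can reach significance. By symmetry,
    the matching upper threshold is total minus the returned count.
    """
    best = None
    for k in range(total // 2 + 1):
        if sign_test_p(k, total) < alpha:
            best = k
    return best


for n in (9, 12, 15):
    k = max_significant_successes(n)
    print(f"n = {n}: at most {k} successes ({100 * k / n:.1f}%), "
          f"or at least {n - k}")
```

These are thresholds before any correction for multiple comparisons; once p-values are adjusted within a family of tests, the requirement becomes stricter still.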

Table 11.

Summary of the detrimental, facilitative and inconclusive effects of BgM (with corresponding number of tests) on different cognitive tasks and population, organised by BgM, L-BgM, I-BgM, Arousal and/or valence and genre.

^a Interventions / Cognitive tasks	S		BgM		L-BgM		I-BgM		L-BgM (vs. I-BgM)		HA		LA		HA (vs. LA)		HA-N		LA-P
^a Interventions / Cognitive tasks		(n)		(n)		(n)		(n)		(n)		(n)		(n)		(n)		(n)		(n)
Memory (all)	n/a	n/a		(152)	●	(59)		(91)	●	(12)		(16)		(17)		(17)		(14)		(12)
(verbal)
Immediate free recall	n/a	n/a		(25)		(11)		(12)	—	—	—	—	—	—	—	—	—	—	—	—
Delayed free recall	n/a	n/a		(13)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Immediate PAL	n/a	n/a		(29)	—	—		(25)	—	—	—	—	—	—	—	—	—	—	—	—
Delayed PAL	n/a	n/a		(12)	—	—		(10)	—	—	—	—	—	—	—	—	—	—	—	—
Recognition	n/a	n/a	●	(20)	●	(9)		(11)	—	—	—	—	—	—	—	—	—	—	—	—
Serial recall	n/a	n/a	●	(22)	●	(11)	●	(11)	—	—	—	—	—	—	—	—	—	—	—	—
Working memory	n/a	n/a	—	—	—	—	—		—	—	—	—	—	—	—	—	—	—	—	—
(non-verbal)
Immediate free recall	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Delayed free recall	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Immediate PAL	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Delayed PAL	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Recognition	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Working memory	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Language (all)	n/a	n/a		(106)		(50)		(54)		(21)	—	—	—	—	—	—	—	—	—	—
Language learning	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Linguistics	n/a	n/a		(29)		(15)		(14)		(10)	—	—	—	—	—	—	—	—	—	—
Reading comprehension	n/a	n/a	●	(70)	●	(34)		(34)		(11)	—	—	—	—	—	—	—	—	—	—
Reading speed	n/a	n/a	●	(13)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Writing fluency	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Writing quality	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Thinking (all)	n/a	n/a		(93)		(46)		(41)		(17)	—	—	—	—	—	—	—	—	—	—
(verbal)
Verbal reasoning	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
(non-verbal)
Math/arithmetics*	n/a	n/a		(51)		(21)		(25)	—	—	—	—	—	—	—	—	—	—	—	—
Non-verbal reasoning	n/a	n/a		(23)		(13)		(10)	—	—	—	—	—	—	—	—	—	—	—	—
Cell interpretation	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
(others)
Creativity	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
IQ tests	n/a	n/a	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Attention (all)	n/a	n/a		(45)		(12)		(28)	—	—	—	—	—	—	—	—	—	—	—	—
Verbal	n/a	n/a		(28)		(9)		(16)	—	—	—	—	—	—	—	—	—	—	—	—
Non-verbal	n/a	n/a		(17)	—	—		(12)	—	—	—	—	—	—	—	—	—	—	—	—
Inhibition (all)	n/a	n/a		(29)	—	—		(18)	—	—	—	—	—	—	—	—	—	—	—	—
Go-NoGo	n/a	n/a		(20)	—	—		(14)	—	—	—	—	—	—	—	—	—	—	—	—
Colour Stroop	n/a	n/a		(9)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Processing speed (all)	n/a	n/a		(9)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Task difficulty
Difficult	n/a	n/a		(24)	—	—		(18)	—	—	—	—	—	—	—	—	—	—	—	—
Easy	n/a	n/a		(24)	—	—		(17)	—	—	—	—	—	—	—	—	—	—	—	—
Difficult–easy		(14)	●	(24)	—	—	●	(18)	—	—	—	—	—	—	—	—	—	—	—	—
Individual differences
Introverts	n/a	n/a	●	(34)	●	(27)	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Extraverts	n/a	n/a		(33)		(26)	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Introverts–extraverts		(22)		(33)		(26)	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Musicians	n/a	n/a		(10)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Nonmusicians	n/a	n/a		(10)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Males	n/a	n/a		(34)		(17)		(14)	—	—	—	—	—	—	—	—	—	—	—	—
Females	n/a	n/a		(34)		(17)		(14)	—	—	—	—	—	—	—	—	—	—	—	—
Males–females		(14)		(34)		(20)		(14)	—	—	—	—	—	—	—	—	—	—	—	—

Codes. S = silence; BgM = background music; L-BgM = background music with lyrics; I-BgM = instrumental background music; HA = high arousing music; LA = low arousing music; P = positive valence music; N = negative valence music; PAL = paired-associates learning

● = detrimental effect; = facilitative effect; = no identifiable effect; dash symbol (-) = no available/sizable analysis; n/a = not applicable

Single-interventions displayed in the heading indicate comparisons with the control condition (i.e., silence/no music); unless when in circumstances when two subgroups are compared (e.g., introverts–extraverts; males–females), it is the comparison between the two specified subgroups within the corresponding condition (e.g., introverts–extraverts in L-BgM only).

^a Interventions / Cognitive tasks	S		P		N		P (vs. N)		Pop		Classical (vs. Pop)
^a Interventions / Cognitive tasks		(n)		(n)		(n)		(n)		(n)		(n)
Memory (all)	n/a	n/a		(19)		(20)		(21)	—	—	—	—
(verbal)
Immediate free recall	n/a	n/a	—	—	—	—		(9)	—	—	—	—
Delayed free recall	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Immediate PAL	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Delayed PAL	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Recognition	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Serial recall	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Working memory	n/a	n/a	—	—	—	—	—	—	—	—	—	—
(non-verbal)
Immediate free recall	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Delayed free recall	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Immediate PAL	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Delayed PAL	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Recognition	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Working memory	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Language (all)	n/a	n/a	—	—	—	—	—	—		(13)		(10)
Language learning	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Linguistics	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Reading comprehension	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Reading speed	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Writing fluency	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Writing quality	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Thinking (all)	n/a	n/a	—	—	—	—	—	—		(14)	—	—
(verbal)
Verbal reasoning	n/a	n/a	—	—	—	—	—	—	—	—	—	—
(non-verbal)
Math/arithmetics	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Non-verbal reasoning	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Cell interpretation	n/a	n/a	—	—	—	—	—	—	—	—	—	—
(others)
Creativity	n/a	n/a	—	—	—	—	—	—	—	—	—	—
IQ tests	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Attention (all)	n/a	n/a	—	—	—	—		(9)	—	—	—	—
Verbal	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Non-verbal	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Inhibition (all)	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Go-NoGo	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Colour Stroop	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Processing speed (all)	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Task difficulty
Difficult	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Easy	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Difficult–easy	—	—	—	—	—	—	—	—	—	—	—	—
Individual differences
Introverts	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Extraverts	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Introverts–extraverts	—	—	—	—	—	—	—	—	—	—	—	—
Musicians	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Nonmusicians	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Males	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Females	n/a	n/a	—	—	—	—	—	—	—	—	—	—
Males–females	—	—	—	—	—	—	—	—	—	—	—	—

Codes. S = silence; P = positive valence music; N = negative valence music; FAM = familiar music; UFAM = unfamiliar music; PAL = paired-associates learning

● = detrimental effect; = facilitative effect; = no identifiable effect; dash symbol (-) = no available/sizable analysis; n/a = not applicable

It is also worth mentioning that the methodological quality of the experiments analyzed in this review is generally satisfactory, and therefore so is the quality of the evidence presented here. Indeed, all studies had adequate control condition(s) and used adequate outcome measures, and most reported clear sample characteristics. The majority of the studies also adequately controlled for the influence of possible extraneous variables, and provided clear procedural descriptions and justifications with regard to the selection and delivery of their interventions (i.e., BgM) and outcome measures (i.e., cognitive tasks). There were nonetheless various limitations that should be considered in future work. Firstly, many studies lack a clear description of the complete empirical procedure (which is crucial for replicating the work). Secondly, many studies did not report effect sizes and exact significance values, which are required for a better appraisal of the overall magnitude of an intervention effect on an outcome measure. That being said, in the context of this review, as these studies mainly fall short in terms of the data being reported rather than their methodological quality, we do not think that these shortcomings substantially affect the overall confidence in our findings (especially considering the type of analysis we conducted).

We turn now to our research questions in more detail. For the sake of clarity, Table 11 summarizes our results.

Research Question 1: How Does BgM Affect Performance in Different Types of Cognitive Tasks (i.e., Tasks of Different Cognitive Domains and Levels of Difficulty)?

Our results show that the impact of BgM on cognitive performance differs for different types of cognitive tasks as well as for tasks of different levels of difficulty; and when significant impacts were identified, those impacts were mostly negative. Overall, we found that music with lyrics (compared to silence and instrumental music) has a general detrimental impact in the memory domain, which is particularly evident in tasks involving memory recognition of verbal materials and serial recall tasks (in the case of serial recall, instrumental music also has a detrimental impact). Interestingly, instrumental music led to an improvement in the performance on immediate paired-associates learning of verbal pairs (note that there were no studies that used music with lyrics). In relation to language, the effects were also task-specific, with music (with or without lyrics) slowing down reading speed, and music with lyrics hindering reading comprehension. In relation to task difficulty, we found that instrumental music (too few tests used music with lyrics for a separate analysis) led to a significant reduction in performance in difficult tasks (compared to easy tasks).

Our evidence shows that BgM seems to affect a small range of domains (language and memory), that the effects of music on cognitive task performance depend on the nature of the task and its difficulty, and that almost all effects hinder performance.

Research Question 2: What Are the Music Characteristics (e.g., Lyrics, Volume, Tempo, etc.) that Contribute to the Effect of BgM on Cognitive Task Performance?

Whereas many different types of music characteristics have been manipulated (or identified by us) in the studies included in our review (e.g., presence of lyrics, music complexity, music genre, tempo, loudness, mood, etc.), by far the most common (and with adequate sample sizes for our analyses) was the presence/absence of lyrics, followed by arousal-mood and genre. Music with lyrics hinders cognitive performance more often than music without lyrics, and it particularly affects memory-related tasks and reading comprehension.

With regard to instrumental music, serial recall (memory domain) and difficult cognitive tasks in general were hindered by its presence. This suggests that instrumental music is less likely to affect cognitive task performance (in the context of the tasks evaluated in this review) unless the tasks are complex and more cognitively demanding. It is also worth mentioning that the only positive effects of music on cognitive performance were related to instrumental music. In sum, there is clear evidence that music with lyrics tends to have more detrimental effects on cognitive task performance than instrumental music.

Taken together, both the task and music-specific findings largely agree with the current models of human working memory, specifically with the idea of capacity and structural limits to the working memory system (Baddeley, 2003, 2012; Eysenck & Keane, 2020). Firstly, due to the limits of working memory capacity, tasks that require high levels of cognitive effort (to the extent that they exhaust or overload working memory resources) will limit the quality of cognitive performance (Kahneman, 1973; Norman & Bobrow, 1975). The fact that instrumental music (there are insufficient data on music with lyrics) impaired performance in cognitive tasks that were generally identified as being ‘difficult’ (regardless of them being verbal or non-verbal tasks) could be a manifestation of an overall cognitive ‘overload’. It is also possible that our results are related to the structural limits of working memory, whereby ‘[i]f two tasks use the same component [from the working memory system], they cannot be performed successfully together’ (Eysenck & Keane, 2020, p. 247). This is evident in our findings that L-BgM impaired performance in serial recall (of verbal materials), reading comprehension and memory recognition (of verbal materials) tasks. On top of that, serial recall performance was also impaired by I-BgM, and reading speed was significantly slower when BgM was present, all of which are tasks related to verbal processing.

Nonetheless, it is worth noting that, following the logic of a structurally limited working memory system, we would expect BgM to impair performance in all verbal tasks. However, this is not what we found. There is no impact of BgM on linguistic task performance, and no (or potentially a weak) impact of BgM on the free recall of verbal materials. We even observed improved performance of immediate paired-associates learning of verbal pairs with I-BgM. Some of the precedents that may support these findings are earlier studies on patients with phonological deficits (Han & Bi, 2009; Hanley & McDonnell, 1997), as well as studies on the impact of articulatory suppression on paired-associates learning of foreign and native language vocabularies (Papagno et al., 1991). These studies postulated that although phonological coding is integral in enabling verbal processing (Leinenger, 2014; Slowiaczek & Clifton, 1980), verbal processing is not necessarily always phonologically mediated (Baron, 1973; Han & Bi, 2009; Hanley & McDonnell, 1997; Levy, 1978; Papagno et al., 1991). In particular, Papagno et al. (1991) observed that when participants underwent paired-associates learning in a native language, semantic access to the verbal information could be achieved by bypassing phonological coding.

All in all, the trends of behaviors that we have identified through this review may be explained by general models of the working memory as well as findings related to the role of phonological coding in various verbal tasks. Nonetheless, we must also emphasize that given the exploratory nature of our review and the fact that many BgM-performance relationships remain unidentified due to limited samples (e.g., language learning, writing fluency, writing quality, verbal reasoning), these are attempted explanations, and should therefore be investigated further (rather than used to affirm any existing theories).

Research Question 3: What Are the Individual Characteristics (e.g., Personality Traits, Music Education, etc.) that Contribute to the Effect of BgM on Cognitive Task Performance?

The population characteristics analyzed in our review were the personality trait of extraversion (extraverts vs. introverts; 31 experiments), gender (males vs. females; 15 experiments) and self-reported level of music training (musicians vs. nonmusicians; seven experiments). Our results revealed that introverts’ cognitive task performance is clearly hindered by the presence of music with lyrics (when compared to silence), whereas extraverts’ performance is not affected (note that the impact of instrumental music is unknown due to insufficient tests). This effect is also evident when directly comparing introverts’ and extraverts’ performance in the presence and absence of music: introverts outperformed extraverts in silence, but this advantage disappeared when music was present.

Overall, the results related to extra/introversion seem to cohere partially with Eysenck's theory of personality (Eysenck, 1967). This theory posits that introverts generally have higher cortical arousal at rest, and therefore the presence of BgM during task performance would push their arousal beyond the optimal level, leading to a decline in performance. On the other hand, extraverts’ task performance should benefit from BgM due to their lower cortical arousal at rest. Our results are consistent with the former prediction. As for the latter, instead of extraverts benefitting from BgM, we only observed that BgM did not affect extraverts’ task performance. It is possible that we failed to identify an existing impact due to methodological limitations, or that the impact of BgM on extraverts’ task performance is so subtle that it could not be statistically identified in the context of this review. It is also worth mentioning that current theorizations of how personality traits might moderate the influence of BgM on task performance are still debated (see Küssner, 2017). Therefore, the specific contribution of extra/introversion to how BgM affects cognitive task performance remains an open question and warrants further study.

Comparison with Past Reviews

Generally, our review confirmed the findings of Kämpfe et al. (2010) and Vasilev et al. (2018) regarding the detrimental effect of BgM (and particularly music with lyrics) on memory-related tasks and reading comprehension. Our finding that BgM slows down reading speed is also consistent with that of Vasilev et al. (2018). Additionally, we have extended their findings in relation to the specific types of memory-related tasks that are affected by music with lyrics, and the fact that instrumental music generally does not affect cognitive task performance—except for its detrimental impact on serial recall and complex cognitive tasks, and its positive effect on the immediate paired-associates learning of verbal materials (all of which were not identified in past reviews). Furthermore, we have also demonstrated that in order to obtain a comprehensive understanding of how BgM impacts cognitive task performance, we should also account for the contribution of population (i.e., listeners’) characteristics.

Limitations of Our Approach

There are some limitations of this review that must be acknowledged.

Firstly, the use of a vote counting approach to data analysis (necessitated by the lack of sufficient data to conduct a meta-analysis) imposes some limitations. Through vote counting, we synthesized all data based solely on the direction of effect derived from mean differences, which precluded the aggregation of significance values (i.e., p-values) and effect sizes (as well as sample sizes). Without data on the overall significance and magnitude of effects, our findings indicate trends in the direction of effects but do not provide effect estimates. They should therefore be interpreted with caution. Nonetheless, given the complexity of the field and the lack of consistent findings, our review functions as a preliminary synthesis of available data that provides an exploratory overview of current trends. On this account, our findings can be used by researchers to further the field, for example by generating more research questions and/or hypotheses in relation to music listening and cognitive performance.
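To make the logic of this approach concrete, the following is a minimal Python sketch of a two-sided exact sign test applied to vote counts. It is an illustration under stated assumptions, not the analysis code used in this review: the function name `sign_test` and the example counts are hypothetical, and the sketch assumes that each effect has already been classified as positive or negative by its direction alone, with ties excluded beforehand.

```python
from math import comb


def sign_test(n_positive: int, n_negative: int) -> float:
    """Two-sided exact sign test p-value for vote counts.

    Assumes ties (effects with no direction) were removed beforehand.
    Under the null hypothesis, each effect is equally likely to be
    positive or negative, so the counts follow Binomial(n, 0.5).
    """
    n = n_positive + n_negative
    if n == 0:
        raise ValueError("at least one non-tied comparison is required")
    k = min(n_positive, n_negative)
    # Probability of observing k or fewer votes in the minority direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


# Hypothetical counts: 2 effects favouring BgM, 10 favouring silence.
p_value = sign_test(n_positive=2, n_negative=10)
print(f"p = {p_value:.4f}")
```

Note that the test uses only the direction of each effect: a large and a negligible mean difference each count as one vote, which is why the result indicates a trend rather than an effect estimate.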

Secondly, due to our analytical approach, the fact that some task, music, and population characteristics have been less studied meant that we could not include them in our analysis. As such, we urge our readers to interpret our results whilst considering: (1) the total number of tests (cf. relevant sign test results table) as well as the number of experiments (cf. Table 5) analyzed for each comparison, (2) the number of comparisons (e.g., L-BgM vs. Silence; Pop music vs. Silence) analyzed for each outcome measure (the total number of comparisons will affect the outcome of the adjusted p-values), (3) both the unadjusted and adjusted p-values, and (4) the corresponding CI for each comparison. These can be helpful for data interpretation, as well as for identifying areas in need of further research.
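The dependence of adjusted p-values on the number of comparisons, and the role of the CI, can be illustrated with a short Python sketch. This section does not state which correction or interval method was applied, so both choices below are assumptions: the Benjamini-Yekutieli procedure and a Wilson score interval are used only because Benjamini and Yekutieli (2001) and Brown et al. (2001) appear in the reference list. The function names and example values are hypothetical.

```python
from math import sqrt


def benjamini_yekutieli(p_values):
    """Return Benjamini-Yekutieli adjusted p-values in the original order."""
    m = len(p_values)
    # Harmonic-sum correction that allows for dependency between tests.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Hypothetical unadjusted p-values for three comparisons on one outcome.
raw = [0.01, 0.04, 0.03]
print(benjamini_yekutieli(raw))

# Hypothetical vote count: 10 of 12 effects in the negative direction.
print(wilson_interval(10, 12))
```

Because the adjustment scales with the number of comparisons, the same unadjusted p-value becomes larger after adjustment when more comparisons are analyzed for an outcome measure, which is why both values are worth reading together.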

Challenges for Future Reviews

Given that music listening and cognitive task execution are both highly individualized processes, a review of relevant studies on this topic has its own challenges. For instance, the generalizability of some findings (at least, in the context of this review) could be limited due to methodological heterogeneity amongst studies. As a starting point, the musical pieces used across studies were (naturally) not the same. Unlike studies of the so-called Mozart effect, which have a clear scope concerning the specific music (Mozart's K.448) and cognitive tasks involved (visuospatial tasks), the study of BgM and cognitive task performance does not share this level of specificity. Although some prevalent trends are observable, an unavoidable fact for studies in this discipline is the lack (and impracticality) of controlling or standardizing the intervention stimuli beyond general music characteristics (e.g., tempo, loudness, presence of lyrics, etc.). Other variables, such as the complexity of music, are less tangible and more complicated to quantify, but might ultimately also have an impact on cognitive performance (e.g., Furnham & Allass, 1999; Gonzalez & Aiello, 2019). For instance, the ‘complexity’ of music can be defined on many levels (e.g., instrumental, melodic, harmonic, rhythmic). The constituents of complexity are also multi-faceted (Streich, 2006), operating at different levels depending on the genre (e.g., classical vs. popular music) as well as individual differences (e.g., music training, level of exposure/familiarity, etc.), which makes these aspects difficult to quantify (see Downie, 2004). Furthermore, not all studies attempted to control for other possible individual differences (e.g., personality traits, working memory capacity, etc.), and we have shown that at least the level of extraversion is clearly central to the impact of background music on cognitive performance.
Therefore, the conclusions we are able to draw from the current sample simply address the more prevalent trends in terms of (1) the types of relationship between (particular) BgM and (particular) cognitive task performance that are more robust to additional influences from population characteristics (e.g., serial recall performance), and (2) the contributions of non-musical factors that prevail regardless of the types of cognitive task (e.g., levels of extraversion, task difficulty).

Another challenge in reviewing the evidence on this topic is in synthesizing cognitive tasks with comparable levels of difficulty. In the context of this review, some types of tasks were classified based on a collection of shared cognitive processes rather than by the specific tasks per se. As such, there is the possibility that the associated results could be confounded by differing levels of difficulty amongst the tasks. For example, the linguistic tasks assessed in our review form a broad category that consists of different tasks such as proofreading, matching consonants and vowels, and verbal fluency (to name a few). The execution of verbal fluency and proofreading tasks may require more complex cognitive processes than a task of matching consonants and vowels. However, a further classification by each specific linguistic task was not feasible due to the limited sample. There are no means (at least in the context of this review) to identify how comparable (or not) the difficulty levels amongst these tasks are, or the extent to which the tasks are susceptible to the influence of BgM. But as our analysis of the contribution of task difficulty demonstrated a detrimental impact of BgM on difficult tasks, our findings with regard to the impact of BgM on linguistic tasks should be considered with caution, especially in light of the possible confound of task difficulty.

On the other hand, the inconsistent situational and procedural contexts across studies can also pose challenges to the generalizability of some results. For example, the studies on BgM and free recall performance differed in terms of when music was played. Some studies played BgM only during the process of encoding (Ferreri et al., 2015; Woo & Kanachi, 2006), and others played it from the encoding stage until the end of the testing stage (Bottiroli et al., 2014; Furnham & Bradley, 1997). Some studies also started playing BgM seconds before the start of the cognitive tasks (Bottiroli et al., 2014; Echaide et al., 2019; Ferreri et al., 2015), whilst for others, both music and tasks commenced together (Nguyen & Grahn, 2017; Woo & Kanachi, 2006). Past studies have suggested that, with regard to memory consolidation, if the same contextual information (e.g., BgM) is presented during both the encoding and recall stages, it could enhance the formation of memory bonds and prompt better recall (Godden & Baddeley, 1975; Tulving, 1979). Although the studies that set out to actually test this hypothesis found no support for it (Echaide et al., 2019; Ferreri et al., 2015; Nguyen & Grahn, 2017), the sheer difference in the learning and testing environments amongst our sample of studies is still a potential confound that could influence (however subtly) how BgM affected participants’ performance.

Another main challenge in conducting a systematic review on this topic is the trade-off between having a representative sample of articles (but with less robust results) and conducting an in-depth, informative, and robust quantitative analysis (but with a smaller and potentially less representative sample of articles). Our quality assessment outcomes from 154 experiments showed that only 3% reported effect sizes, and only 14% reported the exact significance values for all comparisons (both significant and non-significant). To that end, it is not surprising that the meta-analysis conducted by Kämpfe et al. (2010) covered fewer than a handful of cognitive tasks, each with a small number of experiments (eight for reading performance and memory and two for mathematics/arithmetics, compared to our review that includes 21 experiments for reading, 53 for various memory-related tasks, and 15 for mathematics/arithmetics). However, Kämpfe et al. (2010) (as well as Vasilev et al., 2018) provided a more in-depth report on the effect sizes of BgM's impact on each cognitive task performance, whereas our sample data only allow a surface-level evaluation of potential trends. With respect to this, we also advise future empirical studies to consider using the MCPAT in the design phase in order to provide high-quality contributions to this area.

Contributions

The contributions of this work and our findings are manifold.

In relation to the research questions we presented at the outset, we clearly show that it is fundamental to consider the nature of the cognitive tasks when evaluating the effect of BgM on cognitive task performance. Our analysis has demonstrated that even cognitive tasks in the same domain can have different levels of susceptibility to the influence of (different types of) BgM. Beyond that, task difficulty (and perhaps other characteristics not evaluated here) can further determine how BgM affects task performance, irrespective of cognitive domain or type of task. Indeed, human cognition is a complex system, and distinctive cognitive tasks are also functionally different (Eysenck & Keane, 2020). Consequently, they could be affected differently by different types of BgM, and evaluations that do not adequately account for the task-specific effects of BgM might not be representative.

Moreover, our results demonstrated that both music and listeners’ characteristics can further influence how BgM affects cognitive task performance. Indeed, we have clearly shown that at least the presence of lyrics (music-related) and the listener's level of extraversion (listener-related) are determining factors in this process. Therefore, proper control and reporting of relevant music characteristics and individual variables are important in empirical studies. Note also that the effects of (various) music and listeners’ characteristics should not be reduced to the effects found in this review, because the data available for our review were limited and we were therefore unable to evaluate other relevant effects thoroughly.

Whilst attempting to answer the research questions, we have also provided a thorough perspective on research in this area, with very detailed insights into the studies that have been conducted in the field, and the sub-areas and topics that still lack research. Clearly, a lot has been done (especially in the last 10 years); but, by far, most studies (see summary results in Table 11) have concentrated on the domains of memory (and especially those concerning verbal materials), language and thinking tasks, with particular focus on serial recall, reading comprehension and mathematics/arithmetics test performance. Other cognitive domains with relevance to everyday life tasks need further research. For instance, there is a lack of studies on BgM's impact on memory for non-verbal materials, language learning, writing quality, writing fluency, reading fluency, working memory, verbal reasoning, or creativity. Furthermore, additional music characteristics are likely to moderate the impact of BgM on cognitive performance (e.g., mood, complexity, language of lyrics, instruments used, tonality, dissonance), sometimes in interaction with listener-related characteristics (e.g., familiarity, preferences, musical training/background, working memory capacity, preference for external stimulation).

Another contribution is the MCPAT, which offers an important tool to appraise both the methodological and reporting qualities of experiments related to music listening and cognitive task performance. Given the challenges in this area, we hope that the development of this tool will not only aid the conduct of future systematic reviews, but also guide the design of future empirical studies. Ideally, we hope that such a guideline could aid the development of the field in the long run, through the production of quality empirical studies that in turn contribute to reviews with results that are informative, robust, and representative.

Furthermore, our methodological and analytical approaches are also important contributions. As a methodological contribution, we have demonstrated through our findings that in order to obtain informative results, domain and task-specific mappings of the cognitive tasks should be performed prior to data synthesis and analysis. As an analytical contribution, our analysis approach (i.e., vote counting and sign test analysis) also testifies to the SWiM guidelines that, in circumstances when a meta-analysis is not feasible, there are next-best alternatives available that generate meaningful statistical inferences using a smaller amount of data from the reported summary statistics (although the limitations of the approaches should be considered when interpreting the results). We hope that these approaches could also serve as informative examples and/or guidelines for future reviews.

Finally, our detailed protocol and data are freely available and we hope that future work can build upon this review by updating it with new findings.

Conclusion and Outlook

With this systematic review, we have provided a structured approach to analyzing previous works on the impact of background music on cognitive task performance. By doing so, we shed light on this area and demonstrate that future research must consider task, music and population-related factors from the outset in order to offer impactful results. We have also demonstrated that, with the limited available evidence, music does not seem to have a generalized negative impact on cognitive task performance; but, when it does have a negative impact, the music tends to have lyrics, the impact primarily affects specific memory-related and language-related tasks, and it disproportionately affects individuals who display personality traits of introversion. We hope that these findings are also of value for people who habitually accompany their work or study with music, which is particularly relevant given the increasing prevalence of desk jobs and remote working and studying (especially during and after the COVID-19 pandemic; Office for National Statistics, 2020; Wong, 2020).

Looking forward, we have some key recommendations for future research. As discussed earlier, task, music and population-related factors (and especially the multiplicative impacts amongst these factors) must be considered in any research design. Moreover, there is a clear lack of research in a variety of cognitive domains and tasks that needs to be addressed. On top of that, we suggest that future research ventures beyond the analysis of cognitive tasks based on their content (e.g., verbal or non-verbal materials) and considers the level of cognitive control required to perform the cognitive processes involved in those tasks (some tasks benefit from high-level cognitive control whereas some benefit from a more relaxed cognitive state; Amer et al., 2016). Correspondingly, the potential evaluations could be (1) whether different types of BgM would differentially affect tasks that demand high and low cognitive control, and (2) whether there are multiplicative interactions among the types of music, types of tasks, as well as the level of cognitive control required by those tasks on performance. Furthermore, current studies on the impact of BgM on cognitive task performance have focused mainly on very distinctive types of tasks (e.g., serial recall, reading comprehension, abstract reasoning, etc.). However, the execution of everyday-life activities (including studying and work-related tasks) is much less straightforward, as it involves combinations of multiple cognitive processes. Therefore, it will be interesting for future studies to evaluate the differentiated impact of different types of BgM (as well as the moderating impacts of individual differences) on performance in more generalizable tasks that reflect those occurring in everyday life (e.g., evaluate how BgM affects the outcomes in actual study or work sessions; see procedures in Calderwood et al. (2014) and Lesiuk (2005)).
More generally, we suggest that researchers use this systematic review as a platform for informing future research questions and exploring specific areas in this field.

As a final note, we would like to add that the self-reported benefits of BgM on task performance (by both students and office workers; e.g., Haake, 2011; Kotsopoulou & Hallam, 2010; Lesiuk, 2005) suggest that music facilitates the perceived improvement in performance through its impact on the affective states of listeners (e.g., improving mood, enhancing motivation, promoting relaxation, regulating energy levels), which then facilitates engagement and time spent on the tasks (rather than the impacts on performance per se). Interestingly, this is seldom the focus of research in this area. Whereas the main focus thus far has been on the interference (positive or negative) of BgM with specific cognitive processes, we have yet to assess how such interference interacts with affective responses to the music, and their relative importance in terms of task engagement, completion and performance. The interesting question here would be to what extent listening to music is helpful towards performance in a concurrent task at all these levels.

Supplemental Material

Supplemental material, sj-docx-1-mns-10.1177_20592043221134392 for Background Music and Cognitive Task Performance: A Systematic Review of Task, Music, and Population Impact by Yiting Cheah, Hoo Keat Wong, Michael Spitzer and Eduardo Coutinho in Music & Science

Footnotes

Acknowledgement

We would like to thank Chairos Loo, Yi-Ning Chuah, Emma Risley, Yee-Wen How, Shu-En Lee, Xin-Er Lee, Nalni Moorthi, Sheng-Yee Wan, Jing-Kai Wong, and Xuen Yu who assisted with data screening and extraction. We also express our gratitude to Manuel Gonzalez, Roger Johansson, Samuel Ken-En Gan, and William Thompson for sharing their research data.

Action Editor

Diana Omigie, Goldsmiths, University of London, Department of Psychology.

Peer review

Luca Kiss, Goldsmiths, University of London, Department of Psychology.

E. Glenn Schellenberg, University of Toronto Mississauga, Department of Psychology.

Author Contributions

YC and EC planned the review and wrote the protocol. YC performed all the searches and YC, EC, and HKW conducted the article selection process. YC led the data extraction with the support of HKW and EC. The data analyses were conducted by YC with the support of EC. YC and EC prepared the manuscript and MS provided feedback on the final version. All authors reviewed and approved the final version of the manuscript.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethical Approval

This research did not require ethics committee or IRB approval. This research did not involve the use of personal data, fieldwork, or experiments involving human or animal participants, or work with children, vulnerable individuals, or clinical populations.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by a Ph.D. scholarship awarded by the University of Liverpool to Yiting Cheah.

ORCID iDs

Yiting Cheah

Hoo Keat Wong

Eduardo Coutinho

Supplemental Material

Supplemental material for this article is available at .

Notes

References

1. Alley, T. R., & Greene, M. E. (2008). The relative and perceived impact of irrelevant speech, vocal music and non-vocal music on working memory. Current Psychology, 27(4), 277–289. https://doi.org/10.1007/s12144-008-9040-z

2. Amer, Campbell, K. L., & Hasher (2016). Cognitive control as a double-edged sword. Trends in Cognitive Sciences, 20(12), 905–915. https://doi.org/10.1016/j.tics.2016.10.002

3. Amezcua, Guevara, M. A., & Ramos-Loyo (2005). Effects of musical tempi on visual attention ERPs. International Journal of Neuroscience, 115(2), 193–206. https://doi.org/10.1080/00207450590519094

4. Angel, L. A., Polzella, D. J., & Elvers, G. C. (2010). Background music and cognitive performance. Perceptual and Motor Skills, 110(3), 1059–1064. https://doi.org/10.2466/04.11.22.PMS.110.C.1059-1064

5. Avila, Furnham, & McClelland (2012). The influence of distracting familiar vocal music on cognitive performance of introverts and extraverts. Psychology of Music, 40(1), 84–93. https://doi.org/10.1177/0305735611422672

6. Baddeley (2003). Working memory and language: An overview. Journal of Communication Disorders, 36, 189–208. https://doi.org/10.1016/S0021-9924(03)00019-4

7. Baddeley (2012). Working memory: Theories, models, and controversies. Annual Review of Psychology, 63(1), 1–29. https://doi.org/10.1146/annurev-psych-120710-100422

8. Baron (1973). Phonemic stage not necessary for reading. Quarterly Journal of Experimental Psychology, 25(2), 241–246. https://doi.org/10.1080/14640747308400343

9. Beauchene, Abaid, Moran, Diana, R. A., & Leonessa (2016). The effect of binaural beats on visuospatial working memory and cortical connectivity. PLoS ONE, 11(11), e0166630. https://doi.org/10.1371/journal.pone.0166630

10. Begum, M. M., Uddin, M. S., Rithy, J. F., Kabir, Tewari, Islam, & Ashraf, G. M. (2019). Analyzing the impact of soft, stimulating and depressing songs on attention among undergraduate students: A cross-sectional pilot study in Bangladesh. Frontiers in Psychology, 10, 161. https://doi.org/10.3389/fpsyg.2019.00161

11. Benjamini & Yekutieli (2001). The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics, 29(4), 1165–1188. https://www.jstor.org/stable/2674075

12. Bonin & Smilek (2016). Inharmonic music elicits more negative affect and interferes more with a concurrent cognitive task than does harmonic music. Attention, Perception, & Psychophysics, 78(3), 946–959. https://doi.org/10.3758/s13414-015-1042-y

13. Bottiroli, Rosi, Russo, Vecchi, & Cavallini (2014). The cognitive effects of listening to background music on older adults: Processing speed improves with upbeat music, while memory seems to benefit from both upbeat and downbeat music. Frontiers in Aging Neuroscience, 6, 284. https://doi.org/10.3389/fnagi.2014.00284

14. Boyle & Coltheart (1996). Effects of irrelevant sounds on phonological coding in reading comprehension and short term memory. The Quarterly Journal of Experimental Psychology Section A, 49A(2), 398–416. https://doi.org/10.1080/713755630

15. Brown, L. D., Cai, T. T., & DasGupta (2001). Interval estimation for a binomial proportion. Statistical Science, 16(2), 101–133. https://doi.org/10.1214/ss/1009213286

16. Bull (2006). Investigating the culture of mobile listening: From Walkman to iPod. In O’Hara & Brown (Eds.), Consuming music together: Social and collaborative aspects of music consumption technologies (pp. 131–149). Springer Netherlands. https://doi.org/10.1007/1-4020-4097-0_7

17. Burkhard, Elmer, Kara, Brauchli, & Jäncke (2018). The effect of background music on inhibitory functions: An ERP study. Frontiers in Human Neuroscience, 12, 293. https://doi.org/10.3389/fnhum.2018.00293

18. Burton (1986). Relationship between musical accompaniment and learning style in problem solving. Perceptual and Motor Skills, 62(1), 48–50. https://doi.org/10.2466/pms.1986.62.1.48

19. Bushman, B. J., & Wang, M. C. (2009). Vote-counting procedures in meta-analysis. In Cooper, Hedges, L. V., & Valentine, J. C. (Eds.), Handbook of research synthesis and meta-analysis (2nd ed., pp. 207–220). Russell Sage Foundation.

20. Calderwood, Ackerman, P. L., & Conklin, E. M. (2014). What else do college students ‘do’ while studying? An investigation of multitasking. Computers & Education, 75, 19–29. https://doi.org/10.1016/j.compedu.2014.02.004

21. Campbell, McKenzie, J. E., Sowden, Katikireddi, S. V., Brennan, S. E., Ellis, Hartmann-Boyce, Ryan, Shepperd, Thomas, Welch, & Thomson (2020). Synthesis without meta-analysis (SWiM) in systematic reviews: Reporting guideline. BMJ, 368, l6890. https://doi.org/10.1136/bmj.l6890

22. Cassidy & MacDonald, R. A. R. (2007). The effect of background music and background noise on the task performance of introverts and extraverts. Psychology of Music, 35(3), 517–537. https://doi.org/10.1177/0305735607076444

23. Cauchard, Cane, J. E., & Weger, U. W. (2012). Influence of background speech and music in interrupted reading: An eye-tracking study. Applied Cognitive Psychology, 26(3), 381–390. https://doi.org/10.1002/acp.1837

24. Chaimani, Caldwell, D. M., Higgins, J. P. T., & Salanti (2021). Chapter 11: Undertaking network meta-analyses. In Higgins, J. P. T., Thomas, Chandler, Cumpston, Page, M. J., & Welch, V. A. (Eds.), Cochrane handbook for systematic reviews of interventions version 6.2 (updated February 2021). Cochrane. www.training.cochrane.org/handbook

25. Cheah, Y.-T., Spitzer, & Coutinho (2020). The impact of background music on cognitive task performance: A systematic review. PROSPERO, CRD42020207193. https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42020207193

26. Chew, A. S.-Q., Y.-T., Chua, S.-W., & Gan, S. K.-E. (2016). The effects of familiarity and language of background music on working memory and language tasks in Singapore. Psychology of Music, 44(6), 1431–1438. https://doi.org/10.1177/0305735616636209

27. Christopher, E. A., & Shelton, J. T. (2017). Individual differences in working memory predict the effect of music on student performance. Journal of Applied Research in Memory and Cognition, 6(2), 167–173. https://doi.org/10.1016/j.jarmac.2017.01.012

28. Cho (2015). Is background music a distraction or facilitator? An investigation on the influence of background music in L2 writing. Multimedia-Assisted Language Learning, 18(2), 37–58. https://doi.org/10.15702/mall.2015.18.2.37

29. Chou, P. T.-M. (2010). Attention drainage effect: How background music effects concentration in Taiwanese college students. Journal of the Scholarship of Teaching and Learning, 10(1), 36–46.

30. Cockerton, Moore, & Norman (1997). Cognitive test performance and background music. Perceptual and Motor Skills, 85(3), 1435–1438. https://doi.org/10.2466/pms.1997.85.3f.1435

31. Cowan, Fristoe, N. M., Elliott, E. M., Brunner, R. P., & Saults, J. S. (2006). Scope of attention, control of attention, and intelligence in children and adults. Memory & Cognition, 34, 1754–1768. https://doi.org/10.3758/BF03195936

32. Crawford, H. J., & Strapp, C. M. (1994). Effects of vocal and instrumental music on visuospatial and verbal performance as moderated by studying preference and personality. Personality and Individual Differences, 16(2), 237–245. https://doi.org/10.1016/0191-8869(94)90162-7

33. Crust, Clough, P. J., & Robertson (2004). Influence of music and distraction on visual search performance of participants with high and low affect intensity. Perceptual and Motor Skills, 98(3), 888–896. https://doi.org/10.2466/pms.98.3.888-896

34. D’Angelo (2022, January 14). 20 easy ways to boost your productivity. Business News Daily. https://www.businessnewsdaily.com/5658-easy-productivity-tips.html

35. Daoussis & McKelvie, S. J. (1986). Musical preferences and effects of music on a reading comprehension test for extraverts and introverts. Perceptual and Motor Skills, 62(1), 283–289. https://doi.org/10.2466/pms.1986.62.1.283

36. Darrow, A.-A., Johnson, Agnew, Fuller, E. R., & Uchisaka (2006). Effect of preferred music as a distraction on music majors’ and nonmusic majors’ selective attention. Bulletin of the Council for Research in Music Education, 170, 21–31. https://www.jstor.org/stable/40319346

37. Daud, S. N. S. S., & Sudirman (2017). Brain signal analysis to investigate sound effect on memorization. Advanced Science Letters, 23(11), 11119–11123. https://doi.org/10.1166/asl.2017.10233

38. Davenport, W. G. (1972). Vigilance and arousal: Effects of different types of background stimulation. The Journal of Psychology: Interdisciplinary and Applied, 82(2), 339–346. https://doi.org/10.1080/00223980.1972.9923824

39. David, Kim, J.-H., Brickman, J. S., Ran, & Curtis, C. M. (2015). Mobile phone distraction while studying. New Media & Society, 17(10), 1661–1679. https://doi.org/10.1177/1461444814531692

40. Deeks, J. J., Higgins, J. P. T., & Altman, D. G. (2021). Chapter 10: Analysing data and undertaking meta-analyses. In Higgins, J. P. T., Thomas, Chandler, Cumpston, Page, M. J., & Welch, V. A. (Eds.), Cochrane handbook for systematic reviews of interventions version 6.2 (updated February 2021). Cochrane. www.training.cochrane.org/handbook

41. De Groot, A. M. B. (2006). Effects of stimulus characteristics and background music on foreign language vocabulary learning and forgetting. Language Learning, 56(3), 463–506. https://doi.org/10.1111/j.1467-9922.2006.00374.x

42. De Groot, A. M. B., & Smedinga, H. E. (2014). Let the music play! A short-term but no long-term detrimental effect of vocal background music with familiar language lyrics on foreign language vocabulary learning. Studies in Second Language Acquisition, 36(4), 681–707. https://doi.org/10.1017/S0272263114000059

43. De La Mora Velasco & Hirumi (2020). The effects of background music on learning: A systematic review of literature to guide future research and practice. Educational Technology Research and Development, 68, 2817–2837. https://doi.org/10.1007/s11423-020-09783-4

44. Deng (2020). Impact of background music on reaction test and visual pursuit test performance of introverts and extraverts. International Journal of Industrial Ergonomics, 78, 102976. https://doi.org/10.1016/j.ergon.2020.102976

45. Downie, J. S. (2004). The scientific evaluation of music information retrieval systems: Foundations and future. Computer Music Journal, 28(2), 12–23. http://www.jstor.org/stable/3681823

46. Doyle & Furnham (2012). The distracting effects of music on the cognitive test performance of creative and non-creative individuals. Thinking Skills and Creativity, 7(1), 1–7. https://doi.org/10.1016/j.tsc.2011.09.002

47. Echaide, Del Rio, & Pacios (2019). The differential effect of background music on memory for verbal and visuospatial information. The Journal of General Psychology, 146(4), 443–458. https://doi.org/10.1080/00221309.2019.1602023

48. Evered, Watt, & Perham (2018). Are sound abatement measures necessary in the cytology reading room? A study of auditory distraction. Cytopathology, 29(1), 84–89. https://doi.org/10.1111/cyt.12457

49. Eysenck, H. J. (1967). Personality and extra-sensory perception. Journal of the Society for Psychical Research, 44(732), 55–71.

50. Eysenck, M. W., & Keane, M. T. (2020). Cognitive psychology: A student’s handbook (8th ed.). Routledge.

51. Feizpour, Parkington, H. C., & Mansouri, F. A. (2020). Cognitive sex differences in effects of music in Wisconsin Card Sorting Test. Psychology of Music, 48(2), 252–265. https://doi.org/10.1177/0305735618795030

52. Fernandez, N. B., Trost, & Vuilleumier (2019). Brain networks mediating the influence of background music on selective attention. Social Cognitive and Affective Neuroscience, 14(12), 1441–1452. https://doi.org/10.1093/scan/nsaa004

53. Ferreri, Aucouturier, J.-J., Muthalib, Bigand, & Bugaiska (2013). Music improves verbal memory encoding while decreasing prefrontal cortex activity: An fNIRS study. Frontiers in Human Neuroscience, 7, 779. https://doi.org/10.3389/fnhum.2013.00779

54. Ferreri, Bigand, Bard, & Bugaiska (2015). The influence of music on prefrontal cortex during episodic encoding and retrieval of verbal information: A multichannel fNIRS study. Behavioural Neurology, 2015, 707625. https://doi.org/10.1155/2015/707625

55.

Ferreri

Bigand

Perrey

Muthalib

Bard

Bugaiska

(2014). Less effort, better results: How does music act on prefrontal cortex in older adults during verbal encoding? An fNIRS study. Frontiers in Human Neuroscience, 8, 301. https://doi.org/10.3389/fnhum.2014.00301

56.

Fontaine

C. W.

Schwalm

N. D.

(1979). Effects of familiarity of music on vigilant performance. Perceptual and Motor Skills, 49(1), 71–74. https://doi.org/10.2466/pms.1979.49.1.71

57.

Furnham

Bradley

(1997). Music while you work: The differential distraction of background music on the cognitive test performance of introverts and extraverts. Applied Cognitive Psychology, 11(5), 445–455. https://doi.org/10.1002/(SICI)1099-0720(199710)11:5<445::AID-ACP472>3.0.CO;2-R

58.

Furnham

Strbac

(2002). Music is as distracting as noise: The differential distraction of background music and noise on the cognitive test performance of introverts and extraverts. Ergonomics, 45(3), 203–217. https://doi.org/10.1080/00140130210121932

59.

Furnham

Allass

(1999). The influence of musical distraction of varying complexity on the cognitive performance of extroverts and introverts. European Journal of Personality, 13(1), 27–38. https://doi.org/10.1002/(SICI)1099-0984(199901/02)13:1<27::AID-PER318>3.0.CO;2-R

60.

Furnham

Trew

Sneade

(1999). The distracting effects of vocal and instrumental music on the cognitive test performance of introverts and extraverts. Personality and Individual Differences, 27(2), 381–392. https://doi.org/10.1016/S0191-8869(98)00249-9

61.

Geethanjali

Adalarasu

Jagannath

Rajasekaran

(2016a). Enhancement of task performance aided by music. Current Science, 111(11), 1794–1801. https://doi.org/10.18520/CS/V111/I11/1794-1801

62.

Geethanjali

Adalarasu

Jagannath

Rajasekaran

(2016b). Influence of pleasant and unpleasant music on cardiovascular measures and task performance. International Journal of Biomedical Engineering and Technology, 21(2), 128–144. https://doi.org/10.1504/IJBET.2016.077179

63.

Godden

D. R.

Baddeley

A. D.

(1975). Context-dependent memory in two natural environments: On land and underwater. British Journal of Psychology, 66(3), 325–331. https://doi.org/10.1111/j.2044-8295.1975.tb01468.x

64.

Gonzalez

M. F.

Aiello

J. R.

(2019). More than meets the ear: Investigating how music affects cognitive task performance. Journal of Experimental Psychology: Applied, 25(3), 431–444. https://doi.org/10.1037/xap0000202

65.

Greasley

A. E.

Lamont

(2011). Exploring engagement with music in everyday life using experience sampling methodology. Musicae Scientiae, 15(1), 45–71. https://doi.org/10.1177/1029864910393417

66.

Haake

A. B.

(2011). Individual music listening in workplace settings: An exploratory survey of offices in the UK. Musicae Scientiae, 15(1), 107–129. https://doi.org/10.1177/1029864911398065

67.

Han

(2009). Reading comprehension without phonological mediation: Further evidence from a Chinese aphasic individual. Science in China Series C: Life Sciences, 52, 492–499. https://doi.org/10.1007/s11427-009-0048-x

68.

Hanley

J. R.

McDonnell

(1997). Are reading and spelling phonologically mediated? Evidence from a patient with a speech production impairment. Cognitive Neuropsychology, 14(1), 3–33. https://doi.org/10.1080/026432997381600

69.

Herath

R. S.

(2018). How does the effect of background music on the performance of a reading comprehension task differ across musically ‘trained’ and ‘untrained’ individuals? Durham Undergraduate Research in Music and Science, 1, 48–53. https://www.semanticscholar.org/paper/How-Does-the-Effect-of-Background-Music-on-the-of-a-Herath/3aeb69e928bf2f41aa6f69958a5c0cf7234415d7

70.

Ho

Mason

Spence

(2007). An investigation into the temporal dimension of the Mozart effect: Evidence from the attentional blink task. Acta Psychologica, 125(1), 117–128. https://doi.org/10.1016/j.actpsy.2006.07.006

71.

Hong

Q. N.

Fàbregues

Bartlett

Boardman

Cargo

Dagenais

Gagnon

M.-P.

Griffiths

Nicolau

O’Cathain

Rousseau

M.-C.

Vedel

Pluye

(2018). The mixed methods appraisal tool (MMAT) version 2018 for information professionals and researchers. Education for Information, 34(4), 285–291. https://doi.org/10.3233/EFI-180221

72.

Hong

Q. N.

Pluye

Fàbregues

Bartlett

Boardman

Cargo

Dagenais

Gagnon

M.-P.

Griffiths

Nicolau

O’Cathain

Rousseau

M.-C.

Vedel

(2018). Mixed methods appraisal tool (MMAT), version 2018. Registration of Copyright (#1148552), Canadian Intellectual Property Office, Industry Canada.

73.

Huang

R.-H.

Shih

Y.-N.

(2011). Effects of background music on concentration of workers. Work, 38(4), 383–387. https://doi.org/10.3233/WOR-2011-1141

74.

Icenogle

Steinberg

Duell

Chein

Chang

Chaudhary

Di Giunta

Dodge

K. A.

Fanti

K. A.

Lansford

J. E.

Oburu

Pastorelli

Skinner

A. T.

Sorbring

Tapanya

Uribe Tirado

L. M.

Alampay

L. P.

Al-Hassan

S. M.

Takash

H. M. S.

Bacchini

(2019). Adolescents’ cognitive capacity reaches adult levels prior to their psychosocial maturity: Evidence for a ‘maturity gap’ in a multinational, cross-sectional sample. Law and Human Behavior, 43(1), 69–85. https://doi.org/10.1037/lhb0000315

75.

Iwanaga

Ito

(2002). Disturbance effect of music on processing of verbal and spatial memories. Perceptual and Motor Skills, 94(3), 1251–1258. https://doi.org/10.2466/pms.2002.94.3c.1251

76.

Jäncke

Brügger

Brummer

Scherrer

Alahmadi

(2014). Verbal learning in the context of background music: No influence of vocals and instrumentals on verbal learning. Behavioral and Brain Functions, 10(1), 10. https://doi.org/10.1186/1744-9081-10-10

77.

Jaušovec

Habe

(2004). The influence of auditory background stimulation (Mozart's sonata K. 448) on visual brain activity. International Journal of Psychophysiology, 51(3), 261–271. https://doi.org/10.1016/S0167-8760(03)00227-7

78.

Johansson

Holmqvist

Mossberg

Lindgren

(2012). Eye movements and reading comprehension while listening to preferred and non-preferred study music. Psychology of Music, 40(3), 339–356. https://doi.org/10.1177/0305735610387777

79.

Juslin

P. N.

Liljeström

Västfjäll

Barradas

Silva

(2008). An experience sampling study of emotional reactions to music: Listener, music, and situation. Emotion (Washington, D.C.), 8(5), 668–683. https://doi.org/10.1037/a0013505

80.

Kahneman

(1973). Attention and effort. Prentice-Hall Inc.

81.

Kang

H. J.

Williamson

V. J.

(2014). Background music can aid second language learning. Psychology of Music, 42(5), 728–747. https://doi.org/10.1177/0305735613485152

82.

Kämpfe

Sedlmeier

Renkewitz

(2010). The impact of background music on adult listeners: A meta-analysis. Psychology of Music, 39(4), 424–448. https://doi.org/10.1177/0305735610376261

83.

Kiss

Linnell

K. J.

(2020). The effect of preferred background music on task-focus in sustained attention. Psychological Research, 85, 2313–2325. https://doi.org/10.1007/s00426-020-01400-6

84.

Kononova

A. G.

Yuan

(2017). Take a break: Examining college students’ media multitasking activities and motivations during study- or work-related tasks. Journalism & Mass Communication Educator, 72(2), 183–197. https://doi.org/10.1177/1077695816649474

85.

Kotsopoulou

Hallam

(2010). The perceived impact of playing music while studying: Age and cultural differences. Educational Studies, 36(4), 431–440. https://doi.org/10.1080/03055690903424774

86.

Kou

McClelland

Furnham

(2018). The effect of background music and noise on the cognitive test performance of Chinese introverts and extraverts. Psychology of Music, 46(1), 125–135. https://doi.org/10.1177/0305735617704300

87.

Kou

McClelland

Furnham

88.

Küssner

M. B.

(2017). Eysenck’s theory of personality and the role of background music in cognitive task performance: A mini-review of conflicting findings and a new perspective. Frontiers in Psychology, 8, 1991. https://doi.org/10.3389/fpsyg.2017.01991

89.

Küssner

M. B.

De Groot

A. M. B.

Hofman

W. F.

Hillen

M. A.

(2016). EEG beta power but not background music predicts the recall scores in a foreign-vocabulary learning task. PLoS ONE, 11(8), e0161387. https://doi.org/10.1371/journal.pone.0161387

90.

Lehmann

J. A. M.

Seufert

(2017). The influence of background music on learning in the light of different theoretical perspectives and the role of working memory capacity. Frontiers in Psychology, 8, 1902. https://doi.org/10.3389/fpsyg.2017.01902

91.

Leinenger

(2014). Phonological coding during reading. Psychological Bulletin, 140(6), 1534–1555. https://doi.org/10.1037/a0037830

92.

Levy

B. A.

(1978). Speech processing during reading. In Lesgold

A. M.

Pellegrino

J. W.

Fokkema

S. D.

Glaser

(Eds.), Cognitive psychology and instruction (pp. 123–151). Plenum Press.

93.

Lesiuk

(2005). The effect of music listening on work performance. Psychology of Music, 33(2), 173–191. https://doi.org/10.1177/0305735605050650

94.

Liu

Huang

Wang

(2012). The influence of background music on recognition processes of Chinese characters: An ERP study. Neuroscience Letters, 518(2), 80–85. https://doi.org/10.1016/j.neulet.2012.04.055

95.

Liu

Lin

C.-C.

Huang

K.-C.

Chen

Y.-C.

(2017). Effects of noise type, noise intensity, and illumination intensity on reading performance. Applied Acoustics, 120, 70–74. https://doi.org/10.1016/j.apacoust.2017.01.019

96.

Lonsdale

A. J.

North

A. C.

(2011). Why do we listen to music? A uses and gratifications analysis. British Journal of Psychology, 102, 108–134. https://doi.org/10.1348/000712610X506831

97.

Luciana

Conklin

H. M.

Hooper

C. J.

Yarger

R. S.

(2005). The development of nonverbal working memory and executive control processes in adolescents. Child Development, 76(3), 697–712. https://doi.org/10.1111/j.1467-8624.2005.00872.x

98.

Mammarella

Fairfield

Cornoldi

(2007). Does music enhance cognitive performance in healthy older adults? The Vivaldi effect. Aging Clinical and Experimental Research, 19(5), 394–399. https://doi.org/10.1007/BF03324720

99.

Mansouri

F. A.

Acevedo

Illipparampil

Fehring

D. J.

Fitzgerald

P. B.

Jaberzadeh

(2017). Interactive effects of music and prefrontal cortex stimulation in modulating response inhibition. Scientific Reports, 7(1), 18096. https://doi.org/10.1038/s41598-017-18119-x

100.

Mansouri

F. A.

Fehring

D. J.

Gaillard

Jaberzadeh

Parkington

(2016). Sex dependency of inhibitory control functions. Biology of Sex Differences, 7(1), 11. https://doi.org/10.1186/s13293-016-0065-y

101.

Manthei

Kelly

S. N.

(1999). Effects of popular and classical background music on math test scores of undergraduate students. Research Perspectives in Music Education, 6(1), 38–42.

102.

Martin

R. C.

Wogalter

M. S.

Forlano

J. G.

(1988). Reading comprehension in the presence of unattended speech and music. Journal of Memory and Language, 27(4), 382–398. https://doi.org/10.1016/0749-596X(88)90063-0

103.

Masataka

Perlovsky

(2013). Cognitive interference can be mitigated by consonant music and facilitated by dissonant music. Scientific Reports, 3, 2028. https://doi.org/10.1038/srep02028

104.

Mayfield

Moss

(1989). Effect of music tempo on task performance. Psychological Reports, 65(3), 1283–1290. https://doi.org/10.2466/pr0.1989.65.3f.1283

105.

McKenzie

J. E.

Brennan

S. E.

(2021). Chapter 12: Synthesizing and presenting findings using other methods. In Higgins

J. P. T.

Thomas

Chandler

Cumpston

Page

M. J.

Welch

V. A.

(Eds.), Cochrane handbook for systematic reviews of interventions version 6.2 (updated February 2021). Cochrane. www.training.cochrane.org/handbook.

106.

Miller

L. K.

Schyb

(1989). Facilitation and interference by background music. Journal of Music Therapy, 26(1), 42–54. https://doi.org/10.1093/jmt/26.1.42

107.

Nittono

(1997). Background instrumental music and serial recall. Perceptual and Motor Skills, 84(3_suppl), 1307–1313. https://doi.org/10.2466/pms.1997.84.3c.1307

108.

Norman

D. A.

Bobrow

D. G.

(1975). On data-limited and resource-limited processes. Cognitive Psychology, 7(1), 44–64. https://doi.org/10.1016/0010-0285(75)90004-3

109.

Nguyen

Grahn

J. A.

(2017). Mind your music: The effects of music-induced mood and arousal across different memory tasks. Psychomusicology: Music, Mind, and Brain, 27(2), 81–94. https://doi.org/10.1037/pmu0000178

110.

North

A. C.

Hargreaves

D. J.

Hargreaves

J. J.

(2004). Uses of music in everyday life. Music Perception, 22(1), 41–77. https://doi.org/10.1525/mp.2004.22.1.41

111.

Office for National Statistics (2020). Coronavirus and homeworking in the UK: April 2020. Office for National Statistics. https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/bulletins/coronavirusandhomeworkingintheuk/april2020.

112.

Ouzzani

Hammady

Fedorowicz

Elmagarmid

(2016). Rayyan—a web and mobile app for systematic reviews. Systematic Reviews, 5, 210. https://doi.org/10.1186/s13643-016-0384-4

113.

Page

M. J.

McKenzie

J. E.

Bossuyt

P. M.

Boutron

Hoffmann

T. C.

Mulrow

C. D.

Shamseer

Tetzlaff

J. M.

Akl

E. A.

Brennan

S. E.

Chou

Glanville

Grimshaw

J. M.

Hróbjartsson

Lalu

M. M.

Loder

E. W.

Mayo-Wilson

McDonald

McGuinness

L. A.

Stewart

L. A.

Thomas

Tricco

A. C.

Welch

V. A.

Whiting

Moher

(2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372, n71. https://doi.org/10.1136/bmj.n71

114.

Papagno

Valentine

Baddeley

(1991). Phonological short-term memory and foreign-language vocabulary learning. Journal of Memory and Language, 30(3), 331–347. https://doi.org/10.1016/0749-596X(91)90040-Q

115.

Parente

J. A.

(1976). Music preference as a factor of music distraction. Perceptual and Motor Skills, 43(1), 337–338. https://doi.org/10.2466/pms.1976.43.1.337

116.

Patston

L. L. M.

Tippett

L. J.

(2011). The effect of background music on cognitive performance in musicians and nonmusicians. Music Perception, 29(2), 173–183. https://doi.org/10.1525/mp.2011.29.2.173

117.

Pavlygina

R. A.

Sakharov

D. S.

Davydov

V. I.

Avdonkin

A. V.

(2010). Influence of music with different volumes and styles on recognition activity in humans. Neuroscience and Behavioral Physiology, 40(8), 877–884. https://doi.org/10.1007/s11055-010-9336-y

118.

Pavlyugina

R. A.

Karamysheva

N. N.

Sakharov

D. S.

Davydov

V. I.

(2012). Influence of music on the solution of mathematical logical tasks. Human Physiology, 38(4), 354–360. https://doi.org/10.1134/S0362119712030097

119.

Perham

Currie

(2014). Does listening to preferred music improve reading comprehension performance? Applied Cognitive Psychology, 28(2), 279–284. https://doi.org/10.1002/acp.2994

120.

Perham

Sykora

(2012). Disliked music can be better for performance than liked music. Applied Cognitive Psychology, 26(4), 550–555. https://doi.org/10.1002/acp.2826

121.

Perham

Vizard

(2011). Can preference for background music mediate the irrelevant sound effect? Applied Cognitive Psychology, 25(4), 625–631. https://doi.org/10.1002/acp.1731

122.

Proverbio

A. M.

De Benedetto

(2018). Auditory enhancement of visual memory encoding is driven by emotional content of the auditory material and mediated by superior frontal cortex. Biological Psychology, 132, 164–175. https://doi.org/10.1016/j.biopsycho.2017.12.003

123.

Proverbio

A. M.

De Benedetto

Ferrari

M. V.

Ferrarini

(2018b). When listening to rain sounds boosts arithmetic ability. PLoS ONE, 13(2), e0192296. https://doi.org/10.1371/journal.pone.0192296

124.

Proverbio

A. M.

Nasi

V. L.

Arcari

L. A.

De Benedetto

Guardamagna

Gazzola

Zani

(2015). The effect of background music on episodic memory and autonomic responses: Listening to emotionally touching music enhances facial memory capacity. Scientific Reports, 5, 15219. https://doi.org/10.1038/srep15219

125.

Randall

W. M.

Rickard

N. S.

(2017). Reasons for personal music listening: A mobile experience sampling study of emotional outcomes. Psychology of Music, 45(4), 479–495. https://doi.org/10.1177/0305735616666939

126.

Ransdell

S. E.

Gilroy

(2001). The effects of background music on word processed writing. Computers in Human Behavior, 17(2), 141–148. https://doi.org/10.1016/S0747-5632(00)00043-1

127.

Rentfrow

P. J.

(2012). The role of music in everyday life: Current directions in the social psychology of music. Social and Personality Psychology Compass, 6(5), 402–416. https://doi.org/10.1111/j.1751-9004.2012.00434.x

128.

Reynolds

McClelland

Furnham

(2014). An investigation of cognitive test performance across conditions of silence, background noise and music as a function of neuroticism. Anxiety, Stress, & Coping, 27(4), 410–421. https://doi.org/10.1080/10615806.2013.864388

129.

Riby

L. M.

(2013). The joys of Spring: Changes in mental alertness and brain function. Experimental Psychology, 60(2), 71–79. https://doi.org/10.1027/1618-3169/a000166

130.

Ritter

S. M.

Ferguson

(2017). Happy creativity: Listening to happy music facilitates divergent thinking. PLoS ONE, 12(9), e0182210. https://doi.org/10.1371/journal.pone.0182210

131.

Röer

J. P.

Bell

Buchner

(2014). Evidence for habituation of the irrelevant-sound effect on serial recall. Memory & Cognition, 42(4), 609–621. https://doi.org/10.3758/s13421-013-0381-y

132.

Robb

S. L.

Carpenter

J. S.

Burns

D. S.

(2011). Reporting guidelines for music-based interventions. Journal of Health Psychology, 16(2), 342–352. https://doi.org/10.1177/1359105310374781

133.

Robinson

(2020, March 14). 9 tips to be productive when working at home during COVID-19. https://www.forbes.com/sites/bryanrobinson/2020/03/14/9-tips-to-be-productive-when-working-at-home-during-covid-19/.

134.

RStudio Team (2021). RStudio: Integrated development environment for R. RStudio, PBC. http://www.rstudio.com/.

135.

Salamé

Baddeley

(1989). Effects of background music on phonological short-term memory. The Quarterly Journal of Experimental Psychology Section A, 41(1), 107–122. https://doi.org/10.1080/14640748908402355

136.

Schellenberg

E. G.

Weiss

(2013). Music and cognitive abilities. In The psychology of music (pp. 499–550). https://doi.org/10.1016/b978-0-12-381460-9.00012-2

137.

Scherer

(2018). PropCIs: Various confidence interval methods for proportions. R package version 0.3-0. https://CRAN.R-project.org/package=PropCIs.

138.

Shih

Y.-N.

Huang

R.-H.

Chiang

H.-S.

(2009). Correlation between work concentration level and background music: A pilot study. Work, 33(3), 329–333. https://doi.org/10.3233/WOR-2009-0880

139.

Slowiaczek

M. L.

Clifton

(1980). Subvocalization and reading for meaning. Journal of Verbal Learning and Verbal Behavior, 19(5), 573–582. https://doi.org/10.1016/S0022-5371(80)90628-3

140.

Sloboda

J. A.

O’Neill

S. A.

Ivaldi

(2001). Functions of music in everyday life: An exploratory study using the experience sampling method. Musicae Scientiae, 5(1), 9–32. https://doi.org/10.1177/102986490100500102

141.

Sogin

D. W.

(1988). Effects of three different musical styles of background music on coding by college-age students. Perceptual and Motor Skills, 67(1), 275–280. https://doi.org/10.2466/pms.1988.67.1.275

142.

Spherion Staffing & Recruiting (2022). 6 tips for maintaining productivity. https://www.spherion.com/career-advice/general/6-tips-maintaining-productivity/.

143.

Streich

(2006). Music complexity: A multi-faceted description of audio content [Doctoral dissertation, Universitat Pompeu Fabra]. Barcelona.

144.

Steinberg

Cauffman

Woolard

Graham

Banich

(2009). Are adolescents less mature than adults? Minors’ access to abortion, the juvenile death penalty, and the alleged APA ‘flip-flop’. American Psychologist, 64(7), 583–594. https://doi.org/10.1037/a0014763

145.

Sterne

J. A. C.

Hernán

M. A.

Reeves

B. C.

Savović

Berkman

N. D.

Viswanathan

Henry

Altman

D. G.

Ansari

M. T.

Boutron

Carpenter

J. R.

Chan

A.-W.

Churchill

Deeks

J. J.

Hróbjartsson

Kirkham

Jüni

Loke

Y. K.

Pigott

T. D.

Ramsay

C. R.

Regidor

Rothstein

H. R.

Sandhu

Santaguida

P. L.

Schünemann

H. J.

Shea

Shrier

Tugwell

Turner

Valentine

J. C.

Waddington

Waters

Wells

G. A.

Whiting

P. F.

Higgins

J. P. T.

(2016). ROBINS-I: A tool for assessing risk of bias in non-randomized studies of interventions. BMJ, 355, i4919. https://doi.org/10.1136/bmj.i4919

146.

Sterne

J. A. C.

Higgins

J. P. T.

Elbers

R. G.

Reeves

B. C.

(2016). Risk of bias in non-randomized studies of interventions (ROBINS-I): Detailed guidance, updated 12 October 2016. http://www.riskofbias.info.

147.

Sterne

J. A. C.

Savović

Page

M. J.

Elbers

R. G.

Blencowe

N. S.

Boutron

Cates

C. J.

Cheng

H.-Y.

Corbett

M. S.

Eldridge

S. M.

Emberson

J. R.

Hernán

M. A.

Hopewell

Hróbjartsson

Junqueira

D. R.

Jüni

Kirkham

J. J.

Lasserson

McAleenan

Reeves

B. C.

Shepperd

Shrier

Stewart

L. A.

Tilling

White

I. R.

Whiting

P. F.

Higgins

J. P. T.

(2019). RoB 2: A revised tool for assessing risk of bias in randomised trials. BMJ, 366, l4898. https://doi.org/10.1136/bmj.l4898

148.

Stratton

V. N.

Zalanowski

A. H.

(2003). Daily music listening habits in college students: Related moods and activities. Psychology and Education: An Interdisciplinary Journal, 40(1), 1–11.

149.

Taylor

J. M.

Rowe

B. J.

(2012). The “Mozart Effect” and the mathematical connection. Journal of College Reading and Learning, 42(2), 51–66. https://doi.org/10.1080/10790195.2012.10850354

150.

Thaut

M. H.

de l'Etoile

S. K.

(1993). The effects of music on mood state-dependent recall. Journal of Music Therapy, 30(2), 70–80. https://doi.org/10.1093/jmt/30.2.70

151.

Thayer

R. E.

(1990). The biopsychology of mood and arousal. Oxford University Press.

152.

Thompson

W. F.

Schellenberg

E. G.

Letnic

A. K.

(2012). Fast and loud background music disrupts reading comprehension. Psychology of Music, 40(6), 700–708. https://doi.org/10.1177/0305735611400173

153.

Threadgold

Marsh

J. E.

McLatchie

Ball

L. J.

(2019). Background music stints creativity: Evidence from compound remote associate tasks. Applied Cognitive Psychology, 33(5), 873–888. https://doi.org/10.1002/acp.3532

154.

Tulving

(1979). Relation between encoding specificity and levels of processing. In Cermak

L. S.

Craik

F. I. M.

(Eds.), Levels of processing in human memory (1st ed., pp. 405–428). Psychology Press.

155.

Vasilev

M. R.

Kirkby

J. A.

Angele

(2018). Auditory distraction during reading: A Bayesian meta-analysis of a continuing controversy. Perspectives on Psychological Science, 13(5), 567–597. https://doi.org/10.1177/1745691617747398

156.

Verga

Bigand

Kotz

S. A.

(2015). Play along: Effects of music and social interaction on word learning. Frontiers in Psychology, 6, 1316. https://doi.org/10.3389/fpsyg.2015.01316

157.

Wells

Littell

J. H.

(2009). Study quality assessment in systematic reviews of research on intervention effects. Research on Social Work Practice, 19(1), 52–62. https://doi.org/10.1177/1049731508317278

158.

Whiting

Wolff

Mallett

Simera

Savović

(2017). A proposed framework for developing quality assessment tools. Systematic Reviews, 6, 204. https://doi.org/10.1186/s13643-017-0604-6

159.

Wolf

R. H.

Weiner

F. F.

(1972). Effects of four noise conditions on arithmetic performance. Perceptual and Motor Skills, 35(3), 928–930. https://doi.org/10.2466/pms.1972.35.3.928

160.

Wolfe

D. E.

(1983). Effects of music loudness on task performance and self-report of college-aged students. Journal of Research in Music Education, 31(3), 191–201. https://doi.org/10.2307/3345172

161.

Wong

(2020, June 29). Stanford research provides a snapshot of a new working-from-home economy. Stanford News. https://news.stanford.edu/2020/06/29/snapshot-new-working-home-economy/.

162.

Woo

E. W.

Kanachi

(2006). The effects of music type and volume on short-term memory. Tohoku Psychologica Folia, 64, 68–76. http://hdl.handle.net/10097/54719

163.

Wu

C.-C.

Shih

Y.-N.

(2019). The effects of background music on the work attention performance between musicians and non-musicians. International Journal of Occupational Safety and Ergonomics, 27(1), 1–5. https://doi.org/10.1080/10803548.2018.1558854

164.

Xiao

Liu

Chen

(2020). The influence of music tempo on inhibitory control: An ERP study. Frontiers in Behavioral Neuroscience, 14, 48. https://doi.org/10.3389/fnbeh.2020.00048

165.

Zhu

Zhang

Ding

Zhou

(2009). Crossmodal effects of Guqin and piano music on selective attention: An event-related potential study. Neuroscience Letters, 466(1), 21–26. https://doi.org/10.1016/j.neulet.2009.09.026

166.

Zhu

Zhao

Zhang

Ding

Liu

Zhou

(2008). The influence of Mozart's sonata K.448 on visual attention: An ERPs study. Neuroscience Letters, 434(1), 35–40. https://doi.org/10.1016/j.neulet.2008.01.043

