Abstract
Music tempo affects listeners’ mental state, especially arousal levels. However, several studies have demonstrated that the effect of music tempo on arousal while listening to music can be modulated by individual differences, such as the pace of mental activity, that is, spontaneous motor tempo (SMT). Thus, SMT is a candidate factor that affects the relationship between music tempo and arousal. Here, we conducted a psychological experiment to investigate how SMT modulates the effect of music tempo on listeners’ arousal levels. First, the participants were required to tap their finger at their preferred tempo to identify the SMT of each participant. Next, the participants listened to music and then rated their arousal levels on a nine-point scale. A linear mixed model analysis revealed a significant effect of the interaction between music tempo and preferred tapping tempo on arousal levels. This finding indicates that SMT modulated the effect of music tempo on arousal levels while listening to music; the faster the SMT of a listener, the greater the impact of music tempo on arousal levels.
Humans listen to music not only for musical enjoyment but also for the benefits of music. One of the most important benefits of music is control of the listeners’ mental state, especially arousal levels (Gaston, 1951). We occasionally listen to music when we want to make our arousal levels match our intended arousal levels. For example, people may listen to music with a faster tempo to increase their arousal levels and vice versa. However, how listening to music affects listeners’ arousal levels is not fully understood. Recently, music has been used in clinical treatments to control patient arousal levels (Gold et al., 2006; Maratos et al., 2008). Nonetheless, not all patients can respond to these treatments (Vink et al., 2003). In other words, the impact of music changes depending on individual differences. Thus, to enhance the benefits for music listeners, it is important to determine how music affects arousal levels and, more specifically, which factors contribute to individual differences in the impact of music on arousal levels.
Music tempo is one of the factors affecting listeners’ arousal (e.g., Gagnon & Peretz, 2003; Hevner, 1937). When music was manipulated in terms of tempo only, faster tempi induced a higher arousal level than slower tempi (e.g., Peretz et al., 1998; Van der Zwaag et al., 2011). However, the arousal levels induced by listening to music differ among listeners even when the same music is presented at the same tempo (Mas-Herrero et al., 2012; Salimpoor et al., 2009). In addition, it has been reported that music therapy, in which music is used to modulate patients’ arousal (Gold et al., 2006; Maratos et al., 2008), is not effective for all patients (Vink et al., 2003), indicating that there are individual differences in the effect of music tempo on arousal regulation. Thus, it is important to clarify how music tempo modulates arousal levels in each listener not only to understand the role of music in regulating emotion but also to apply music therapy effectively.
Regarding individual differences in the effect of music tempo, it has been reported that the preferred music tempo is related to the spontaneous motor tempo (SMT) of each listener (Hine et al., 2022). Preferred music tempo is defined as a specific tempo preferred by the listener regardless of the original music tempo, intended by the composer or performer. SMT is an inherent, self-determined pace of movements that align with an individual’s preferred and instinctive pace of action (Delevoye-Turrell et al., 2014). SMT is usually measured by the natural speed of tapping and is considered to reflect the pace of an internal clock (e.g., Fraisse, 1982; Hammerschmidt et al., 2021; McAuley, 2010). Each piece of music has a specific preferred tempo (Iwanaga & Tsukamoto, 1998). In a study where musical pieces were presented at original music tempi and at manipulated tempi at 20% level increments from 40 to 223 bpm, listeners showed differences in preferred tempo of music depending on familiarity. For familiar music, listeners preferred original music tempo, whereas for unfamiliar music, a moderate tempo was preferred (Iwanaga & Tsukamoto, 1998). In addition, the SMT for each listener modulates the preferred music tempo even when the same piece of music is played, regardless of familiarity (Hine et al., 2022). Furthermore, the original music tempo affects listeners’ arousal levels (Husain et al., 2002). If this is the case, it is possible that the effect of original music tempo on listeners’ arousal levels will be modulated by SMT. However, this possibility has not yet been directly assessed.
In the current study, we investigated how SMT modulates the effect of original music tempo on listeners’ arousal levels. First, we estimated the SMT of each participant by asking them to tap their finger at their preferred tempo. Next, participants listened to a piece of music and then rated their arousal level on a nine-point scale. While participants listened to the music, the skin conductance level (SCL) and pulse rate (PR), which are related to arousal levels (Lang et al., 1993), were recorded to evaluate how autonomic tone and SMT affect arousal levels. Previous studies have used SCL and heart rate, which is related to PR (Schäfer & Vagedes, 2013), to assess the effect of music on arousal levels (Etzel et al., 2006; Gomez & Danuser, 2004, 2007). The collected data were analysed with a linear mixed model to determine how the original music tempo, SMT, SCL, and PR predicted the arousal level while listening to music. Although both SCL and skin conductance response (SCR) reflect sympathetic activity, SCL is thought to indicate broad shifts in autonomic processes, spanning from several tens of seconds to several minutes, whereas SCR is thought to signal quicker alterations, occurring within seconds (Braithwaite et al., 2013). In previous studies, SCL has been used to investigate the relationship between situational factors and participants’ psychological factors over an extended period (e.g., Lazarus et al., 1962; Speisman et al., 1964). In contrast, SCR has been used to assess the physiological response that occurs before an individual becomes aware or takes action (e.g., Bechara et al., 1997). In the current study, we investigated how physiological responses relate to arousal levels across music excerpts (30 s) of various tempi, rather than how physiological responses changed before the behavioural response (the arousal rating). Thus, SCL was analysed in this study.
Materials and methods
Participants
We calculated the sample size using the samplesize_mixed function in R (from the sjstats package) with an effect size (Cohen’s d) = 0.25 and power (1 − β) = 0.83. A total sample size of 627 was recommended. As described below, 24 pieces of music were prepared, and all music pieces were used once for each participant. Thus, we decided to recruit 26 adults so that the number of data points (24 × 26 = 624) approached the recommended number. Twenty-six men (aged 20–27 years, mean age = 22.8 years, SD = 1.4) participated in the current experiment. All participants had normal hearing and normal or corrected-to-normal vision. Informed consent was obtained from all participants. The experimental procedures were approved by the institutional review board of Toyohashi University of Technology (approval no. H31-01). All experiments were conducted in accordance with the Declaration of Helsinki. Data from five participants were excluded from the analysis due to errors in data collection; consequently, data from 21 participants were included in the analysis.
Stimulus
Twenty-four pieces of music (Table 1) were chosen from the Classical Piano Midi Page (2018). These pieces were not recorded from standard human performances but were instead constructed by inputting the required musical attributes into a MIDI sequencer, note by note. All the chosen pieces belonged to the classical genre and were designed to be played solo on a piano. Several experimenters verified that there were no alterations in tonal quality or tempo during the course of each piece of music. In cases where a beat did not correspond to a quarter note in the MIDI file, the tempo of the music was calculated with a quarter note as a beat. The original music tempo was between 42 and 199 beats per minute (bpm), in which one beat was defined as a quarter note (crotchet beat in the sheet music). Musical data were analysed with the MIDI Toolbox (Eerola & Toiviainen, 2004) running on MATLAB R2018b (The Mathworks, Natick, MA, USA).
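The quarter-note convention described above can be illustrated with a short sketch. Standard MIDI files store tempo as microseconds per quarter note, so a quarter-note bpm follows directly from that value. The function name below is hypothetical, and the sketch does not reproduce the MIDI Toolbox analysis that was actually used in MATLAB.

```python
MICROSECONDS_PER_MINUTE = 60_000_000


def quarter_note_bpm(us_per_quarter: int) -> float:
    """Convert a MIDI tempo value (microseconds per quarter note) to bpm.

    One beat is taken to be a quarter note, matching the convention in the text.
    The function name is an illustrative assumption, not part of any toolbox.
    """
    if us_per_quarter <= 0:
        raise ValueError("tempo value must be positive")
    return MICROSECONDS_PER_MINUTE / us_per_quarter


# 500,000 microseconds per quarter note is the MIDI default tempo.
print(quarter_note_bpm(500_000))  # 120.0
```

Counting beats in quarter notes regardless of the notated metre keeps the tempo values comparable across pieces with different time signatures.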
Table 1. List of Music Used in the Current Study.
Procedure
Tapping task
First, the participants completed a tapping task (Figure 1, left). The procedure was similar to that of Hine et al. (2022). Participants were asked to tap the index finger of their dominant hand on an iPad screen (Apple), which recorded the tapping rate, at the tempo that they felt was the most comfortable, natural, and preferred at that moment. In addition, they were told not to imagine a specific piece of music or song during the tapping task, and no stimuli were presented on the screen or speaker. The data were collected in two trials of 30 s each. Between trials, participants were allowed to take as long of a break as they desired.

Figure 1. Experimental Design (left: tapping task; centre: music listening task; right: familiarity judgement task).
Music listening task
Subsequently, a music listening task was conducted (Figure 1, centre). The music was processed with software developed in-house on a personal computer and presented to the participants over headphones (MDR-M1ST, SONY). For the music listening task, the SCL and PR were recorded with a BIOPAC MP-160 at a 200-Hz sampling rate (BIOPAC Systems Inc., Goleta, CA, USA). The SCL was measured by an electrodermal activity amplifier (EDA100C, BIOPAC Systems Inc.). PR was measured by using a photoplethysmography (PPG) device (a PPG100C amplifier, BIOPAC Systems Inc.). PPG is a method used to measure changes in blood volume by utilizing an infrared light sensor positioned on the skin’s surface (Allen, 2007; Elgendi, 2012). The SCL and PR were acquired with sensors attached to the distal phalanges of the second and third fingers of the participant’s left hand. The data were collected with AcqKnowledge 5.0 software.
At the beginning of a trial, the participants were asked to sit in a seat and remain stationary for 30 s. In our preliminary experiment to validate the methodology, participants were surprised by the sudden presentation of music after 30 s of silence, and their surprise affected their responses, especially the physiological data. To avoid participant surprise at the sudden presentation of music, a countdown (numbers changing from 10 to 1, 1 s for each number) was presented on the display 20 s after the beginning of the trial. Once the countdown reached zero, music was presented for 30 s. To prevent surprise due to a sudden stop, the music faded out in the last 5 s. After listening to the music, the participants rated the arousal level they had experienced while listening on a nine-point scale (1 = calm to 9 = excited) using the hand that was not attached to the sensors. In addition, valence (1 = negative to 9 = positive), engagement (1 = bored to 9 = amused), and suspense (1 = relaxed to 9 = tense) were evaluated, and these results are presented in the Supplementary Material. During ratings of arousal and valence, the Self-Assessment Manikin rating scale (Bradley & Lang, 1994) was presented on the display. The evaluation order was always the same: arousal, valence, engagement, and suspense. After the evaluations were completed, the next trial began. Twenty-four pieces of music were presented in a randomized order. Data from five participants were excluded from further analysis because we failed to record SCL and/or PR data.
Familiarity judgement task
Finally, all participants completed a familiarity judgement task (Figure 1, right). To check whether participants had known the music pieces before the experiment, they rated their familiarity with each music piece on a three-point scale (unfamiliar, neutral, or familiar), regardless of whether the presented tempo was familiar to them. The participants wrote their rating on an answer sheet and then pressed the enter key to advance to the next piece of music. The music piece was presented repeatedly from the beginning until the enter key was pressed. The participants indicated their familiarity with all 24 pieces of music, which were presented in a randomized order. This order differed from the order in which the participants heard the pieces in the music listening task.
Physiological signals processing
A MATLAB programme built by the experimenters was used to pre-process and clean the physiological signals. Regarding the SCL, AcqKnowledge 5.0 was used to extract the SCL from electrodermal activity. Recorded data were downsampled to 100 Hz. Then, the baseline SCL was calculated as the median value, which is robust to outliers, during the 15 s before the stimulus was presented in each trial. The SCL during the music presentation was calculated by subtracting the baseline SCL from the downsampled SCL. Finally, the SCL was normalized following Lykken and Venables (1971) and Van Den Bosch et al. (2013), and the average normalized SCL for each trial was calculated. This value was used for analysis.
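The baseline correction and normalization steps can be sketched as follows. This is a minimal illustration and not the authors' MATLAB programme. In particular, the normalization is assumed here to be a range correction of the form (SCL − SCL_min) / (SCL_max − SCL_min), and the choice of minimum and maximum (for example, per participant across the session) is an assumption. The function names and the toy signal are hypothetical.

```python
from statistics import mean, median


def baseline_correct(signal, fs, stim_onset_s, baseline_s=15.0):
    """Subtract the median of the pre-stimulus window from the post-onset samples."""
    onset = int(round(stim_onset_s * fs))
    start = onset - int(round(baseline_s * fs))
    if start < 0:
        raise ValueError("not enough samples before stimulus onset")
    baseline = median(signal[start:onset])
    return [x - baseline for x in signal[onset:]]


def range_normalize(values, lo, hi):
    """Range correction (assumed form): map lo -> 0 and hi -> 1."""
    if hi <= lo:
        raise ValueError("hi must exceed lo")
    return [(v - lo) / (hi - lo) for v in values]


# Toy signal sampled at 1 Hz: 15 s of baseline, then 3 s of "music" (hypothetical values).
scl = [2.0] * 15 + [2.5, 3.0, 3.5]
corrected = baseline_correct(scl, fs=1, stim_onset_s=15)
normalized = range_normalize(corrected, lo=0.0, hi=2.0)  # lo/hi are assumed bounds
trial_value = mean(normalized)
print(corrected, normalized, trial_value)
```

Using the median rather than the mean for the baseline means that a brief artefact in the 15-s window has little influence on the correction.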
Regarding PR, the PPG data were downsampled to 100 Hz. Next, a median filter was applied for smoothing using the medfilt1 function in MATLAB. Afterwards, the filloutliers function in MATLAB was used to detect outliers and replace them with the nearest non-outlier value. Peaks were detected with the findpeaks function in MATLAB, and the peak-to-peak intervals were calculated. Then, the PR was obtained from the peak-to-peak intervals. The baseline PR was calculated as the median value during the 15 s before the stimulus was presented in each trial. The PR during the music presentation was calculated by subtracting the baseline PR from the PR. The average PR for each trial was calculated. This value was used for analysis. Default parameter values were used for all functions unless otherwise specified.
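The route from a PPG waveform to a pulse rate can be sketched in a few lines. This is a simplified stand-in for the MATLAB pipeline, not a reimplementation of medfilt1 or findpeaks. The outlier-replacement step is omitted, the endpoints of the median filter are left unfiltered, and all function names, the height threshold, and the synthetic 1.25-Hz signal are assumptions made for illustration.

```python
import math
from statistics import mean, median


def median_filter3(x):
    """3-point running median; endpoints are left unchanged (a simplification)."""
    y = list(x)
    for i in range(1, len(x) - 1):
        y[i] = median(x[i - 1:i + 2])
    return y


def find_peaks_simple(x, min_height):
    """Indices where the signal rises into a sample and does not rise after it."""
    return [i for i in range(1, len(x) - 1)
            if x[i - 1] < x[i] >= x[i + 1] and x[i] > min_height]


def pulse_rate_bpm(ppg, fs, min_height=0.5):
    """Mean pulse rate (bpm) from peak-to-peak intervals of a smoothed PPG trace."""
    peaks = find_peaks_simple(median_filter3(ppg), min_height)
    intervals_s = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    if not intervals_s:
        raise ValueError("fewer than two peaks detected")
    return mean(60.0 / d for d in intervals_s)


# Synthetic pulse wave: 1.25 Hz (75 beats per minute), sampled at 100 Hz for 10 s.
fs = 100
ppg = [math.sin(2 * math.pi * 1.25 * n / fs) for n in range(10 * fs)]
pr = pulse_rate_bpm(ppg, fs)
print(round(pr, 1))
```

The baseline-corrected PR for a trial would then be this value minus the median PR of the 15-s pre-stimulus window, as described above.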
Results
Tempo in the tapping task
Tempo was calculated in bpm, defined as the number of beats detected per minute. The tapping task in the current study consisted of two trials of 30 s each, and the consistency between the two trials was assessed. For each trial, the preferred tapping tempo was calculated as twice the number of taps in that 30-s trial, and the correlation between the tempi in the two trials was analysed. The average of the preferred tapping tempo in the first 30-s trial was 107.2 bpm (SD = 24.7). The average in the second 30-s trial was 110.0 bpm (SD = 25.1). The correlation between the two trials was 0.99 (p < .001). Based on these results, we concluded that the two trials were consistent, and the preferred tapping tempo was calculated as the total number of taps across the two trials (60 s in total). The average of the preferred tapping tempo was 110.0 bpm (SD = 24.2), with a range of 59.3 to 170.2 bpm.
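The tempo calculation and the consistency check can be illustrated with a small sketch. The tap counts below are hypothetical and are not the study data, and the function names are illustrative assumptions.

```python
from math import sqrt


def trial_bpm(n_taps, duration_s=30.0):
    """Taps per minute for one trial (twice the tap count for a 30-s trial)."""
    return n_taps * 60.0 / duration_s


def pooled_bpm(taps_trial1, taps_trial2, duration_s=30.0):
    """Taps per minute pooled over two trials of equal duration."""
    return (taps_trial1 + taps_trial2) * 60.0 / (2 * duration_s)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical tap counts (trial 1, trial 2) for four participants.
taps = [(54, 55), (30, 31), (85, 84), (48, 50)]
bpm_trial1 = [trial_bpm(a) for a, _ in taps]
bpm_trial2 = [trial_bpm(b) for _, b in taps]
consistency = pearson_r(bpm_trial1, bpm_trial2)
smt = [pooled_bpm(a, b) for a, b in taps]
print(bpm_trial1, bpm_trial2, smt)
```

Because the two trials together last 60 s, the pooled tempo in bpm equals the total number of taps across both trials.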
Arousal levels in the music listening task
One study reported that familiarity with each piece of music affects arousal levels (Van Den Bosch et al., 2013). Therefore, a linear mixed model with familiarity as a fixed factor and random intercepts for music piece and participant ID was constructed to assess whether familiarity predicted arousal levels. The model was fitted with the lme4 package (Bates et al., 2007) in R. The analysis showed no significant effect of familiarity, t(487.82) = −0.21, p = .831. Thus, there was no evidence that familiarity affected arousal levels, and we did not include familiarity as a fixed effect in further analyses.
To assess whether the original music tempo, SMT, SCL, and PR predicted arousal levels, we constructed an initial model that included original music tempo, preferred tapping tempo, SCL, and PR as fixed factors, along with their interactions, and random intercepts for music piece and participant ID. Then, the best-fitting model was selected using the step function in R with the lmerTest package (Kuznetsova et al., 2015). The selected model included original music tempo, preferred tapping tempo, PR, the interaction between original music tempo and preferred tapping tempo, and the interaction between original music tempo and PR as fixed factors, with random intercepts for music piece and participant ID (conditional R² = 0.407, marginal R² = 0.263) [Supplemental Appendix 1(a)]. The linear mixed model analysis revealed a significant effect of preferred tapping tempo on arousal levels, t(96.16) = −1.99, p = .049. In addition, there was a significant effect of PR, t(484.9) = 2.65, p = .008. However, the effect of original music tempo was not significant, t(429.5) = 0.71, p = .477. The interactions between original music tempo and preferred tapping tempo and between original music tempo and PR were significant, t(459.9) = 2.45, p = .015, and t(483.1) = −2.46, p = .014, respectively. The significant interaction between original music tempo and preferred tapping tempo indicates that the effect of the original music tempo on arousal levels increased with faster preferred tapping tempi. Figure 2 shows the relationship between original music tempo and arousal levels in four participants, two with faster preferred tapping tempi and two with slower preferred tapping tempi. The relationships for each participant are reported in Supplemental Appendix 2. In addition, the significant interaction between original music tempo and PR indicates that PR is related to the effect of the original music tempo on arousal levels.
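For readers who prefer an explicit form, the selected model can be written out as below. The notation is ours and is only a sketch of the structure described in the text: i indexes participants, j indexes music pieces, and the symbols β, u, v, and ε are introduced here for illustration.

```latex
\begin{aligned}
\text{Arousal}_{ij} &= \beta_0 + \beta_1\,\text{Tempo}_j + \beta_2\,\text{SMT}_i + \beta_3\,\text{PR}_{ij} \\
&\quad + \beta_4\,(\text{Tempo}_j \times \text{SMT}_i) + \beta_5\,(\text{Tempo}_j \times \text{PR}_{ij}) + u_i + v_j + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad v_j \sim \mathcal{N}(0, \sigma_v^2), \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \\
\frac{\partial\,\text{Arousal}_{ij}}{\partial\,\text{Tempo}_j} &= \beta_1 + \beta_4\,\text{SMT}_i + \beta_5\,\text{PR}_{ij}.
\end{aligned}
```

Under this notation, the last line shows why a positive estimate of β4 (the tempo by preferred tapping tempo interaction, t = 2.45) corresponds to a steeper tempo-arousal slope for listeners with a faster preferred tapping tempo, whereas the negative estimate of β5 (t = −2.46) corresponds to a shallower slope at higher PR.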

Figure 2. The Relationship Between Original Music Tempo and Arousal Levels, As Shown With Representative Data From Four Participants: Two With Faster [(a) 170.2 and (b) 158.6 bpm] Preferred Tapping Tempos and Two With Slower [(c) 59.3 and (d) 86.6 bpm] Preferred Tapping Tempos.
In addition to the arousal level ratings (as presented above), participants also reported valence, engagement, and suspense ratings. The results of these ratings are reported in Supplemental Appendix 1(b) to (d).
Discussion
The aim of this study was to investigate how SMT modulates the effect of the original music tempo on listeners’ arousal levels. In addition, we aimed to evaluate how autonomic tone, indexed by the SCL and PR, affects arousal levels. There were two main findings.
First, SMT influenced the effect of the original music tempo on arousal levels; the faster the SMT was, the stronger the effect of the original music tempo on arousal levels. In other words, the arousal-inducing effect of the original music tempo increased with faster SMT values. It has been reported that subjective arousal levels are higher when listening to a piece of music with a faster tempo than when listening to a piece of music with a slower tempo (Droit-Volet, 2013). In addition to the effect of the original music tempo, we showed that SMT contributes to subjective arousal levels. A previous study demonstrated that SMT influenced the effect of the original music tempo on the preferred music tempo (Hine et al., 2022). Similar to the preferred music tempo, subjective arousal levels while listening to a piece of music were also affected by SMT. It has been reported that participants with higher arousal levels tend to have a faster SMT (Hammerschmidt et al., 2021). This would be caused by the greater impact of arousal on a participant with a faster SMT. This perspective might explain why the impact of the original music tempo was greater for participants with faster SMT values than for participants with slower SMT values in the current study.
Second, autonomic activity, especially PR, affected the relationship between the original music tempo and arousal levels. The interaction between original music tempo and PR significantly influenced arousal levels. It has been reported that PR is a sufficiently accurate estimate of heart rate in healthy participants at rest (Schäfer & Vagedes, 2013). Thus, PR could reflect heart rate in estimations of arousal levels in the current study. Heart rate is often measured as an index of the activation of the autonomic nervous system (Zygmunt & Stanczyk, 2010). Kreibig (2010) argued that the emotional state while listening to music relates to the autonomic nervous system. In addition, it has been reported that heart rate is consistent with subjective arousal levels (Hodges, 2010). Moreover, previous studies have shown that a faster original music tempo increased arousal levels and heart rate to a greater extent (Gomez and Danuser, 2004, 2007). Based on previous studies and the current results, PR might reflect subjective arousal levels. Another possibility is that PR could increase arousal levels. It has been reported that emotional states may be changed by pseudo-heartbeats (Xu et al., 2021). Consequently, PR might modulate the effect of original music tempo on arousal levels.
The present study showed that the preferred tapping tempo and PR influenced the effect of the original music tempo on arousal levels. However, it is still unclear how the preferred tapping tempo and PR modulated this effect. We therefore conducted a further analysis in which the correlation between the original music tempo and the arousal rating (r) and the corresponding R² were calculated for each participant. Then, the correlation between the preferred tapping tempo and the per-participant r was analysed; it was not significant (r = .19, p = .41). The correlation between the preferred tapping tempo and the per-participant R² was also not significant (r = .22, p = .34). Thus, it could not be concluded that arousal levels increased more steeply with the original music tempo as the preferred tapping tempo increased, nor that the relationship between the original music tempo and the arousal rating became tighter with increasing preferred tapping tempo. The number of data points may not have been sufficient for the correlation analysis, and there remains a possibility that participants with a faster preferred tapping tempo show a larger increase in arousal rating with increasing original music tempo. Further studies should clarify how the preferred tapping tempo modulates the effect of the original music tempo on arousal ratings. Future studies also need to assess how PR modulates the effect of original music tempo on arousal ratings. This could be investigated in an experiment in which a wider range of PR data for different music pieces of the same tempo is collected together with arousal ratings. Finally, although the sample size in the current study was determined based on a power analysis, the actual number of data points was lower than planned because of equipment malfunctions. Considering the instability of the equipment’s operation, the planned sample size may have needed to be slightly larger than the value calculated in the power analysis.
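The two-stage correlation analysis described above can be sketched as follows: a tempo-arousal correlation is computed within each participant, and these per-participant values are then correlated with the preferred tapping tempo. All numbers below are hypothetical and chosen only to make the steps visible. For a simple linear fit, R² is taken to be the square of r.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: the same four tempi (bpm) rated by three participants.
tempi = [60, 90, 120, 150]
ratings = {
    "A": [2, 4, 6, 8],  # arousal rises steadily with tempo
    "B": [5, 5, 6, 5],  # arousal barely depends on tempo
    "C": [8, 6, 4, 2],  # arousal falls with tempo
}
tapping_tempo = {"A": 150.0, "B": 100.0, "C": 70.0}  # hypothetical SMT values (bpm)

# Stage 1: per-participant correlation and R^2 between tempo and arousal.
per_r = {p: pearson_r(tempi, y) for p, y in ratings.items()}
per_r2 = {p: r ** 2 for p, r in per_r.items()}

# Stage 2: correlate the per-participant r values with the tapping tempo.
ids = sorted(ratings)
stage2_r = pearson_r([tapping_tempo[p] for p in ids], [per_r[p] for p in ids])
print(per_r, per_r2, stage2_r)
```

With only one value per participant entering the second stage, the analysis has far fewer data points than the mixed model, which is consistent with the caution expressed above about its statistical power.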
In the current study, data on music skill or expertise were not collected. A previous study showed that musicians rated their arousal levels as higher than those of nonmusicians, especially for faster tempo music (Liu et al., 2018). From this perspective, the effect of the original music tempo might differ between musicians and nonmusicians. In further studies, music skill or expertise should be considered when measuring arousal levels while listening to music. Another limitation is that the music pieces used in the current study all involved classical music. Dillman Carpentier and Potter (2007) investigated the effect of playing tempo and music genres on the SCL. The SCL increased with faster playing tempi for classical music, whereas the SCL decreased with faster playing tempi for rock music. In addition, Rentfrow et al. (2012) argued that five genre-free factors affect music preference, which is considered to be related to arousal. Based on these previous studies, there might be both generalized and genre-specific factors that influence listeners’ arousal while listening to a piece of music. In the current study, we could not assess whether musical genre affected the results because we included only classical music. In future studies, a variety of musical genres should be included to clarify which factors are generalized and which are genre-specific regarding arousal levels while listening to music.
The current study showed that SMT, which reflects the pace of mental activity (McAuley, 2010), modulates the relationship between the original music tempo and arousal levels. This finding indicates that musical attributes (e.g., tempo) as well as listener characteristics influenced listeners’ arousal levels while listening to music. Based on the current results, a faster original music tempo should be used to induce a higher arousal level for listeners with slower SMTs, as the impact of the tempo on arousal is smaller for them compared to listeners with faster SMTs. Recently, most music player devices have a function to alter the tempo at which music is played. Even when the same music piece is played, adjusting the playing tempo according to listener SMT can induce specific changes in arousal in the listeners. This idea could be applied in music therapy. Music therapy is not effective for all patients (Vink et al., 2003). Based on the current results, if the therapist is attentive to the SMT and adjusts the playing tempo to patient SMT, music should have a great impact on inducing a specific arousal level. In this manner, the findings of the current study enhance our understanding of the role of music in regulating emotions and have the potential to contribute to the application of effective techniques such as music therapy.
Supplemental Material
Supplemental material, sj-docx-1-pom-10.1177_03057356241311288 for Spontaneous motor tempo modulates the effect of music tempo on arousal levels by Kyoko Hine, Koki Abe and Shigeki Nakauchi in Psychology of Music
Footnotes
Author contributions
K.H., K.A., and S.N. developed the study concept. K.H. and K.A. prepared the materials. K.A. collected and analysed the behavioural data. K.H. wrote the manuscript. All the authors discussed the results and commented on the manuscript.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by JSPS KAKENHI (Grant No. 20H05956 to S.N., and JP22K12218 to K.H.).
Supplemental material
Supplemental material for this article is available online.
References
