Abstract
Our perception of the duration of a piece of music is related to its tempo. When listening to music, absolute durations may seem longer as the tempo (the rate of an underlying pulse or beat) increases. Yet, the perception of tempo itself is not absolute. In a study on perceived tempo, participants were able to distinguish between different tempo-shifted versions of the same song (±5 beats per minute [BPM]), yet their tempo ratings did not match the actual BPM rates; this finding was called the tempo anchoring effect (TAE). In order to gain further insights into the relation between duration and tempo perception in music, the present study investigated the effect of musical tempo on two different duration measures, to see if there is an analog to the TAE in duration perception. Using a repeated-measures design, 32 participants (16 musicians) were randomly presented with instrumental excerpts of Disco songs at the original tempi and in tempo-shifted versions. The tasks were (a) to reproduce the absolute duration of each stimulus (14–20 s), (b) to estimate the absolute duration of the stimuli in seconds, and (c) to rate the perceived tempo. Results show that duration reproductions were longer with faster tempi, yet no such effect was found for duration estimations. Thus, lower-level reproductions were affected by the tempo, but higher-level estimations were not. The tempo-shifted versions showed no effect on either duration measure, suggesting that the duration-lengthening effect requires a tempo difference of at least 20 BPM, depending on the duration measure. Results of perceived tempo replicated the typical rating pattern of the TAE, but this was not found in duration measures. The roles of spontaneous motor tempo and musical experience are discussed, and implications for future studies are given.
Keywords
Introduction
In perceiving the timing of events, we can attend to the duration of individual events as well as the rate at which a series of events occurs; in other words, the familiar distinction between interval-based and beat-based timing and time perception. In a musical context, event rate is known as tempo, often indexed in beats per minute (BPM). Duration and event rate in music are closely linked, as the tempo of rhythmic sequences causes the perception of these “musically filled” durations to be distorted (Hammerschmidt & Wöllner, 2020; Droit-Volet et al., 2013). Faster rhythmic tempi lead to time seeming to pass more quickly, as equivalent durations are judged to be longer in comparison to the same time interval filled with a rhythm at a slower tempo (Ortega & López, 2008). However, the nature of the duration-lengthening effect in music is not well understood. For example, it is unclear if the duration-lengthening effect of faster tempi is found over different tempo ranges, and what the minimum tempo difference should be in order for this effect to occur. Furthermore, recent studies on tempo perception in music have shown that the judgment of perceived tempo can itself be distorted, since tempo ratings do not coincide with the actual tempo of the music when the same song is presented in multiple versions (London et al., 2016, 2019a, 2019b). The underlying mechanism(s) causing these effects are still under debate. Therefore, the current study aims at gaining further insights into the perception of duration and tempo, and the relationship between them, when listening to music by investigating these distortions in a combined-task experiment.
In cognitive psychology, the most commonly used paradigm in investigating duration perception is
Studies on duration perception in the context of music listening have shown that the tempo of music, measured as BPM, influences perceived duration. In a recent study on the influence of sensorimotor synchronization (SMS) to different metrical levels on perceived time in musical rhythmic patterns (Hammerschmidt & Wöllner, 2020), participants’ duration estimations were influenced by the tempo of the patterns as well as their synchronization rate. Over a range of 83 to 150 BPM, faster tempi caused perceived time to pass by more quickly compared with slower tempi (duration range: 12.8–23.13 s). Likewise, SMS, assessed by finger tapping, led to shorter duration estimations than listening-only, and SMS to a higher metrical level (longer inter-tap intervals) shortened perceived durations, which is consistent with the duration-lengthening effect of increased tempo, highlighting the role of motor activity and attention on duration perception (Zakay & Block, 1996). Furthermore, no differences were found for duration estimations between musicians and nonmusicians, suggesting that duration estimations of musical stimuli in the supra-second range are not influenced by musical training. Yet this finding does not rule out the possibility that musical training affects other measures such as reproductions, since this measure depends more on working memory and accurate encoding of musical features, including tempo. A study on the influence of emotional valence and arousal of music on duration perception, which included fast and slow tempo conditions as well, yielded similar results (Droit-Volet et al., 2013). Using a temporal bisection task, participants judged durations with music in the fast tempo conditions to be longer compared with slow tempo conditions (duration range: 0.5–6.8 s). Furthermore, the fast music was more arousing, showing the close link between musical tempo and physiological activation (i.e. arousal).
The time distortion (duration-lengthening effect) of tempo has also been shown for simple and complex rhythmic sequences in auditory as well as visual stimuli (Droit-Volet & Wearden, 2002; Ortega & López, 2008; Treisman et al., 1990; Wöllner et al., 2018). Accordingly, in a study on perceived waiting time and background music, participants estimated perceived waiting time (4–15 min) to be longer with fast music compared with slow music (Oakes, 2003).
These findings in the supra-second duration range can be explained by interval timing theories such as the pacemaker-counter model (Gibbon et al., 1984; Treisman, 1963). These models assume that there is a mental pacemaker/timekeeper that generates pulses when a time judgment needs to be made. The more pulses the pacemaker generates during the to-be-judged duration, the longer the durational perception. The pulses are stored in an accumulator over the course of the to-be-judged duration, and the pulse count is compared to the reference memory when a judgment is made. Faster musical tempo increases the arousal level and causes the pacemaker to increase its rate; thus more pulses are generated and therefore more are accumulated over the same duration as compared with pulses generated at a slower rate, resulting in a longer duration judgment.
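The pacemaker-counter account described above can be illustrated with a minimal sketch. It assumes a deterministic pacemaker and a fixed reference-memory rate; the function names and the numerical values (a 10 Hz baseline rate, a 5% speed-up under arousal) are hypothetical and chosen only to show the direction of the effect, not taken from the cited models.

```python
def accumulated_pulses(duration_s, pacemaker_rate_hz):
    """Pulses stored in the accumulator over the to-be-judged duration."""
    return duration_s * pacemaker_rate_hz


def judged_duration(duration_s, pacemaker_rate_hz, reference_rate_hz):
    """Translate the pulse count into seconds using the reference-memory rate."""
    return accumulated_pulses(duration_s, pacemaker_rate_hz) / reference_rate_hz


# Hypothetical values: same 15 s interval, pacemaker sped up by arousal.
baseline = judged_duration(15.0, pacemaker_rate_hz=10.0, reference_rate_hz=10.0)
aroused = judged_duration(15.0, pacemaker_rate_hz=10.5, reference_rate_hz=10.0)
print(baseline, aroused)
```

Under these assumptions the same physical interval yields a larger pulse count, and hence a longer judged duration, whenever the pacemaker runs faster than the rate stored in reference memory.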
It is important to note that the “durations” to which we are referring are not single events (i.e. individual notes), but consist of multiple events that often occur at regular intervals (i.e. beats, measures). The metrical beat in music is an endogenous phenomenon, whereby regular or quasi-regular events give rise to a felt sense of pulse or “tactus” (London, 2012), as well as a sense that these pulses are organized in regularly recurring cycles or measures (Large et al., 2015; Large & Palmer, 2002; Phillips-Silver et al., 2011; Tierney & Kraus, 2015). The distinction between regular event structures with a felt sense of pulse versus irregular event structures with no felt sense of pulse is important, since different neural substrates are involved in the auditory processing of each (Teki et al., 2011). Beat perception, which most strongly occurs in the temporal range around 120 BPM (Fraisse, 1984; Moelants, 2002, 2003; van Noorden & Moelants, 1999), activates the striato-thalamo-cortical network, and thus may involve sensory-based, automatic processing that is beyond cognitive control (Grahn, 2012; Karmarkar & Buonomano, 2007). By contrast, durations of several seconds or longer activate the inferior olive and the cerebellum, and involve cognitive resources such as attention and working memory, the so-called
In the same way that duration perception is subject to distortions caused by musical tempo (i.e. duration-lengthening effect), the perception of musical tempo is itself subject to distortion. In a study on auditory and visual cues to musical tempo (London et al., 2016), one of the tasks was to rate the musical tempo of R’n’B songs. The original BPM rates of the songs were 105, 115, and 130 BPM, and in addition the BPM rate of each song was also “tempo-shifted” (+5% and -5% BPM) without changing pitch or timbre. Whereas participants were able to correctly differentiate between the different conditions (i.e. the original BPM rate vs. tempo-shifted versions of each song), the tempo ratings of the tempo-shifted songs did not match the actual BPM rates. Accelerated versions of the songs (+5% BPM) were overrated and decelerated versions (-5% BPM) underrated. The authors named this the tempo anchoring effect (TAE), because the perceived tempo of each song seemed to be “anchored” around the original BPM rate. Thus, the TAE describes a distortion of tempo. Melodic features in music seem to be particularly important for the TAE, since the effect seems to be present in different music genres (R’n’B, Disco) but not in purely percussive rhythmic patterns (London et al., 2019a). In another study investigating the TAE (London et al., 2019b), it was hypothesized that finger tapping to the beat of the music would reduce or even eliminate the TAE. The rationale for this assumption was that synchronous movement to music enhances rhythm perception and improves pulse finding, synchronization abilities, and the perception of rhythmic perturbations (Manning & Schutz, 2013; Su & Pöppel, 2012). However, using the same R’n’B songs as in the previous study, no differences in tempo ratings were found between listening-only and SMS conditions. The typical rating pattern of the TAE was present in both conditions, suggesting a disjunction between SMS and tempo ratings. 
This indicates that the cause of the TAE might take place in high-level rather than low-level encoding, as synchronized tapping to the beat of the music did not influence tempo ratings. The authors argued that the TAE might be an auditory example of
Research at the intersection of these temporal percepts (i.e. tempo and duration perception) is still sparse in the context of music listening, particularly regarding different effects and interactions. More fine-grained investigations are needed in order to understand these processes in more detail. For example, the minimum tempo difference (i.e. threshold) for the duration-lengthening effect of faster tempi is not known (Hammerschmidt & Wöllner, 2020; Droit-Volet et al., 2013). Thus, one of the aims of this study was to investigate the minimum tempo difference by presenting participants with musical stimuli at different tempi (105, 115, 125 BPM) and more fine-grained tempo differences with the tempo-shifted versions of the same musical stimuli (-5, ±0, +5 BPM), yielding a 5 BPM increment across all stimulus categories. The second aim was to assess the influence of the different duration measures on the duration-lengthening effect by having participants both estimate and reproduce the duration of the same musical stimuli. These two duration measures were chosen since they involve and emphasize different cognitive processes and they differ in terms of response accuracy and variability (Mioni, 2018). For example, duration reproduction might depend more on memory of musical tempo compared with duration estimation. The third aim was to gain further insights into the relationship between duration and tempo perception by letting participants also rate the tempi of the musical stimuli. Relating these different measures could potentially inform about the cognitive mechanism causing the TAE (high-level vs. low-level tempo encoding). 
If the results for the duration measures yielded a similar pattern as the TAE, this would indicate that the TAE, if present in both beat-based and interval-based judgments and tasks, is caused by a low-level mis-encoding of temporal information, rather than an interaction between high-level representations of the music versus low-level beat rate detectors, and thus would be evidence against the perceptual sharpening explanation of the TAE.
Regarding duration, we hypothesized (a) that faster tempi would lead to longer duration estimations and reproductions (duration-lengthening effect) and (b) that duration reproductions would be more accurate than duration estimations, since they do not involve a translation of experienced duration into clock units and depend more on the memorized tempo of the musical stimuli. Regarding tempo perception, we hypothesized (c) that the accelerated versions of the songs (+5 BPM) would be overrated and decelerated versions (-5 BPM) underrated (i.e. replicating the TAE).
Method
Participants
A total of 32 participants (mean age: 23.41,
Design and Procedure
The stimuli were six Disco songs, two at each of three original BPM rates (105, 115, 125 BPM). Each stimulus was presented at its original tempo (±0 BPM) and in two tempo-shifted versions (i.e. -5, +5 BPM), yielding a 3 (original BPM rate) × 3 (tempo shift) repeated-measures factorial design. The experiment was divided into three blocks, each corresponding to a different task: duration reproduction, duration estimation, or tempo rating.
After participants provided informed consent, baseline spontaneous motor tempo (SMT) measurements were obtained by having them tap with their index finger for 30 seconds at their most comfortable speed (i.e. SMT) on a MIDI touch pad (BopPad, Keith McMillen Instruments), with taps being recorded in Live 9 (Ableton). Next, participants entered basic demographic information into OpenSesame 3.2.6 (Mathôt et al., 2012), which was used for experimental protocol, determination of task order, randomization of stimuli, and response collection. Participants listened to the original songs and rated their familiarity with the songs before being introduced to the experimental tasks. They were then fully informed about the presence of original vs. tempo-shifted versions of the songs, and were able to listen repeatedly to tempo-shifted example songs that were not included in the subsequent experiment. Example songs were given for the low and high end of the original BPM rates.
The actual experiment started with one of three blocks, counterbalanced over participants. In Block A (duration reproduction), participants had to reproduce the durations of the stimuli immediately after presentation by pressing the space bar on the computer keyboard to start the reproduction and pressing the same key a second time to stop it. Before the block, participants practiced the task with example songs (duration range: 10–30 s) and received immediate feedback on their duration reproductions in comparison with the actual example duration. In Block B (duration estimation), participants estimated the duration of the stimuli by entering clock units (seconds + milliseconds) into the experiment computer. Again, participants were able to practice the task with example songs not included in the actual experiment and got immediate feedback on their estimation accuracy in comparison with the actual duration (duration range: 10–30 s). In Block C (tempo ratings), participants’ task was to rate the tempo of each stimulus on a 7-point Likert-type scale (1 = slow, 7 = fast), which was the same task and followed the same training procedure as in London et al. (2019a). Before participants rated the tempo of the stimuli, they rated the tempi of a standard rock drum pattern (Figure 1) presented at the same BPM rates as in the actual experiment (100–130 BPM, 5 BPM increments). The purpose of this task was (a) to familiarize participants with the rating scale, and (b) to evaluate participants’ ability to rate the tempo of simple rhythmic stimuli without tempo-shifting in relation to the BPM rate (London et al., 2019a).
Each block consisted of the same stimuli in different quasi-randomized orders. To account for possible order effects, individual randomization was constrained such that stimuli based on the same song would not be presented consecutively. Between a response of a participant and the presentation of the next stimulus, a 4-second delay was implemented. After performing all three blocks, participants performed the SMT finger-tapping task again, in order to assess potential differences in SMT before and after the experiment.
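The constrained randomization described above can be sketched as a simple rejection-sampling shuffle. This is only an illustration under stated assumptions: the study's randomization was run in OpenSesame, whose actual procedure is not described here, and the function name `constrained_order`, the placeholder song labels, and the retry limit are hypothetical.

```python
import random


def constrained_order(stimuli, rng, max_tries=10_000):
    """Reshuffle (song, shift) pairs until no two neighbours share a song."""
    order = list(stimuli)
    for _ in range(max_tries):
        rng.shuffle(order)
        # Accept the order only if every adjacent pair comes from different songs.
        if all(a[0] != b[0] for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no valid order found within max_tries")


songs = [f"song{i}" for i in range(1, 7)]  # placeholder labels for the six songs
shifts = (-5, 0, 5)                        # tempo shift in BPM
stimuli = [(song, shift) for song in songs for shift in shifts]

order = constrained_order(stimuli, random.Random(1))
print(order[:4])
```

Rejection sampling keeps every valid order equally likely, which is convenient for a stimulus set this small; a constructive algorithm would only be needed for much larger sets.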

Figure 1. Notation of the drum pattern. The lowest notes represent the kick drum, the middle notes the snare drum, and the highest notes the hi-hat.
Stimuli
Stimuli consisted of six Disco songs taken from the compilation “The Disco Box” (Inglot, 1999) and were the same songs as in experiment 2a in London et al. (2019a). These excerpts were chosen based on their original BPM rates and acoustic features for tempo cues (Table 1). The acoustic features included as criteria for the song selection were sub-band spectral flux (100–200 Hz), which has been shown to be specifically related to rhythmic features (Burger et al., 2013), and notes per second (event density), which affect tempo perception (Drake et al., 1999). The assessment was identical to London et al. (2019a) using the MIR toolbox for Matlab (Lartillot & Toiviainen, 2007), which aimed at the highest possible similarity of these features. The original BPM rates as reported in London et al. (2019a) were once more checked and confirmed by the authors of this study. Songs’ original BPM rates were at or near 105, 115, and 125 BPM and therefore in the preferred tempo range for most listeners (Drake & Botte, 1993; Moelants, 2002).
Table 1. Information on stimulus material and its characteristics.
Note: Reported values of musical characteristics differ from London et al. (2019a) because calculations are based on different parts of the songs (introduction vs. verse/chorus).
For this study, we used longer excerpts of the instrumental introduction (no vocals) of each song compared with London et al. (2019a), so that each stimulus was exactly eight bars long, reaching stimulus durations of up to 19.20 seconds. Since the song “Stayin’ Alive” by the Bee Gees does not have eight bars of introduction without lyrics, this song was replaced by Change’s “Change of Heart” from the same compilation as the other songs. The selection criteria for the replacement were the same as for the other songs (original BPM rate, spectral flux, event density).
The songs were first tempo-shifted to match exactly BPM rates of either 105, 115, or 125 BPM, and further manipulated by tempo-shifting each song precisely 5 BPM in both directions, yielding BPM overlaps at 110 and 120 BPM between tempo-shifted stimuli (Table 2). Manipulations of BPM rates were done via the “Warp” function in Live 9 (Ableton), resulting in 18 stimuli. Since the number of bars was fixed, the BPM rates determined the duration of each stimulus (Table 2). Participants were relatively unfamiliar with the six songs (
Table 2. Stimuli durations according to factorial levels.
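Because the number of bars was fixed, stimulus durations follow directly from the BPM rates. The sketch below computes them for each combination of original BPM rate and tempo shift. It assumes four beats per bar (4/4 meter), which is not stated explicitly in the text but is consistent with the reported maximum duration of 19.20 seconds at 100 BPM; the names used are illustrative.

```python
BARS = 8
BEATS_PER_BAR = 4  # assumption: 4/4 meter, consistent with 19.20 s at 100 BPM


def stimulus_duration(bpm):
    """Duration in seconds of an eight-bar excerpt at the given BPM rate."""
    return BARS * BEATS_PER_BAR * 60.0 / bpm


# One duration per design cell: (original BPM rate, tempo shift in BPM).
durations = {
    (original, shift): round(stimulus_duration(original + shift), 2)
    for original in (105, 115, 125)
    for shift in (-5, 0, 5)
}

for cell, seconds in durations.items():
    print(cell, seconds)
```

Under this assumption the durations span roughly 14.8 to 19.2 seconds, and the overlapping cells (e.g. 105 + 5 and 115 - 5 BPM) share the same duration.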
Data Analysis
Duration reproductions from Block A were converted into duration reproduction ratios by dividing the subjective durations (duration reproduction) by the objective durations (stimuli durations). This normalization was done to account for the differences in absolute durations of the stimuli. Therefore, a value of 1 represents a perfect duration reproduction, values below 1 indicate underestimations and values above 1 indicate overestimations of the objective durations. Before entering the duration reproduction ratios into a mixed-model ANOVA with factors original BPM rate and tempo shift as repeated measures and musicianship as a between-group factor, responses from stimuli with the same original BPM rate and tempo shift were averaged.
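The normalization and averaging step described above can be sketched as follows. The response values are hypothetical and serve only to illustrate the arithmetic; the function names are likewise illustrative.

```python
from statistics import mean


def duration_ratio(subjective_s, objective_s):
    """Ratio of reproduced (or estimated) duration to actual stimulus duration."""
    return subjective_s / objective_s


def cell_mean_ratio(trials):
    """Average the ratios of the stimuli that share one design cell.

    trials: iterable of (subjective_s, objective_s) pairs.
    """
    return mean(duration_ratio(s, o) for s, o in trials)


# Hypothetical responses to the two songs at 100 BPM (19.2 s each).
example_cell = [(18.0, 19.2), (20.4, 19.2)]
ratios = [duration_ratio(s, o) for s, o in example_cell]
print(ratios, cell_mean_ratio(example_cell))
```

Here the first response is an underestimation (ratio below 1) and the second an overestimation (ratio above 1), so the cell mean lands at about 1.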
Duration estimations from Block B were normalized and analyzed in the same way as duration reproductions in Block A. In order to assess the accuracy of both duration reproductions and estimations (i.e. Blocks A and B), one sample
In order to investigate the relationship between the different measures, repeated-measures correlations (Bakdash & Marusich, 2017) between tempo ratings and duration estimation ratios, tempo ratings and duration reproduction ratios, as well as between the two duration measures were performed, including the factors original BPM rate and tempo shift. Furthermore, a paired-samples
Results
The results are presented in the following order: first, results of the factors original BPM rate, tempo shift, and musicianship on duration measures (reproduction and estimation) are reported, followed by the effects of the same factors on tempo ratings for the drum pattern and the Disco songs. Results of the correlation analyses and paired-samples
Duration Measures
Duration Reproduction
Results of the ANOVA on duration reproduction ratios showed a main effect for original BPM rate,
Duration Estimation
The ANOVA on duration estimation ratios yielded no effect for the original BPM rate,

Duration measures. Mean duration reproduction ratios (a) and mean estimation ratios (b) for original BPM rates with no tempo shift (green bars) and tempo shift with -5 BPM (blue bars) and +5 BPM (yellow bars). Error bars indicate 95% confidence intervals.
Tempo ratings
Drum Pattern Stimuli
The results for the ANOVA on tempo ratings for the different tempi of the drum pattern indicate that participants’ tempo ratings corresponded to the different BPM rates,
Disco Stimuli
The ANOVA run on tempo ratings of the Disco stimuli yielded a main effect for factor original BPM rate,

Tempo ratings. Mean tempo ratings for the drum pattern at different BPM rates (a) and mean tempo ratings (b) for the original BPM rates with no tempo shift (green bars) and tempo shift with -5 BPM (blue bars) and +5 BPM (yellow bars). Error bars indicate 95% confidence intervals.
Relationship Between Measures
The average ratios of the two duration measures (estimation and reproduction) differed from each other,
The average ITI of participants’ SMT did not change during the experiment,

Repeated-measures correlations. Correlation between duration estimation and reproduction ratios (a), duration estimation ratios and tempo ratings (b), and duration reproduction ratios and tempo ratings (c). Each dot indicates the individual response by each participant in a particular trial, and each line is a linear fit for all responses from each participant.
Discussion
This study aimed at investigating different duration measures (estimation vs. reproduction) and their relation to perceived tempo when listening to Disco songs at different BPM rates, and to tempo-manipulated versions of the same songs. Faster tempo (125 vs. 105 BPM) led to longer duration reproductions, and duration reproductions were generally more accurate in relation to clock time than duration estimations. Small tempo changes produced by the tempo-shift manipulation (±5 BPM) showed no effect on either duration reproduction or estimation. Therefore, results of this study suggest a higher sensitivity of duration reproductions for the duration-lengthening effect than of duration estimations. However, in the tempo-rating task, the small tempo-shift manipulations did have an effect, such that accelerated versions of the songs (+5 BPM) were overrated and decelerated versions (-5 BPM) underrated, replicating the TAE. As no such response pattern was found in the duration measures, this supports the notion that the TAE may be a form of
Based on pacemaker-counter models (Gibbon et al., 1984; Treisman, 1963), the first hypothesis stated that faster tempi lead to longer duration estimations and reproductions as faster tempi generate more pulses of the pacemaker. The results partly confirm the hypothesis, as the factor of original BPM rate influenced duration reproductions but not duration estimations. Results showed longer duration reproductions for 125 BPM compared with 105 BPM, but no differences were found in comparison with 115 BPM. As the factor of tempo shift did not affect either duration estimation or reproduction, the results of this study suggest a minimum difference of 20 BPM for duration reproductions in the preferred tempo range in order for this effect to occur (Fraisse, 1984; Moelants, 2002; van Noorden & Moelants, 1999).
The second hypothesis stated that duration reproductions are more accurate than duration estimations, as duration estimations involve a translation and comparison of experienced duration with a reference memory of clock units whereas duration reproductions do not involve such a translation, and therefore a more direct comparison is made (Block et al., 1998; Mioni, 2018; Zakay, 1990). The results of this study support this hypothesis, as reproductions were generally more accurate and closer to the actual stimuli durations than estimations. Furthermore, no correlation was found between duration estimation and duration reproduction, supporting previous research suggesting that these measures may be independent from each other and involve different cognitive processes (Zakay, 1990). A borderline-significant difference was found between musicians and nonmusicians: results suggest that musicians reproduced the stimuli durations slightly more accurately than nonmusicians. No such evidence for a difference was found in duration estimations, which is in line with a previous study in which no difference between musicians and nonmusicians was found for this task (Hammerschmidt & Wöllner, 2020). The difference in duration reproduction might be explained by enhanced memory for musical structure and strong encoding of musical features such as tempo, in turn resulting in a transfer benefit helping musicians to reproduce the duration more accurately compared with nonmusicians. This may be due to the musicians’ better ability and greater practice with
The third hypothesis stated that participants would overrate the accelerated versions of the stimuli (+5 BPM) and underrate decelerated versions (-5 BPM), and this did occur, replicating the TAE (London et al., 2016, 2019a, 2019b). Likewise, participants accurately judged the tempi of the drum pattern across all BPM rates, showing that they were able to correctly map the Likert-type rating scale onto the different BPM rates used in the experiment.
In combination, the durational estimation and reproduction results alongside the replication of the TAE shed light on the source of the TAE. Neither results for duration reproduction nor duration estimation were affected by the ±5 BPM tempo shifts. To be sure, these tempo differences might have been too small to cause an effect on duration perception, thus a source for the TAE based on a low-level encoding mechanism cannot entirely be ruled out as an explanation. Nonetheless, the accurate reproduction of original stimulus song durations, in comparison with the inaccurate reporting of stimulus beat rates (i.e. over- and underestimations in tempo judgments), supports the notion that the TAE may be a form of
This study used excerpts from actual Disco music in the preferred tempo range to investigate effects on the intersection of tempo and duration perception. Results of this study raise the question of the role of tempo differences and tempo ranges on the duration-lengthening effect of faster tempo. In other words, the proposed minimum tempo difference threshold for the duration-lengthening effect of 20–30 BPM should be further investigated by using different duration measures (e.g. estimation, reproduction, bisection) and smaller differences in BPM rate, for instance starting from differences of 10 BPM. The duration-lengthening effect should be investigated for other tempo ranges compared with the preferred tempo range used in this study, using both slower (e.g. 60–100 BPM) and faster (e.g. 130–180 BPM) tempi. It is not known if the proposed minimum tempo difference of 20 BPM for the duration-lengthening effect holds for other tempo ranges. Similarly, it should be investigated if the TAE is present in music in tempi outside the preferred tempo range of 100–130 BPM. Furthermore, different stimulus durations should be systematically investigated when listening to music, as the proposed threshold of 20–30 BPM might also vary according to the duration of the music (Buhusi & Meck, 2005). In order to assess the role of different memory processes (working memory vs. long-term memory), future studies could also increase the time period between the presentation of musical stimuli and the reproduction task.
To conclude, this study investigated the relationship between tempo and duration perception in music by presenting participants with Disco songs at different original BPM rates, and digital manipulations of these BPM rates. The tasks were to reproduce and estimate the stimuli durations, and to rate the tempo of each stimulus. The main findings of this study suggest a threshold for a duration-lengthening effect of tempo in the range of 20–30 BPM in duration reproductions, while different BPM rates did not influence duration estimations. Comparing the duration measures, reproductions were generally more accurate than estimations, and musicians tended to be better at reproducing durations than nonmusicians. The TAE was replicated in a tempo-rating task, but was not observed in the duration estimation or reproduction tasks. This points to
Footnotes
Action Editor
Tecumseh Fitch, Universität Wien, Department of Cognitive Biology.
Peer Review
Molly Henry, Max Planck Institute for Empirical Aesthetics. One anonymous reviewer.
Author Contribution
DH, CW, and JL researched literature and conceived the study. DH, CW, and JL were involved in study design. DH and CW gained ethical approval. DH and BB worked on stimuli selection criteria. DH, CW, and BB performed data analyses. DH wrote the first draft of the manuscript. All authors reviewed and edited the manuscript and approved the final version of the manuscript.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This research was supported by a Consolidator Grant from the European Research Council to the second author. The research is part of the project: “Slow motion: Transformations of musical time in perception and performance” (SloMo; Grant No. 725319).
