Abstract
This study investigates two relationships in polyphonic popular music: the relationship between the urge to move to music and perceived catchiness, and the relationship between the perception of a combination of musical patterns and its individual parts. In a listening experiment, 127 participants rated drum beats, keys/guitar patterns, and their combinations on urge to move and catchiness. Our analyses showed that urge to move and catchiness are positively correlated, most strongly in the combinations. Our results revealed that the urge to move and perceived catchiness positively influence each other and are interlinked in polyphonic music, while showing that the relationship between a combination and its parts is complex. We were able to predict a combination's effect based on the separate ratings of the keys/guitar and drums only to a limited extent: unless the individual instruments were rated similarly on urge to move and catchiness, the outcome showed no clear pattern in the ratings. Participants also indicated which instrument was more important for their experiences, and generally found the keys/guitar parts more important than the drums for both groove and catchiness. The ratings likewise showed a generally larger influence of keys/guitar. However, the instrument that a participant named as more important did not have a greater influence on that participant's ratings of the combination.
Introduction
Music listening has the capacity to elicit a plethora of experiences in listeners. Examining which musical and personal characteristics are likely to promote specific experiences has been a focus of music psychology, for example, why music makes us dance or why it sticks in our minds. These two experiences are known as groove and catchiness. They are particularly important for popular music, which is often crafted to excel in eliciting them (Bechtold et al., 2023). Seabrook (2015, p. 6) identified a recipe for hits in which they presumably work together: “a rhythmic groove with a melodic hook on top.” Exactly how groove and catchiness interact is still an open question. Examining if, how, and in which cases they interact in polyphonic contexts is the first focus of this study. As a general note, “polyphonic” in this study refers to the presence of more than one instrument and should not be confused with the music theoretical term “polyphonic” in contradistinction to “homophonic” or “monophonic.”
But what are groove and catchiness, and how are they understood in this study? Janata et al. (2012, p. 56) introduced a widely acknowledged definition of groove in music psychology: “a pleasant sense of wanting to move along with the music.” Some qualitative studies have drawn a more nuanced picture (Bechtold et al., 2023; Duman et al., 2024; Hosken, 2020; Pfleiderer, 2010; Stupacher et al., 2023), while a psychological model of groove (Senn et al., 2023a, 2023b) puts the urge to move in the center of a complex web of interacting cognitive processes. As a synthesis of these definitions, the idea of the pleasurable urge to move to music (PLUMM) has emerged as a central aspect of groove (Matthews et al., 2023) that is not necessarily constitutive of the groove experience as a whole (Duman et al., 2024). In this study, we focus on the urge to move aspect of groove. Researchers have identified several musical, listener-related, and situational factors that influence the experience of groove, PLUMM, or the urge to move. For musical factors, studies on groove have largely focused on rhythm or drums (Eaves et al., 2019; Etani et al., 2018; Frühauf et al., 2013; Kawase & Eguchi, 2010; Matthews et al., 2022; Senn et al., 2018; Seeberg et al., 2024; Spiech et al., 2022; Witek et al., 2014). For listener-related factors, studies have found that familiarity, musical taste, and dance preferences are important (Cameron et al., 2022; Janata et al., 2012; Lustig & Tan, 2019; Rose et al., 2022; Senn et al., 2021, 2023a, 2023b), while results regarding musical training or expertise are less conclusive (Frühauf et al., 2013; Matthews et al., 2019, 2020, 2022; Nijhuis et al., 2022; Senn et al., 2018, 2021; Spiech et al., 2022). Overviews can be found in Etani et al. (2024) and Câmara & Danielsen (2018).
Catchiness has been hard to define (Van Balen, 2016) and is often equated with a piece of music's inherent memorability (Russell, 1987) or its resistance to being forgotten (Grevler, 2019), or is broadly operationalized as recognizability (Burgoyne et al., 2013; Korsmit et al., 2017). One study suggested that recognizability is just one aspect of musical catchiness and introduced the concept of “perceived catchiness,” which is a perceived musical quality that “depends on the listener's perception and experience of music, in which memorization and positive affect are central, and engagement, immediacy and clarity are other aspects” (Bechtold et al., 2023, p. 353). Catchiness is relatively understudied. Research on catchiness has focused mostly on musical properties, specifically on melodic aspects (Grevler, 2019; Honing, 2010; Hume, 2017; Kronengold, 2005), low-level audio features (Van Balen, 2016), ease of singing along (Korsmit et al., 2017; Pawley & Müllensiefen, 2012), or instrumentation, such as whether there are female vocals (Korsmit et al., 2017). Listener-related factors are rarely considered, but studies have shown that musical sophistication (Kuiper et al., 2021), taste (Bechtold et al., 2023, 2024), familiarity (Bechtold et al., 2024; Russell, 1987), and age (Korsmit et al., 2017) play a role.
The relationship between groove and catchiness has been examined in two previous studies. A qualitative interview study (Bechtold et al., 2023) suggested that they are related and can interact positively. A quantitative follow-up study (Bechtold et al., 2024) found a moderately strong positive correlation between the pleasurable urge to move to music and perceived catchiness and suggested a causal relationship between them: pleasure mediates the positive effect that perceived catchiness has on the urge to move. However, that study only investigated a limited context for this relationship, namely individual popular music patterns (drum beats, bass lines, keyboard or guitar accompaniment), while the music creators in Bechtold et al.'s (2023) interviews also mentioned strategies to foster groove and catchiness by combining musical parts with specific characteristics. The interview study suggested that there are three different ways in which groove and catchiness can interact in a multi-instrumental context:
- Mutual support, meaning combining catchy with groovy parts leads to music that is experienced as even catchier and groovier but in which the groovy and catchy parts are still clearly identifiable as sources for the different experiences.
- Fusion, meaning combining parts that are groovy and catchy at the same time blends the experiences of catchiness and groove into a single indivisible experience.
- Deliberate independence, meaning there is music that is catchy but not groovy, or vice versa, such as a rubato blues guitar intro or a stripped-back techno track.
Investigating these different forms of interaction requires the study of what happens when patterns are combined—that is, the relationship between groove and catchiness needs to be studied in polyphonic music and not just in the context of individual parts, thereby bringing the experimental stimuli closer to the music we usually listen to.
Research that examines empirically how an effect of an isolated voice or instrument changes when others are added is relatively rare. Studies have shown that music-evoked emotions can change and become more positive with added voices (e.g., Broze et al., 2014; Rasch, 1981). There is no research in that regard on catchiness, hooks, or memorability. For groove, there are two studies. Seeberg et al. (2024) found that full drum set patterns with snare drum, hi-hat, and bass drum elicit higher PLUMM compared to versions with just one or two of these instruments. Düvel et al. (2022) conducted a study in which participants heard short full-band popular music excerpts in three versions (original, without drums, drums only), and rated each of them on PLUMM. The original versions elicited the highest urge to move, although the drums-only versions were rated only nominally lower. The individual song played a significant role, as for some, the version without drums elicited as much urge to move as the original. Their study showed that combining and isolating parts is not a zero-sum game when it comes to the experienced urge to move. However, it was limited by the source separation of the original tracks, which left audible artifacts. Additionally, the separation into drums in one condition and the rest of the band in the other resulted in unbalanced conditions: in fact, 50% of their participants stated that they did not even notice that drums were missing in the version without drums. Consequently, it is unclear whether a more controlled and balanced separation of parts without artifacts would lead to similar results. The relationship between isolated musical patterns and their combination forms the second focus of this study.
As we are examining listeners’ urge to move in conjunction with perceived catchiness in this study, we require musical stimuli that likely promote these to varying degrees. Based on previous research emphases on drums for urge to move and melodic aspects for catchiness, we combine drums with keyboard or guitar patterns in this study. Similar combinations were mentioned in Bechtold et al. (2023), e.g., the guitar riff in Deep Purple's “Smoke on the Water” makes the song catchy, while the pumping rhythm of drums and bass makes it groovy. Yet, as there is no study for combined patterns and catchiness, and the predominance of drums for the urge to move has been found to be volatile (Düvel et al., 2022), it is still unclear which instrument contributes more to each respective experience. Additionally, it is unclear whether listeners are conscious of an instrument's contribution. It is also unknown whether listeners perceive a groovy drum beat with a catchy pattern as a complementary combination of two parts or judge it more holistically.
In this study, we focus on how music-induced urge to move and catchiness change when two parts are combined. We combine short popular music drum and keys/guitar patterns, and compare how they were experienced in isolation to how they were experienced in combination. We focus on two relationships: one between the urge to move and catchiness, and one between the experience of the combination and the experience of its parts. For the relationship between groove and catchiness, we first investigate a general interaction, but also look at parts with similar or divergent effects specifically. For example, we look at what happens when a groovy drum beat (high urge to move + low catchiness: HMLC) and catchy keys/guitar riff (low urge to move + high catchiness: LMHC) are combined or what happens when combining a high urge to move + high catchiness (HMHC) drumbeat with a low urge to move + low catchiness keys/guitar pattern (LMLC). We examine the following five hypotheses:
H1: The urge to move and perceived catchiness are positively correlated in all three musical conditions: drums, keys/guitar, and in combination.
H2: The experienced urge to move and catchiness of the combination is related to the experienced urge to move and catchiness of the individual parts it consists of.
H3: The urge to move and catchiness of the parts interact in the experience of the combination.
H4: If H2 and H3 are true, a follow-up analysis examines specific rating conditions and their likely outcomes. We hypothesize that when a rating is high for an individual part, it will also be high in the combination, e.g., a catchy keys/guitar pattern produces a catchy combination, regardless of the drums.
H5: The drums and keys/guitar parts contribute differently to urge to move and catchiness of the combination. We assume that listeners are able to identify which instrument contributes more to their experience and hypothesize that drums are more important for the urge to move, while keys/guitar are more important for catchiness.
Materials and Methods
Participants
We recruited 127 participants on the Prolific platform. They were roughly gender-balanced (62 females, 65 males) and had a mean age of 30.142 (SD = 8.309). They reported living in Europe (85), South Africa (35), and other developed countries (7). The majority (73) played no musical instrument, and accordingly, the self-assessed musical expertise was low (on a scale of 1–101: mean = 20.732, SD = 26.973). They showed some proneness to dance (on a scale of 0–6: mean = 4.157, SD = 1.427), and affinity for popular music (on a scale of 0–6: mean = 3.842, SD = 0.748). Participants demonstrated sufficient English proficiency for the experiment through a short writing task.
Stimuli
We selected our stimuli from the corpus in Bechtold et al. (2024), which consists of 80 self-composed 8-bar tracks of different popular music styles, each featuring a drumbeat, a bass line, and either a keyboard or guitar pattern. The respective parts vary in many ways, including structural elements (rhythm, tempo, pitches, key, mode), instrument timbres (i.e., sound), general gestures (e.g., slow-moving, stagnant, driving, percussive), and styles. As detailed above, we focus on the drum beats, the keys/guitar patterns, and their combination, while dropping the bass lines, and we operate only with a subset of the corpus.
The corpus includes behavioral urge to move and perceived catchiness ratings for the drums and keys/guitar parts. However, as individual differences were an important factor in the respective study, we decided to have each drum and keys/guitar part rated again by the same participants that rate the combinations. Nonetheless, the previous ratings are a guideline to choose a set of 10 combinations (and the respective drum and keys/guitar patterns) that are suitable for the present study. We aimed for patterns that fulfilled specific rating conditions which allow for the examination of our hypotheses, and are expected to provide a variety of ratings. By rating condition, we mean a specific profile of urge to move and catchiness ratings, e.g., high urge to move + low catchiness (HMLC) or low urge to move + high catchiness (LMHC). Unfortunately, the correlation between urge to move and catchiness in the corpus meant that parts with very different ratings were rare, which limited the number of potential stimuli. We selected 10 combinations for the present study (Table 1).
Table 1. The 10 selected stimuli with their respective identifier number in the corpus and the rating condition based on Bechtold et al. (2024). These rating conditions were used only for stimulus selection, and all stimuli were rated again in this study.
When it comes to combining parts, we define four different types of combined rating conditions: congruent (HMHC + HMHC or LMLC + LMLC), varied similarly (HMLC + HMLC or LMHC + LMHC), varied contrastingly (HMHC + LMLC), and varied complementarily (HMLC + LMHC). The audio stimuli, transcriptions, experimental data, and analysis script are available at https://osf.io/ez7g5.
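The four combination types can be written down as a small lookup. The following Python sketch is only an illustration and is not part of the published analysis script; the function name `combination_type` and the label "unclassified" for pairings that the text does not name are assumptions, as is treating the order of drums and keys/guitar as irrelevant for the type.

```python
VALID_PROFILES = {"HMHC", "HMLC", "LMHC", "LMLC"}


def combination_type(drums: str, keys_guitar: str) -> str:
    """Return the combined rating condition type for two part profiles.

    Profiles use the paper's abbreviations, e.g. "HMLC" means
    high urge to move + low catchiness.
    """
    if drums not in VALID_PROFILES or keys_guitar not in VALID_PROFILES:
        raise ValueError("profiles must be one of HMHC, HMLC, LMHC, LMLC")

    if drums == keys_guitar:
        # Identical profiles: congruent if both dimensions agree
        # (HMHC or LMLC), otherwise "varied similarly".
        return "congruent" if drums in {"HMHC", "LMLC"} else "varied similarly"

    pair = {drums, keys_guitar}
    if pair == {"HMHC", "LMLC"}:
        return "varied contrastingly"
    if pair == {"HMLC", "LMHC"}:
        return "varied complementarily"

    # Assumption: pairings such as HMHC + HMLC are not named in the text.
    return "unclassified"


if __name__ == "__main__":
    print(combination_type("HMLC", "LMHC"))
```

Keeping the unnamed pairings in a separate category, rather than forcing them into one of the four types, mirrors the fact that the text defines only these four.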
Measures
We measured PLUMM with the experience of groove questionnaire (Senn et al., 2020), which assesses a participant's experienced urge to move and pleasure with three items each on 7-point Likert scales. We measured perceived catchiness with the questionnaire used in Bechtold et al. (2024). This questionnaire contains four items that each target a different dimension of perceived catchiness. As the recognition task in Bechtold et al. (2024) was only weakly related to perceived catchiness while prolonging and complicating the experiment, we decided to drop it and rely only on the self-report questionnaire. Subsequently, we averaged the respective items to create the three scales urge to move, pleasure, and perceived catchiness. We did not center the scales, as we wanted to preserve the different means for drums, keys/guitar, and combinations. These measures are used for the analysis of all hypotheses.
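As an illustration of the scale construction, the following Python sketch averages item responses into uncentered scale scores. It is a minimal example under stated assumptions: the item values are invented, and the function name `scale_scores` and the dictionary keys are hypothetical. Only the item counts (three, three, and four) are taken from the text.

```python
from statistics import mean


def scale_scores(item_responses: dict) -> dict:
    """Average the item responses of each scale without centering."""
    return {scale: mean(items) for scale, items in item_responses.items()}


if __name__ == "__main__":
    # Invented responses of one participant to one stimulus (7-point items).
    responses = {
        "urge_to_move": [5, 6, 4],            # three items
        "pleasure": [4, 4, 5],                # three items
        "perceived_catchiness": [3, 5, 4, 4], # four items
    }
    print(scale_scores(responses))
```

Because the scores are not centered, the differing means of drums, keys/guitar, and combinations remain visible in the resulting values.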
We assessed participants’ familiarity with the music with a forced-choice item that we presented only for the combinations. Further listener-related measures were gathered as in Bechtold et al. (2024): one item from each Gold-MSI subscale (Müllensiefen et al., 2014) and a self-assessment on a slider from music listener (1) to professional musician (101) are condensed into one dimension with confirmatory factor analysis to create our expertise measure. The mean of two 7-point Likert items asking whether participants enjoy dancing and how often they want to dance constitutes our dance preference variable. Lastly, participants indicated their preference for 13 popular music styles on 7-point Likert scales. The average of these ratings serves as our measure for popular music affinity.
For H5, we examine whether a participant's conscious assessment of which pattern contributes more to the urge to move or perceived catchiness of a combination is congruent with their rating data. For this purpose, we asked them directly which pattern contributed more to the urge to move (or catchiness, respectively), with possible answers being “Drums”, “Keys/Guitar”, “Both equally (or none at all)”, or “I don’t know”.
As the question on contribution asks directly about catchiness, and the questionnaire includes a straightforward item that includes the word “catchy,” we included an additional check at the beginning of the experiment. We asked participants to provide their understanding of catchiness (being conscious not to mention “groove” in any text beforehand). These data may be used for a qualitative analysis of how participants understand catchiness in the future, but for the present study, we only used them to exclude participants who either could not provide an understanding of catchiness or provided one that clearly did not align with catchiness as understood in this or related work. We did not ask about groove, as the word “groove” does not appear in any of the items or questions.
Procedure
The experiment was set up on the SoSci Survey platform (www.soscisurvey.de). The participants were informed about the study and gave their consent to participation, data use, and publication by button click. At the beginning of the survey, participants answered questions on their background: age, country of residence, gender, musical expertise, and preferences for dancing and popular music styles. Afterwards, we asked them to define catchiness in their own words. In the following three-part listening experiment, each participant rated all stimuli on PLUMM and catchiness. The order of instruments and questionnaires was counter-balanced throughout the experiment, but the combinations always appeared in the last part. In the first part, each participant heard 10 stimuli consisting of either all drums or all keys/guitar. These stimuli were presented on separate pages. Participants answered either the questionnaire on PLUMM or perceived catchiness. Afterwards, the stimuli were repeated, and participants answered the respective other questionnaire. In the second part, participants heard the stimuli with the other instrument (keys/guitar if they heard drums in the first part, and vice versa). They again answered either the PLUMM or catchiness questionnaire, followed by the respective other. In the third part, participants heard the combinations and indicated their familiarity with the music. They answered either the PLUMM or catchiness questionnaire and indicated the contribution of the individual instruments for the related experience. Finally, they completed the respective other questionnaire and contribution question in response to the combinations. The experiment took 27 min on average (SD = 5.895). Participants were remunerated with £6.
Statistical Analyses
All statistical analyses were conducted with R (version 4.4.2) in the RStudio environment (version 2024.09.1). We discuss the exact statistical analyses and packages in relation to the respective hypothesis.
For the Examination of H1
We calculated Pearson correlations for urge to move and catchiness in the three musical conditions (keys/guitar, drums, and combination) with the correlation package (Makowski et al., 2020), and proceeded to compare the strength of correlations using the BFpack package (Mulder et al., 2021), which quantifies evidence for or against specific hypotheses.
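For readers unfamiliar with the statistic, the following Python sketch computes a Pearson correlation from the textbook formula. It is not the authors' code, which relied on the R correlation package, and it does not reproduce the Bayesian comparison of correlation strengths; the function name `pearson_r` and the example ratings are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations and the two deviation norms.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cross / (norm_x * norm_y)


if __name__ == "__main__":
    # Invented urge to move and catchiness ratings for one condition.
    move = [2.0, 3.3, 4.7, 5.0, 6.3]
    catch = [2.5, 3.0, 4.0, 5.5, 5.8]
    print(round(pearson_r(move, catch), 3))
```

In the study, one such coefficient is obtained per musical condition (drums, keys/guitar, combination) before the coefficients are compared.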
For the Examination of H2 and H3
For assessing whether the ratings of the parts and their interaction allow a prediction of the respective combination's rating, we calculated Bayesian regression models with the brms package (Bürkner, 2017). We used Student-t distribution models, which are more robust to heavy tails and potential outliers. As for all models in this study, we set flat uninformative priors for the fixed effects, meaning all coefficient values were considered equally plausible, and hence, the models are data-driven. We assigned weakly informative priors to the intercepts and standard deviations of random intercepts and slopes, as per the default settings of the brms package. The model selection follows the procedure outlined in Bechtold et al. (2024). In the respective models, we predicted either the perceived catchiness or urge to move rating of the combination based on the ratings of the parts and their interactions (i.e., Catch Combi ∼ Move Drums × Move Keys/Guitar × Catch Drums × Catch Keys/Guitar and Move Combi ∼ Move Drums × Move Keys/Guitar × Catch Drums × Catch Keys/Guitar), and relevant listener-related measures. We empirically tested for model structure, i.e., whether the model requires by-participant and by-stimulus random intercepts or slopes, by comparing ELPD (theoretical expected log pointwise predictive density) differences obtained through leave-one-out cross-validation with the loo package (Vehtari et al., 2023). We followed their suggestion and view ELPD differences greater than 4 as significant. Once the most efficient model structure was established, we eliminated unnecessary variables stepwise until we reached our final model. We calculated the models’ loo-adjusted R2 with the performance package (Lüdecke et al., 2021).
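The structure selection can be pictured as a simple decision rule on ELPD differences. The Python sketch below is an assumption about how such a rule might look, not the procedure from Bechtold et al. (2024): the function name `select_structure`, the candidate names, and all ELPD values are invented, and the actual analysis obtained its ELPD estimates with the loo package in R. The only element taken from the text is the threshold of 4.

```python
def select_structure(candidates, threshold=4.0):
    """Pick a model structure from candidates ordered simple to complex.

    candidates: list of (name, elpd) tuples; a higher ELPD means better
    expected out-of-sample prediction. A more complex structure is
    adopted only if it improves the ELPD of the currently selected
    structure by more than the threshold.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    best_name, best_elpd = candidates[0]
    for name, elpd in candidates[1:]:
        if elpd - best_elpd > threshold:
            best_name, best_elpd = name, elpd
    return best_name


if __name__ == "__main__":
    # Invented ELPD values, for illustration only.
    models = [
        ("random intercepts only", -1500.0),
        ("by-participant slope + by-stimulus intercept", -1480.0),
        ("by-participant and by-stimulus slopes", -1478.5),
    ]
    print(select_structure(models))
```

A rule of this kind favors the simpler structure whenever the added complexity does not pay off in predictive terms, which is the sense in which the text calls a model "efficient".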
For the Examination of H4
We calculated a categorical Bayesian regression model with the brms package (Bürkner, 2017). Such a model allows calculation of the likelihood of each rating condition (e.g., HMHC Drums + HMHC Keys/Guitar) to lead to a target category (e.g., HMHC combination). This analysis allows us to test the validity of H4, i.e., whether high individual pattern ratings being retained in combinations works as a simple rule of thumb.
Aside from data that fulfill the specified rating conditions, such a model requires a contrast category. As our conditions are defined by high and low ratings, we define neutral ratings around the mean as a contrast category. To obtain the categories from our continuous data, we first split the rating data at the respective mean. As we did not center our scales, the split point for Catchiness Drums (3.384) differs from the split point for Catchiness Keys/Guitar (3.217) and for Catchiness Combi (3.672). For the contrast, we defined all ratings within a margin of 1 above and below the mean as neutral (see Figure 1 for a visualization).

Figure 1. Visualization of an example illustrating the categorization of the ratings for the individual parts (and the resulting rating condition) and the combination (and the resulting target category). Pink horizontal lines indicate the mean of the respective rating, which serves as the basis for the high, low, and neutral categorizations.
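The categorization described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the authors' R code: the function names `level` and `profile` are hypothetical, ratings exactly on the margin are counted as neutral, a profile counts as "neutral" only when both dimensions are neutral, and the urge to move mean of 3.5 in the example is invented. The catchiness mean of 3.384 for drums and the margin of 1 come from the text.

```python
def level(rating, mean, margin=1.0):
    """Categorize one rating as 'high', 'low', or 'neutral'."""
    if abs(rating - mean) <= margin:
        # Assumption: the boundary itself belongs to the neutral band.
        return "neutral"
    return "high" if rating > mean else "low"


def profile(move, catch, move_mean, catch_mean, margin=1.0):
    """Combine both dimensions into a label such as 'HMLC'."""
    move_level = level(move, move_mean, margin)
    catch_level = level(catch, catch_mean, margin)
    if move_level == "neutral" and catch_level == "neutral":
        return "neutral"
    if "neutral" in (move_level, catch_level):
        # Only one dimension is neutral.
        return "partially neutral"
    return f"{move_level[0].upper()}M{catch_level[0].upper()}C"


if __name__ == "__main__":
    # 3.384 is the reported mean for Catchiness Drums;
    # the urge to move mean of 3.5 is an invented placeholder.
    print(profile(move=6.0, catch=1.5, move_mean=3.5, catch_mean=3.384))
```

Applying `profile` separately to the drums, the keys/guitar, and the combination ratings of one observation yields the rating condition of the parts and the target category of the combination.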
For the actual modeling, we selected all observations that fulfilled one of eight rating conditions for the individual parts (Table 2) or were neutral, and whose target category was HMHC, HMLC, LMHC, LMLC, or neutral (N = 769). The other 501 observations (on average 3.945 per participant), mostly partially neutral conditions, were excluded.
Table 2. Number of observations per rating condition included in the categorical model.
For the Examination of H5
To compare the reported conscious contribution of parts to the actual ratings, we calculated one regression model for Move Combi and another for Catchiness Combi. These models included only the interaction between the individual parts’ ratings and the contribution variable (i.e., a participant's categorical indication of whether the main contributor to urge to move or catchiness is drums, keys/guitar, or both equally) without including them as separate main effects, e.g., Catch Combi ∼ Catch Drums:Contribution Catch + Catch Keys/Guitar:Contribution Catch. These models test whether drums influenced the combination more when participants identified them as the main contributor, or whether keys/guitar had a greater impact when they were selected as the main contributor.
Results
Correlations Between Urge to Move and Catchiness (H1)
Urge to move and catchiness are positively correlated across the three musical conditions: keys/guitar, drums, and combinations (Figure 2, Table 3). The strength of the correlation is medium to large in all conditions, but it is stronger for combinations than for keys/guitar (BF = 17), and the latter is stronger than for drums (BF > 1000). In general, the combinations were rated higher than the mean of the respective drums and keys/guitar ratings for urge to move and perceived catchiness (BF > 1000 for both).

Figure 2. Scatterplot of the individual ratings for the combinations (pink), drums (blue), and keys/guitar (green). The lines represent the correlations between urge to move and perceived catchiness in the respective colors, with shaded standard errors.
Table 3. Correlations between urge to move and catchiness in the three musical conditions.
Predicting Combination Ratings from Individual Parts Ratings (H2 and H3)
In the variable selection process, expertise, dance preferences, and popular music affinity proved redundant. This is not unexpected, since they influence the individual ratings (i.e., the other predictors; see Bechtold et al., 2024) as well as the combination ratings, and thus appear not to require additional modeling. Hence, the resulting models include the four ratings of the individual parts (Drums Move, Drums Catchiness, Keys/Guitar Move, Keys/Guitar Catchiness) and familiarity as predictors. We tested for interactions between the rating variables, but ultimately did not include them as they were not efficient (ELPD difference < 4 for both the urge to move and catchiness models).
For predicting the combinations’ urge to move, a model with by-participant slope and by-stimulus intercept proved to be efficient. This model explains a sizable amount of variance of the combination ratings (R2marginal Move = 0.390 [0.347, 0.429]; R2conditional Move = 0.483 [0.434, 0.530]). A summary of the model's fixed effects is shown in Table 4. Post-hoc hypothesis tests confirmed that the urge to move ratings of the individual parts are better predictors than the catchiness ratings (BF = 132), and that the estimates of keys/guitar are only nominally higher compared to the drums (BF = 3.768). Familiarity had a positive effect, comparable in strength to the catchiness ratings.
Table 4. Summary of the fixed effects of the combination urge to move prediction model.
For predicting catchiness, a model with by-participant and by-stimulus slopes proved efficient. The model's fixed effects explain slightly less variance compared to the urge to move model (R2marginal Catchiness = 0.336 [0.295, 0.381]; R2conditional Catchiness = 0.520 [0.471, 0.566]). A summary of the model's fixed effects can be found in Table 5. We can see that Move Drums has no relevant effect. Presumably, all its explanatory potential is taken over by the remaining three variables. Post-hoc hypothesis tests provided clear evidence that catchiness is more important than urge to move, and keys/guitar are more important than drums (BF > 1000 for both). Familiarity showed a strong positive effect.
Table 5. Summary of the fixed effects of the combination catchiness prediction model.
Predicting Combination Ratings in Specific Rating Conditions (H4)
Figure 3 shows an overview of the average ratings by rating condition. A visual inspection suggests that our hypothesis H4, that high levels of individual parts are retained in the combination, is not confirmed (e.g., HMHC Drums + LMLC Keys/Guitar does not lead to HMHC combinations). At the same time, the visual inspection does not suggest a different consistent simple rule of thumb: some cases may suggest averaging the individual ratings (e.g., HMHC Drums + HMHC Keys/Guitar; HMLC Drums + LMHC Keys/Guitar), others suggest summing (LMLC Drums + LMLC Keys/Guitar), subtracting (HMHC Drums + LMLC Keys/Guitar), or nothing of the sort (HMLC Drums + HMLC Keys/Guitar).

Figure 3. Mean urge to move and catchiness ratings for the eight selected rating conditions.
After fitting the categorical model (with by-participant and by-stimulus random slopes, and including familiarity as a covariate), we conducted four hypothesis tests for each rating condition. First, we checked whether the expected target category was likely by testing whether the corresponding estimate was positive. We then proceeded to check whether this target category was more likely than the three other options by testing whether the respective estimate was larger than each of the alternatives. If we found evidence that our expectation was wrong, we checked for another target category as well. The results of these tests are shown in Table 6.
Table 6. The eight rating conditions with their respective hypothesis tests. Bayes factors indicating very strong evidence (> 30 or < 1/30) are shown in bold, and those indicating strong evidence (> 10 or < 1/10) are shown in italics.
We can gather from Table 6 that only two outcomes can be reliably predicted, namely those where all individual ratings are either high or low (HMHC + HMHC = HMHC, LMLC + LMLC = LMLC). For the others, there is at best moderate evidence (BFs between 3 and 10 or between 1/10 and 1/3). The two complementary rating conditions (HMLC Drums + LMHC Keys/Guitar and LMHC Drums + HMLC Keys/Guitar) are particularly hard to predict. The relatively low number of data points for these (N = 25 and N = 18) potentially diminishes the model's predictive power for these categories. We found evidence against two of the expectations. HMLC + HMLC = HMLC is unlikely, and moderate evidence suggests that other target categories are likelier. However, the respective hypothesis tests for the target categories HMHC (BF = 1.756), LMHC (BF = 0.342), and LMLC (BF = 1.712) did not provide a clear outcome either. There is only anecdotal evidence against HMLC + LMHC = HMHC, and none of the other target categories is more likely.
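The verbal evidence labels used in this section follow fixed Bayes factor cutoffs, which the following Python sketch makes explicit. The cutoffs of 3, 10, and 30 and the labels are those used in the text; the function name `evidence_label`, the treatment of values exactly on a cutoff, and the label "no evidence either way" for a Bayes factor of exactly 1 are assumptions made for this illustration.

```python
def evidence_label(bf):
    """Translate a Bayes factor into a verbal evidence category.

    Values above 1 support the tested hypothesis, values below 1
    speak against it; the reciprocal gives the strength in that case.
    """
    if bf <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf == 1:
        return "no evidence either way"
    direction = "for" if bf > 1 else "against"
    strength = bf if bf > 1 else 1 / bf
    if strength > 30:
        category = "very strong"
    elif strength > 10:
        category = "strong"
    elif strength > 3:
        category = "moderate"
    else:
        category = "anecdotal"
    return f"{category} evidence {direction}"


if __name__ == "__main__":
    for bf in (17, 1.756, 0.342):
        print(bf, evidence_label(bf))
```

Reading Bayes factors below 1 through their reciprocal keeps the two directions symmetric, so that, for example, 1/10 marks the same boundary against a hypothesis that 10 marks in its favor.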
Contribution of the Individual Instruments (H5)
We have two different kinds of data on the contribution of the instruments to the combination ratings. First, we can deduce from the estimates of the prediction models above how much an instrument predicts the combination rating: keys/guitar contributed more than drums for catchiness (BF > 1000) and also, though not conclusively, for urge to move (BF = 3.768).
Second, we asked participants directly about the conscious contribution of the individual instruments. They frequently indicated that both instruments contributed similarly or not at all (36% for catchiness, 40% for urge to move). Participants rarely indicated that drums are the main contributor to perceived catchiness (14%) or urge to move (20%). Keys/guitar were deemed to contribute more to the urge to move in 40% of the cases and to catchiness in 50%. As with the data above, we see a predominance of keys/guitar for catchiness and, a bit less strongly, for urge to move. To assess whether participants’ impressions of contribution are reflected in their ratings, we calculated models for Urge to Move Combi and Catchiness Combi predicted by an interaction between the ratings of the individual parts and the indicated instrument contribution, e.g., Move Combi ∼ Move Drums:Contribution Move + Move Keys/Guitar:Contribution Move.
For the urge to move model, a by-participant random slope and a by-stimulus random intercept are efficient. The model provided no evidence (BF = 1.943) that the estimate for Move Drums is larger for Contribution Drums (B = 0.277) than it is for Contribution Keys/Guitar (B = 0.251). Similarly, there is no evidence (BF = 1.210) that Move Keys/Guitar is larger for Contribution Keys/Guitar (B = 0.261) than for Contribution Drums (B = 0.254). For catchiness, by-participant and by-stimulus random slopes are efficient. Again, we found no evidence (BF = 1.331) that Catchiness Drums is more important for Contribution Drums (B = 0.154) than it is for Contribution Keys/Guitar (B = 0.140). The effect of Catchiness Keys/Guitar for Contribution Keys/Guitar (B = 0.333) is nominally lower than for Contribution Drums (B = 0.356), and there is anecdotal evidence against it being larger (BF = 0.700). In summary, the evidence for all these hypotheses suggests that listeners’ perception of instrument contribution is not reflected in their ratings of the combinations—the selected instrument did not have a greater impact on the combination’s rating.
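The interaction-only formula for the urge to move model can be written out as a linear predictor. The following is a sketch of the fixed-effects part only, assuming a standard intercept; the symbols are illustrative and not taken from the analysis script, and the random effects described above are omitted. Here $p$ indexes participants, $s$ indexes stimuli, and $\mathcal{C}$ is the set of contribution responses.

```latex
\[
\widehat{\mathrm{MoveCombi}}_{ps}
  = \beta_0
  + \sum_{c \in \mathcal{C}} \beta^{D}_{c}\,
      \mathrm{MoveDrums}_{ps}\,\mathbb{1}\!\left[\mathrm{Contribution}_{ps} = c\right]
  + \sum_{c \in \mathcal{C}} \beta^{K}_{c}\,
      \mathrm{MoveKeys}_{ps}\,\mathbb{1}\!\left[\mathrm{Contribution}_{ps} = c\right]
\]
```

In this notation, the first hypothesis test compares $\beta^{D}_{\mathrm{drums}}$ (B = 0.277) with $\beta^{D}_{\mathrm{keys}}$ (B = 0.251), and the second compares $\beta^{K}_{\mathrm{keys}}$ (B = 0.261) with $\beta^{K}_{\mathrm{drums}}$ (B = 0.254). The catchiness model has the same form with the catchiness ratings as predictors.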
Discussion
Our results show a complex picture regarding our hypotheses and implications for understanding the relationship between groove and catchiness, as well as between polyphonic music and its individual parts. First, we discuss the correlation between the urge to move and catchiness across the musical conditions (H1). Second, we discuss the general feasibility of predicting a combination's effect on participants’ ratings from the effects of the individual parts and how we interpret these results regarding the relationship between polyphonic music and its individual parts (H2). Third, we examine how the urge to move and catchiness of the parts interact in predicting the experience of the combination (H3) and what that means for relating groove and catchiness. Fourth, we focus on specific conditions and how these results relate to the theories about groove-catchiness interaction (H4). Fifth, we discuss the contribution of the drums and keys/guitar to the experience of the combinations (H5).
Correlations Between Urge to Move and Perceived Catchiness (H1)
We found medium-to-large positive correlations between urge to move and perceived catchiness across the three musical conditions: keys/guitar, drums, and combination. This corroborates previous findings (Bechtold et al., 2024) for drums and keys/guitar, and supports our hypotheses for the combinations. The correlation is slightly stronger for the combinations, which can be cautiously interpreted with one of the interaction theories in Bechtold et al. (2023): in polyphonic music, groove and catchiness can fuse into a single, indivisible experience—which would manifest in a stronger correlation. The combinations are generally rated higher on urge to move (comparable to full drum sets in Seeberg et al., 2024) and perceived catchiness compared to the individual parts. This supports another interaction theory: if groove and catchiness support each other or fuse together, the resulting music is groovier and catchier than the individual parts.
Relationship Between Combinations and Parts (H2)
Predicting a combination's rating from the ratings of its individual parts was generally feasible. Thus, the perception and effect of the combination are related to the perception and effect of its individual parts. However, the variance explained by the individual ratings and familiarity (39% for urge to move and 34% for perceived catchiness) is deceptive. In the present experiment, the same participants rated the individual parts and the combinations, i.e., their personal background, taste, and experience are kept constant (as shown by the fact that modeling them was unnecessary), and we can assume the same for the listening situation, i.e., the environment (Swarbrick et al., 2019; Trost et al., 2024) and participants' mood (Kawase, 2024). The musical content is also well controlled: we presented the drums, the keys/guitar, and then the two exact same audio tracks simultaneously for the combinations. In this context, 61% and 64% unexplained variance are large. As a comparison, Witek et al. (2014) were able to explain 35% of the variance in urge to move ratings just with the music's degree of syncopation as an objective measure. Possibly, participants did not perceive the combination as two segregated streams of the same two individual parts as before. Instead, they perceived a single stream of music in which the parts are integrated (Bigand et al., 2000) and interact in some form, potentially as foreground and background (Sloboda & Edworthy, 1981), which we will discuss below. The amount of unexplained variance suggests that this change in perception is substantial and thus the effect of the whole is different from the combined effects of its parts. This is comparable to findings that adding voices can change the perceived valence of music (Broze et al., 2014).
As a result, if the effect of the individual parts is not sufficient to explain the effect of the combination, any conclusions we might draw from musicological analysis about why an individual part has had an effect can only speculatively be transferred to a combination. As a hypothetical example, Senn et al. (2018) and Matthews et al. (2019) identified musical structures as groovy (certain pattern categories and mid-complex rhythms, respectively), but we cannot conclude from these that a combination featuring such structures in one of its parts is likewise likely to be perceived as groovy. In summary, our results suggest that the groove of a drum beat or the catchiness of a melody can give a limited indication about the experience of the whole music (here, with just two parts involved). As a more general consequence, our results show the necessity to study groove and catchiness in a polyphonic context, in the best case using naturalistic music recordings.
Interactions Between Urge to Move and Catchiness (H3)
The individual part ratings all showed positive effects on the combination ratings in our models, indicating that these work in the same direction: there is no compensation or negative correction between the instruments and ratings. In the catchiness prediction model, the urge to move of the drums had no relevant effect, but this does not mean that it is by itself unrelated to the combination's catchiness. Rather, it became redundant because the other predictors accounted for its explanatory potential. Our results show that both experiences as well as both instruments are valuable predictors for the combinations. Including the catchiness ratings for drums and keys/guitar improved the predictive power for the combination's urge to move compared to a model that only accounts for the urge to move of the individual parts, and vice versa. This means that the perceived catchiness of polyphonic music is influenced by the urge to move associated with its parts, and that the experienced urge to move of polyphonic music is affected by the perceived catchiness of its parts. One explanation for this (and the predominance of keys/guitar) is the perception of the combination in a foreground-background dichotomy (Sloboda & Edworthy, 1981). This is reminiscent of Seabrook's (2015) figurative description of a hit as a rhythmic groove with a hook “on top,” which implies such a hierarchy. The musicians in Bechtold et al. (2023, p.366) noted that “catchiness often works mainly through the relation between melody and groove” and thus a background can make a melody shine. For groove, it has also been theorized that such an integrated perception of instruments as foreground-background is beneficial, while an analytical perception of segregated streams would be disruptive (Roholt, 2014). As a main takeaway, our analysis showed that catchiness and the urge to move in polyphonic music are interlinked and mutually benefit each other.
Consequently, this suggests including catchiness as a factor in comprehensive groove models (Senn et al., 2023a, 2023b) or definitions (Duman et al., 2024), and experienced urge to move as a factor when analyzing music's catchiness (Burgoyne et al., 2013), and by extension its hooks (Byron & O’Regan, 2022) or memorability (Tseng et al., 2023).
Combining Patterns with Specific Conditions (H4)
Our categorical model revealed that predictions work well when drums and keys/guitar ratings are congruent but become uncertain or even impossible when these ratings are varied (reflecting the explained variance in the other models above). This contradicts our expectations: we postulated a rule of thumb that high ratings in the individual parts would be retained in the combination. But apparently, this is not the case. For example, a drum beat that elicits a high urge to move in isolation can lose this effect when combined with a keys/guitar pattern. This aligns with theories and findings that most listeners develop strategies to integrate musical streams in perception (Bigand et al., 2000; Disbergen et al., 2018), which means that the individual parts are not attended to in isolated form when listening to a combination of patterns. Consequently, the experience of the combination interferes with and overrules the experience of the individual parts.
One rule of thumb was supported: nothing comes from nothing. If the individual parts neither elicit an urge to move nor catchiness by themselves, they also have no such effect when combined. This means that any experienced catchiness or urge to move in response to polyphonic music is ultimately traceable, at least to some extent, to one of its individual parts.
In Bechtold et al. (2023), the musicians described some music as a combination of groovy but non-catchy parts with catchy but non-groovy parts that results in a groovy and catchy polyphonic outcome, and reported this as a common compositional strategy. Our results in this study do not support the idea that listeners reliably perceive this outcome: observations in which the individual parts were perceived in this complementarily varied way could plausibly lead to any outcome according to our model. Possibly, participants perceived the complementarily varied parts as not fitting together, or could not integrate the streams, which hindered the positive effects of combining them. Presumably, professionally composed, produced, and performed music, such as the music under scrutiny in Bechtold et al. (2023), can avoid such cases, and they are unlikely to be found in successful and famous popular music.
There is another potential takeaway for the higher-level relationship between the urge to move and catchiness. For similarly varied cases (HMLC + HMLC or LMHC + LMHC), the aggregated means in Figure 3 suggest that the low-rated experience became higher in the combination while the high-rated experience became slightly lower. This is evidence for the expected fundamentally positive interaction between groove and catchiness (Bechtold et al., 2023). The reinforced presence of the high-rated experience fosters the increase of the low-rated one: a catchy drum beat and a catchy keys/guitar riff create some urge to move as a spill-over effect (likely via pleasure, see Bechtold et al., 2024). However, this interaction seems unidirectional: in this example, combining two high catchiness patterns increased the low urge to move compared to the individual parts, but this, in turn, did not further increase the already high catchiness compared to the individual parts. Consequently, contrary to the general tendency, such a spill-over effect does not lead to overall higher levels of groove and catchiness compared to the individual parts, as catchiness remained similar. This partly contradicts the theory in Bechtold et al. (2023) that groove and catchiness are completely independent when one of them is absent or hardly present. Combining the results of the similarly varied ratings with the results regarding congruent ratings suggests that polyphonic music is only likely to make people move and to be perceived as catchy when the individual parts are by themselves efficacious in both these regards. In the previous interview study, this is the condition that the music creators saw as the basis for fusion, i.e., groove and catchiness blending into one experience.
The fact that congruent high ratings in the parts reliably lead to high(er) rated combinations, and that the correlation between urge to move and catchiness becomes stronger for the combinations, can be explained by fusion. Yet we cannot claim that our results directly confirm the blending of experiences, as we have no explicit data on this. All we can say is that our findings are indirectly compatible with this explanation.
Contribution of Instruments and Parts (H5)
We asked participants directly which instrument contributes more to the perceived catchiness or the urge to move. Cases in which the drums contribute more were reported rather rarely, and most responses were in favor of the keys/guitar or indicating a similar contribution, even more so for catchiness than for the urge to move.
The results of our prediction models (which also pointed towards a predominance of keys/guitar) and participants’ indications appeared compatible, but our analysis showed that a participant's impression of which instrument contributes more is not well reflected in how strongly the respective part rating affects the combination rating by the same participant. This might have had methodological reasons. Possibly, the direct question was hard to answer, especially for less musically inclined participants, such as those in our sample: it required participants to segregate the music and assess the effect of each instrument separately. This contrasts with learned strategies to integrate streams (Bigand et al., 2000; Disbergen et al., 2018), and might have favored a foreground-background perception (Sloboda & Edworthy, 1981) to better cope with the task. Additionally, hearing the patterns together might have changed their assessment compared to hearing them in isolation before (e.g., because a part might not have made much sense without the context, and once heard in context could hardly be imagined without it), i.e., hearing the combination in an integrated way already interfered.
Neither of the analyses supported the hypothesis that drums were more important for the urge to move. In contrast, the expectation that keys/guitar are more influential for catchiness was fulfilled. There is a general bias that melody (i.e., keys/guitar) dominates the accompaniment (i.e., drums) in perception (Ragert et al., 2014; Tagg, 2003a, 2003b), which could explain the predominance of keys/guitar. But this does not correspond to Düvel et al. (2022), in which drums were more important for the urge to move than melody and remaining accompaniment, and the elevated position of drums in many groove studies. In our experiment, it is likely that the two instruments vied for attention (Kronengold, 2005; Sloboda & Edworthy, 1981), and the keys/guitar came out on top due to being more related to catchiness. In consequence, this means that drums are only seen as being most important for groove when the drum parts are deemed catchy. The stimuli in Düvel et al. (2022) were mostly chosen from the Lucerne Groove Library (Senn et al., 2018), which consists of songs selected because of the drummers. Hence, in their selection, often including famous drum beats, the drums might generally have been catchier and thus played a larger role than the ones we used in this study.
Limitations
The study has four main limitations that restrict the general conclusions that can be drawn. First, with only 10 combinations, and the 20 parts they consist of, the number of stimuli, i.e., the researched context, is relatively small. The style variation of the music is limited, as other characteristics, specifically previous ratings, were more important in selecting the stimuli from a larger corpus. Future investigations are necessary to corroborate the findings with more numerous and more stylistically varied music. Second, the context is further limited by the type of music we used. We gave our reasons why we opted for drums and keys/guitar for this initial investigation, but we can hardly transfer our conclusions one-to-one onto more complex polyphonic music, e.g., full band recordings with vocals. In the future, a similar experiment with carefully decomposed multi-track recordings could clarify whether our main findings for the relationship between groove and catchiness, as well as between the whole and individual parts, are valid for a more comprehensive whole. Third, the number of data points in two rating conditions is rather low, which potentially increased the uncertainty in the prediction for these in the categorical model. Related to this is the last limitation: we set relative split points for low, high, and neutral using the respective means, and hence, different ones for urge to move and catchiness, and for drums, keys/guitar, and combinations. In consequence, a drum beat with a specific urge to move rating might be categorized as low, whereas a keys/guitar pattern with the same catchiness rating might be neutral. Results could slightly differ if an absolute cut-off value (e.g., 3 for all ratings as neutral) were chosen. Yet the relative method allowed us to keep more data for the analysis.
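The effect of relative split points can be illustrated with a minimal sketch. It assumes that "neutral" is a band around the respective mean; the band width, the function name `categorize`, and all ratings below are invented for illustration and do not reproduce the study's procedure or data.

```python
from statistics import mean


def categorize(rating: float, reference: float, band: float = 0.5) -> str:
    """Label a rating as low, neutral, or high relative to a reference value.

    `band` is a hypothetical half-width of the neutral zone around the
    reference; the study's actual split rule may differ.
    """
    if rating > reference + band:
        return "high"
    if rating < reference - band:
        return "low"
    return "neutral"


# Invented example ratings (not data from the experiment).
drum_move_ratings = [3.5, 4.0, 4.5, 4.0]    # mean = 4.0
keys_catch_ratings = [2.5, 3.0, 3.5, 3.0]   # mean = 3.0

same_rating = 3.0

# Relative split points: the same value lands in different categories.
print(categorize(same_rating, mean(drum_move_ratings)))    # low
print(categorize(same_rating, mean(keys_catch_ratings)))   # neutral

# Absolute cut-off (3 as neutral for all ratings): both are neutral.
print(categorize(same_rating, 3.0))                        # neutral
```

In this invented example, a rating of 3.0 counts as low against a mean of 4.0 but as neutral against a mean of 3.0, whereas an absolute cut-off treats both identically.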
Conclusion
In this study, we investigated how groove, or more specifically, the urge to move to music, and perceived catchiness are related in polyphonic music. We were able to show that the two often co-occur in listeners in response to the same music. But it is not mere co-occurrence: the two are often interlinked when parts are combined, and we found evidence of mutual support or positive interaction. We also found a positive unidirectional interaction in the form of a spill-over, which refines the theory about independence of groove and catchiness when one of them is not or hardly present. We found no direct evidence for the third form of interaction, a fusion of groove and catchiness, but the related theory can be used to explain our results. In summary, the study suggests a close and interlinked relationship between groove and catchiness.
However, we were not able to confirm simple rules for how combined parts are experienced. We know what likely happens if the drumbeat and the keys/guitar pattern are both groovy and catchy: the combination will also be groovy and catchy. For other conditions, we cannot predict the perceived outcome of a combination of patterns—it might depend on the exact music and how the parts fit together. But we can assume that in popular hit songs, the parts do fit together in a way that makes a groovy and catchy outcome likely, as the music creators in Bechtold et al. (2023) proposed.
The study also investigated how the effects of “a whole” relate to the effects of its parts in the context of two combined parts. We showed that the individual parts influence the perception and effects of the whole to some extent, but also showed how the whole is more than just the sum of its parts, and predictions based on parts can be difficult. The experience of the whole can overwrite the experience of the parts. In consequence, investigating stems or isolated patterns of music allows for limited conclusions on our common music listening experiences.
Footnotes
Action Editor
Alexander Refsum Jensenius, University of Oslo, RITMO Centre for Interdisciplinary Studies in Rhythm, Time, and Motion, and Department of Musicology
Peer Review
Connor Spiech, Concordia University, Department of Psychology. Patti Nijhuis, University of Jyväskylä, Department of Music, Art and Culture Studies.
Author Contribution
TB: Conceptualization, Data Curation, Formal Analysis, Investigation, Methodology, Writing—original draft, Writing—review and editing. BC: Writing—review and editing. MW: Conceptualization, Methodology, Writing—review and editing.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical Considerations
The experiment was conducted in accordance with the Declaration of Helsinki and its design was approved by the University of Birmingham's Humanities and Social Sciences Ethical Review Committee (ERN_20-0007).
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Consent to Participate
The participants were informed about the study and gave consent by button click.
Consent for Publication
Not applicable.
Data Availability Statement
The experimental data, audio stimuli, transcriptions, and analysis script are available at https://osf.io/ez7g5/ (Bechtold, 2025).
