Abstract
This paper explores the correlation between body movements and the voice in Hindustani Dhrupad vocal improvisation. It focuses on the effort exerted by singers during manual interactions with imaginary objects (MIIOs) they are often observed executing, such as stretching, pulling, pushing, throwing etc. Despite the recognized role of effort in music expressivity, its systematic study has been surprisingly overlooked. The paper employs a video observation analysis on originally recorded audio-visual material of vocal improvisations. It relies on an iterative process of observations to identify, label, and classify repeated gestural events that allude to MIIOs, and on a correlation analysis between MIIO gesture classes, effort levels, and melodic aspects. Despite variations in the way individual vocalists may gesturally demonstrate MIIOs, results reveal a degree of consistency across them. Distinct patterns of effort-related cross-modal associations were observed, with effort pertaining to expressive gestural aspects in rendering MIIOs, the pitch space organization of melodic modes, the mechanical aspects of voice production, the macro-structure of the improvisation, and cross-modal structural analogies understood as morphological similarities in shape and contour. The study demonstrates an essential paradigm shift in gesture–sound studies, moving beyond traditional melographic representations to emphasize the dynamic aspects of effort-related gesture–sound associations.
Keywords
Introduction
Recent years have seen a shift towards embodied approaches to music performance and cognition (Cox, 2016; Leman, 2007; Reybrouck, 2021), prompting cross-modal studies of associations between movement and sound—whether literal or metaphorical—that range from strict gesture–sound couplings to looser correspondences between bodily movement and sound-related features (Jensenius, 2007). Notwithstanding this recent shift, much of this research has largely focused on geometric and topological representations of sound, such as shapes and textures (Küssner & Leech-Wilkinson, 2014; Shinohara et al., 2016)—even when addressing the manipulation of objects and materials (Deroy et al., 2013; Mesz et al., 2023)—often overlooking the more experiential, bodily perspectives, which emphasize the dynamic properties of such interactions.
In the context of strict gesture–sound couplings—where movement directly causes sound production, as in instrumental gestures (Bianco et al., 2010; Rasamimanana, 2012)—the significance of mechanical energy exchange that is dictated by physical laws has attracted attention in previous studies (Mion et al., 2010). In this scenario, force applied through movement against the resistance of the physical artifact is considered as the cause and sound as the resultant effect, a concept approached through the field of kinetics in biomechanics (Winter, 2009). In contrast, the ergotic (Cadoz & Wanderley, 2000; Cadoz, 1994; Luciani, 2007)—force-based—qualities of ancillary or sound-accompanying gestures (Nusseck & Wanderley, 2009)—that do not directly produce sound but co-occur with music, such as those involved in singing—have remained largely unexamined. In these cases, despite the absence of actual material interaction and sensation of resistance, executed gestures may still convey the impression of varying degrees of power and exhibit what could be termed quasi-ergotic characteristics, in that they simulate the enaction of energy transfer. However, most existing research has focused predominantly on the relationship between melodic contours and the spatial geometry of hand gestures, largely neglecting their dynamic aspects. This gap calls for exploring the significance of how energy and effort are implicated in the way gestures without physical contact may be linked to sound in musical contexts, such as in singing.
Until recently, the role of accompanying gestures in singing performance had received relatively little scholarly attention. Interest in this field has since grown markedly, as evidenced by an increasing number of publications, many of them centered on Indian classical music (e.g., Clayton et al., 2022; Clayton & Leante, 2013; Erdemir et al., 2012; Fatone et al., 2011; Luck & Toiviainen, 2008; Liao & Davidson, 2007; Moran, 2007; Paschalidou, 2022; Pearson & Pouw, 2022; Rahaim, 2012). Specifically, in Dhrupad vocal music, a sub-genre of Hindustani or north Indian classical music, singers often simulate interactions with objects during improvisation, engaging in gestures that resemble stretching, pulling, pushing, or throwing. During these manual interactions with imaginary objects (labeled in this manuscript as “MIIOs”), singers appear to engage with the melody by sculpting spaces as if they were real to convey different types of sonic information. These spaces are imagined as filled with malleable substances (Paschalidou, 2024), allowing for a variety of interactions that convey a unique sense of resistance and call for different levels of energy input or effort exertion by the performer.
These ideas are not unique to Hindustani vocal music; listeners and performers across various music genres report experiencing virtual worlds of forces when engaging with music (Eitan & Granot, 2007; Fatone et al., 2011)—particularly in slower passages (Eitan & Granot, 2006)—through analogues to gravity, magnetism, inertia and other physical metaphors (“musical forces”; Larson, 2004), with the exerted effort being recognized as a fundamental aspect of expressive potency (Ryan, 1991). What distinguishes Dhrupad, however, is the readily recognizable gestural expression of physically inspired concepts through MIIOs. Familiar, tangible interactions with the real world, which do not typically produce sounds but are spontaneously imitated by Hindustani vocalists during gestures that lack a real mediator, may unveil significant cognitive processes linked to deeper concepts, surpassing straightforward mechanical connections between a specific instrument and its sensory output.
Previous studies in Hindustani music have shown that vocalists navigate melodic spaces (Neuman, 2004) not only through the spatial geometry of hand movements and their trajectories (the melographic representation in 3D Euclidean space; Rahaim, 2012) but also through embodied sensations of force, resistance, and the effort exerted to overcome them while moving an (only imagined) object (Paschalidou, 2017). This idea is supported by interview material in Dhrupad vocal music, in which vocalists often resort to motor-based metaphors and embodied interactions with imaginary objects and materials when discussing sound and melody (Paschalidou, 2024; Paschalidou & Clayton, 2015). Effort during MIIOs in Dhrupad singing has also been studied by inferring vocalists’ effort levels through linear regression models based on acoustic and movement features (Paschalidou et al., 2016; Paschalidou, 2022). The current paper extends this line of research by relying on the annotation and analysis of audio-visual recordings of Dhrupad vocal improvisations. It examines the role of effort during MIIOs by exploring its relationship to MIIO types and their melodic counterparts. It further seeks to assess the consistency of such effort-related gesture–sound links across performers, discern idiosyncrasies, and explore different dimensions of effort that may correspond either to conceptual elements, such as the melodic organization, or to mechanical aspects, such as the demands of vocal production.
Background
Introduction to Dhrupad
Dhrupad, one of the two predominant styles of Hindustani music (Sanyal & Widdess, 2023), is a monophonic, primarily vocal tradition that relies heavily on a form of improvisation known as alap. This improvisational style adheres strictly to rule-based structures within the raga system, a melodic mode lying between scale and tune (Powers & Widdess, 2001). The tonic is defined by the main performer’s comfortable pitch, and all other pitches are tuned relative to this. The improvisation unfolds gradually, starting at a notably slow pace on the middle tonic, then moving into the vocalist's lowest pitch range (the lowest octave) and building up dramatically in pace, pitch, and melodic tension as it ascends over approximately 2.5 octaves towards the climax, typically the 3rd scale degree of the highest octave, before finally descending to the middle tonic again. The rendition is sung without apparent rhythm, using a repertoire of non-lexical syllables like “ra,” “na,” and “num.” Melodic tension is periodically released through longer stops on the tonic and shorter stops on the 5th, ultimately resolving when the climax is reached.
Dhrupad as Case Study
The conceptualization of melody as a continuous pitch “space” (Fatone et al., 2011), where Dhrupad vocalists often approach distinct notes through smooth trajectories instead of simple scale steps (Battey, 2004), makes Dhrupad particularly conducive to investigating associations with the non-discrete nature of movement. Since Dhrupad is an oral music tradition, its knowledge is passed down through direct demonstration and imitation rather than through music notation, and it encompasses not only sound but also movement. The resulting gestural resemblance between teacher and students (Rahaim, 2012 and personal observations), evident despite the lack of explicit gestural instructions, supports the intentional decision to gather material from a single music lineage. This approach allows for an examination of both the similarities (inherited bodily dispositions) and differences (idiosyncrasies) in the gestural habits of musicians who share the same teacher.
Gestures in Hindustani Vocal Improvisation
In Dhrupad, singers appear to engage with the melodic content in two distinct modes, as suggested by Rahaim (2012), reflecting different body-voice relationships: the open-handed and the closed-handed. The open-handed mode features hands tracing curves and trajectories effortlessly in space, offering a simple melographic representation of the sound (ibid.). In contrast, the closed-handed mode involves powerful movements that start with the formation of a grip and comprise gripping, intensification, and releasing phases, resembling actions of engagement with an (only imagined) object, such as when stretching or compressing an elastic material, pulling or pushing a heavy object, or throwing and bouncing a ball. These powerful movements, however, do not straightforwardly represent the melodic sound in terms of spatial pitch height, nor do they carry any symbolic meaning. Instead, they seem to indicate proprioceptive sensations of resistance employed by singers to manipulate notes as smooth pitch glides and, hence, they predominantly convey dynamic aspects of their acoustic counterpart. Despite the absence of a real object, a noticeable correspondence between voice and manipulative gestures can be visually perceived by a third person, seemingly mediated through the imagined material and its assumed resistance. The way these gestures are executed conveys to an observer a heightened sense of effort exertion, reminiscent of the effort required to overcome the resistance felt when manipulating a real tangible object, either fighting against or yielding to it.
Based on the conceptualization of melody as activity occurring within the imagined pitch space of the raga, melodic movements in MIIOs appear to be regulated by the effortful gestures enacted by the performer as a force agent. These gestures often express an embodied negotiation with an imagined counterforce, as if interacting with a physical object that either resists or yields in response to the applied force. The degree and dynamics in exerting effort appear to be governed by the perceived materiality and physical properties of the imagined object and to be regulated by the force applied against its imagined resistance—forming a delicate equilibrium that defines the energetic congruence between gesture and sound.
Drawing on embodied (music) cognition theories (Leman et al., 2018; O’Regan & Noë, 2001; Varela et al., 1993) and extending Gibson's ecological theory of affordances (Gibson, 1979) to encompass objects in the imagistic domain too, the current study explores the premise that musical thinking in Dhrupad singing is rooted in ubiquitous patterns of actions we are familiar with from interacting with the real world. It also investigates the assumption that the link to the sound, when Dhrupad singers seemingly engage with an imaginary object, resides in the interaction possibilities the object may afford and the effort it requires for its manipulation. Enactive theories and ecological psychology highlight the importance of sensorimotor skills and robust movement–sound contingencies (“know-how”) developed through real-world interactions with objects (Freed, 1990; Gibson, 1979; Warren & Verbrugge, 1984). Vygotsky and Gal’perin argue that mental acts stem from material actions (Parreren & Carpay, 1972), while Bakker et al. (2012) assert that advanced cognitive functions develop through patterns of gestural manipulation of physical objects. These patterns serve as the foundation for our understanding of any sound (Godøy, 2009; Maes et al., 2014), both actual and virtual (Clarke, 2005), as we continually simulate sonic features or actions in our imagistic domain (Cox, 2011; Reybrouck, 2012; Zbikowski, 2002). This perspective allows us to understand MIIOs as carriers of archetypal patterns and behavioral opportunities that are defined by the physical properties and resistive forces (viscosity, elasticity, weight, and friction) of the imagined materials and are linked to certain patterns of sonic outcomes. However, the relationship between hand gestures and voice in MIIOs has not been systematically studied with respect to perceived levels of effort.
Effort
The term “effort” is commonly used in everyday language to convey the level of exertion needed to achieve a goal. While it underscores the forcefulness of actions (often preferred over more effortless ones; Inzlicht et al., 2018) and highlights intentionality, defining it precisely in scientific fields like physiology, kinesiology, biology, neuroscience, and psychology is challenging (Massin, 2017; Richter & Wright, 2014). This challenge stems from effort's perceptual and subjective character (Steele, 2020) and the intricate nature of goals, which may include both physical and cognitive aspects, making effort elusive and hard to measure (Dewey, 1897). Physical force-related proxies, such as “weight” (Niewiadomski et al., 2013), “pressure” (Moore & Yamamoto, 2012), kinetic energy (Piana et al., 2013), or “Quantity of Motion” (Mazzarino et al., 2009), are tempting. However, individuals possess varying capacities (intrinsic factors, such as fitness) to accomplish a given physically and/or mentally demanding task (extrinsic factor), which makes effort difficult to quantify through such proxies alone.
In psychology and neuroscience, mental (or cognitive) effort refers to the subjective feeling of cognitive strain needed for a task (Westbrook & Braver, 2015; Mulder, 1986), linked to mental imagery (Papadelis et al., 2007) and attention (Bruya & Tang, 2018; Kahneman, 1973). In dance and kinesiology, effort is a subjective measure of expression tied to inner intention and the forces shaping and constraining movement (Laban & Lawrence, 1974). In music, it reflects the tension in a piece (Cox, 2016; Krefeld & Waisvisz, 1990) and is deemed crucial for both performers and audiences (Olsen & Dean, 2016). Performers must endure a certain degree of hardship to highlight particularly intense musical segments, while audience excitement is often fueled by performers’ gestural virtuosity and energy exertion (Vertegaal et al., 1996). Godøy (2006, 2009) emphasized how the sensation of effort and energy transfer is conveyed through the gestural shaping of musical sound, proposing an extension of Schaeffer's (1966) concept of the sonorous object into what he termed the gestural-sonorous object. Despite its significance, effort in music has received limited systematic and experimental attention, with related studies and publications emerging only in recent years (e.g., Bennett et al., 2007; Paschalidou et al., 2016; Paschalidou, 2022; Tomás et al., 2021). This study relies on third-person annotations of perceived effort levels that the performer appeared to commit to each MIIO event. Despite the absence of a real object, observers were expected to be able to make such judgments through visual observation of the dynamic and kinetic properties of the observed movement.
Aims and Objectives
The primary aim of this study is to better understand the role of effort in MIIOs and investigate its potential correlation with types of MIIOs and distinctive melodic attributes during Dhrupad vocal improvisation. The objectives of the study comprise:
- Identifying specific cross-modal associations between effort levels, classes of gestures, and coded melodic elements when the singer seems to engage with imaginary objects;
- Determining whether these relationships, if they exist, are limited to a particular performer and performance or are more generic;
- Exploring whether levels of bodily effort correlate predominantly with the physical demands of vocal production or rather with melodic aspects.
Methodology
Methodological Approach
To ensure ecological validity, designed experiments were avoided. Instead, the study relied on a non-participant, third-person video observation analysis of qualitative data, namely audio-visual material from selected improvisation performances that were recorded specifically for this study in the field, in India. This allowed the examination of gesture–sound links that vocalists have established over years of practice rather than spontaneous responses to stimuli.
Data Collection
To ensure the gestural consistency often seen within teacher–student lineages (Rahaim, 2012), all selected performers were disciples of Zia Fariduddin Dagar, who was also recorded. Participants were only informed that the recordings were part of a research project on music and movement, and were asked to perform only the opening, slowest, un-metered section of an alap improvisation, sung to non-lexical syllables without percussion accompaniment, to isolate melodic factors without the influence of meter or lyrics. The setup included a soloist and a tanpura player, and videos were captured in night-shot mode (infrared light) due to the low lighting required for concurrent motion capture (not analyzed here). Ethics approval was granted prior to data collection. Before each performance, all participants signed written informed consent and recording agreement release forms that specified both the collection and the use of the data (including publication). Participants retained the right to withdraw at any time. A small compensation per session was offered to the participants.
Of the 17 recorded performances, those of four performers (three performances, one of which was a duet) were selected for detailed analysis based on the frequency and clarity of MIIOs: Afzal Hussain, Lakhan Lal Sahu, and the Gundecha brothers (Umakant and Ramakant), as shown in Table 1. All participants were recorded for this research during field work in India in January 2011: Hussain (11.01.11) and Sahu (06.01.11) in Fariduddin Dagar's music school in Palaspe, and the Gundecha brothers (16.01.11) in their music school in Bhopal. To compensate for the brief performance by the Gundecha brothers (less than 15 min of slow alap each), which limited opportunities to observe MIIOs, an alternative recording of the two brothers in duet, made by Clayton, Leante, and McGuiness at Aikatan Auditorium in Kolkata in February 2007, was utilized.
Data overview for vocalists Afzal Hussain, Lakhan Lal Sahu, Ramakant, and Umakant Gundecha. It includes performer name, raga name (melodic mode), total duration of recorded improvisation, tonic tuning, number of clear-cut MIIOs that were identified, annotated, and analyzed, range of individual MIIO durations, and finally the number, percentage (out of total unambiguous MIIO events), and fine types of the IwEO and IwRO classes (interactions with elastic objects and interactions with rigid objects).
Analysis
Code Development and Annotation
In the first stage of analysis, repeated manual effortful gestures that allude to MIIOs were first visually identified, labeled, and classified in terms of recurrent types of hand gestures (the “action-oriented ontology” (Leman & Godøy, 2009) of MIIOs). The identified movement events were manually segmented, resulting in an audio-visual database of MIIOs, and each of them was manually annotated in terms of (a) MIIO type (categorical descriptors), (b) perceived levels of effort on a scale between 0 and 10 (ordinal-numerical, with 10 being the highest) and (c) various melodic aspects (categorical descriptor).
The audio material (48 kHz sampling rate) was coded and segmented in Praat and then imported into the ANVIL annotation environment, where the video (29.97 fps frame rate) was also coded and segmented visually on a frame-by-frame basis (approximately 33.4 ms resolution). Unique labels (the “code book”) were defined in an XML script for ANVIL and in a TextGrid for Praat, and they were loaded prior to annotation in the corresponding software. The integrated ANVIL file contained separate audio and movement groups (as shown in Figure 1), each comprising multiple attribute coding tracks. The coding scheme evolved iteratively, informed by observations and prior interview analysis. Sensorial descriptors (adjectives, verbs, nouns) of motor-based metaphors or pictorial terms (Sanyal & Widdess, 2023) from transcribed interviews were organized into meaningful overarching themes, forming a table of performer-object interactions that guided the development of the coding scheme. The resulting annotation was cross-validated, as explained next.
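The relation between the video frame grid and the audio sample grid can be illustrated with a short sketch. It only restates the arithmetic implied by the reported rates (29.97 fps and 48 kHz); the function names are hypothetical and do not correspond to any Praat or ANVIL functionality.

```python
# Illustrative helpers (hypothetical names) relating video frames to audio samples.
VIDEO_FPS = 29.97   # video frame rate reported in the text
AUDIO_SR = 48_000   # audio sampling rate in Hz reported in the text


def frame_duration_ms(fps: float = VIDEO_FPS) -> float:
    """Duration of one video frame in milliseconds (the annotation resolution)."""
    return 1000.0 / fps


def frame_to_seconds(frame_index: int, fps: float = VIDEO_FPS) -> float:
    """Onset time, in seconds, of a zero-based video frame index."""
    return frame_index / fps


def frame_to_sample(frame_index: int, fps: float = VIDEO_FPS, sr: int = AUDIO_SR) -> int:
    """Nearest audio sample index for the onset of a video frame."""
    return round(frame_index / fps * sr)
```

Under these assumptions, one frame lasts about 33.37 ms, so a gesture boundary marked on the video cannot be placed more precisely than that, whereas the audio segmentation is limited only by the sample period.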

Annotation in the ANVIL Environment (example taken from annotation of Afzal Hussain's performance).
Code Validation (Inter-Coder Agreement)
Inter-coder validation of gesture classes (MIIO types) and perceived effort levels was conducted by comparing annotations from two professional dancers/choreographers with those of the main researcher (Guest et al., 2012). This stage was only applied as a proof of concept to a single improvisation, the one by Afzal Hussain, and only for one third of the samples (30) due to the large size of the corpus. Agreement was measured by Cohen's kappa value and tested against chance levels (Kipp, 2012).
Code Application
What followed was a pair-wise association analysis for the investigation of consistency in the co-occurrence of coded MIIO classes with certain melodic aspects and effort levels.
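As an illustration only, a pair-wise association analysis of this kind could be sketched as follows. The record fields (`gesture`, `melody`, `effort`), the function name, and the example values are hypothetical and are not taken from the annotated corpus; the study does not specify the tooling that was used for this step.

```python
from collections import defaultdict


def cross_tabulate(events):
    """Count gesture/melody co-occurrences and average the effort per pairing."""
    counts = defaultdict(int)
    effort_sums = defaultdict(float)
    for event in events:
        pair = (event["gesture"], event["melody"])
        counts[pair] += 1
        effort_sums[pair] += event["effort"]
    mean_effort = {pair: effort_sums[pair] / n for pair, n in counts.items()}
    return dict(counts), mean_effort


# Made-up annotation records, for illustration only (not data from the study).
toy_events = [
    {"gesture": "stretching", "melody": "double glide", "effort": 8},
    {"gesture": "stretching", "melody": "double glide", "effort": 6},
    {"gesture": "pulling", "melody": "double glide", "effort": 5},
    {"gesture": "grip", "melody": "steady single note", "effort": 1},
]
counts, mean_effort = cross_tabulate(toy_events)
```

The `counts` dictionary corresponds to the cells of a cross-tabulation of gesture classes against melodic movement classes, and `mean_effort` to the mean effort value per gesture–melody pairing.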
Analysis
Coding Scheme
The coding scheme for all performers comprised {gesture; melody; effort}, whereby gesture refers to MIIO classes and melody was coded by {melodic movement classes, octave range, melodic intention, pitch interval}. The exact coding for each of these components is explained below.
Coding of Gestures
MIIOs were identified based on closed-hand gestures suggesting the performer was grasping and manipulating an imaginary object under resistance, often accompanied by facial strain. Initial fine-grained categories were simplified through repeated observations of the video footage into two main types: interactions with elastic objects (IwEO), involving stretching or compressing of deformable objects, and interactions with rigid objects (IwRO), involving displacement of non-malleable objects (e.g., pushing, pulling, collecting, throwing). These were visually distinguished by their dynamic profile (constant vs. variable effort) and the presence or absence of recoil toward equilibrium. Ambiguous gestures were excluded. Pushing gestures were further split into “pushing-away” (rigid mass) and “pushing-to-compress” (elastic object). Following Rahaim (2012), and similar to Kendon (1967), “hold steady” gestures, which reflect static bimanual grips, were divided into the “grappolo grip” (light fingertip contact with diverging hands) and the “kite-flying grip” (clenched fists). Their function is analyzed either separately or jointly, depending on context.
Table 1 summarizes all relevant information about the data that was used for each of the performances and illustrates the correspondence between annotated fine and coarse gesture classes.
The analysis focuses only on clearly defined, commonly observed gesture classes across vocalists. Ambiguous gestures and idiosyncratic ones specific to individual performers were excluded. While this reduced the dataset, it improved ecological validity, facilitated annotation cross-validation and inter-coder agreement testing, and increased the reliability of the findings. The gesture coding scheme was specifically developed for each performer through iterative observations of the video material and is outlined in Table 2.
Overview of gesture coding schemes used per vocalist, including fine and coarse MIIO classes. The schemes are performer-specific. They were developed through iterative observation of the video material and informed by interviews.
Coding of the Melody
For each of the gestures, the associated melodic counterpart was also annotated and classified. Features, selected through repeated viewing, varied by performer, performance, and raga, and included recurrent melodic movement types—primarily raga-specific pitch glides—glide direction (ascent, descent or both in succession), pitch interval, octave range, and melodic context (probability of resolution toward the tonic immediately after the annotated phrase). Melodic aspects are described using the tone material specific to each raga.
Table 3 captures all types of melodic movement classes used in the coding scheme of each performer.
Overview of melodic movement coding schemes used per vocalist. The schemes are performer- and raga-specific. According to the unique case of each individual performance/performer, either the coarse melodic movements are used in the analysis or the individual cases that feature in some of the cells of this table.
“Steady single note” refers to the prolonged singing of a specific note, usually the tonic or the 5th degree, sometimes approached through an initial glide. Gamaks only featured as linked to MIIOs for Sahu. Specific to the Gundecha brothers, the category “weight & release” represents a melodic movement where the initial pitch is emphasized as a standalone note, succeeded by a (gentle) monotonically ascending or descending slide (the release) to another note. This glide differs from the typical pitch glides, where emphasis is on the target note, creating the impression of the first note being drawn towards the target during the ascent. Finally, only Umakant performed ascending glides that spanned over an octave or even more, symbolized here as “1 octave+”. Although they fall under the “straight ascent” single glides, they have been classified separately due to the extremely large span of pitches.
Additional Melodic Aspects
In addition, the melodic intention, the pitch interval and the octave range were also included in the coding for the voice.
An overview of additional annotated melodic aspects can be found in Table 4.
Overview of additional melodic movement coding schemes used per vocalist. The schemes are performer- and raga-specific. The decision for how to represent pitch intervals (in cents, number of semitones, or scale degrees) was taken based on individual requirements for representing the respective variations in intervals of each individual performance/performer.
Coding of Effort
Third-person effort annotations were conducted with audio purposefully kept on, in order to account for the complex and compound nature of effort.
Table 5 gives an overview of effort level histograms and mean effort values per vocalist. It is important to emphasize that these values are specific to each performer's effort range, observed during the duration of the entire alap improvisation. Hence, they cannot be compared to those of other vocalists.
Overview of mean effort levels and effort histograms per vocalist. Mean values refer to the annotated effort level range specific to each performer, precluding comparisons with those of other vocalists. Plotting effort histograms made it possible to identify deviations from a normal distribution. While the paper does not encompass detailed statistical results regarding gesture–sound associations, the departure from a normal distribution underscores a limitation in the application of statistical methods that must be acknowledged. For instance, calculating mean, median, and mode values would yield different results.
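The point about diverging measures of central tendency can be illustrated with a small sketch. The effort values below are invented to form a right-skewed distribution and are not taken from the annotations; the function name is hypothetical.

```python
import statistics


def central_tendencies(effort_levels):
    """Return mean, median, and mode of a list of annotated effort levels."""
    return {
        "mean": statistics.mean(effort_levels),
        "median": statistics.median(effort_levels),
        "mode": statistics.mode(effort_levels),
    }


# Invented, right-skewed effort annotations on the 0-10 scale (illustration only).
toy_effort = [2, 3, 3, 3, 4, 5, 8, 9]
summary = central_tendencies(toy_effort)
```

For a skewed distribution such as this one, the three values differ, so reporting only the mean would give an incomplete picture of a performer's typical effort level.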
Gesture and Effort Inter-Coder Agreement Test
The reliability of the main researcher's annotation for gesture classes and effort was first examined. The calculation of inter-coder agreement coefficients for categorical gesture classes and ordinal (numerical) effort levels was based on three coders, 30 cases, and thus 90 decisions. The cross-validation of manual annotations was computed based on an inter-coder agreement test by Krippendorff (2011), which gave the following agreement coefficient values:
- α = .81 on the ordinal values of the effort level
- α = .62–.91 on the categorical values of gesture classes
In the absence of a defined threshold for inter-coder agreement, the study has adopted typical values from the social sciences: α ≥ .8 for reliable data, α ≥ .667 for tentative conclusions, and even lower values for discarding unreliable data (Popping, 1988). The alpha value for effort level annotation surpasses the reliability threshold, indicating a high level of agreement. For the annotation of gesture classes, the alpha values lie between 0.62 and 0.91, but taking into account the fact that the identification of interaction types is carried out in the absence of a real object, these values can be considered sufficiently high for deeming the annotation reasonably reliable. Consequently, for the purposes of this work, the manual annotations by the main researcher were regarded as valid for the subsequent analysis stages.
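For readers unfamiliar with the coefficient, the following is a minimal sketch of Krippendorff's alpha computed from a coincidence matrix, with a nominal metric (suitable for gesture classes) and an ordinal metric (suitable for effort levels). It is not the software used in the study, which is not specified; the function name and the data layout (one list of coder values per annotated event, with `None` for a missing rating) are assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha(units, metric="nominal"):
    """Krippendorff's alpha for `units`, a list of per-event lists of coder values.

    `metric` is "nominal" or "ordinal"; for "ordinal", values must be sortable.
    """
    # Coincidence matrix: every ordered pair of values within one unit,
    # weighted by 1 / (number of values in that unit - 1).
    coincidences = Counter()
    for values in units:
        values = [v for v in values if v is not None]
        m = len(values)
        if m < 2:
            continue  # a unit rated by fewer than two coders carries no pairing
        for a, b in permutations(range(m), 2):
            coincidences[(values[a], values[b])] += 1.0 / (m - 1)

    # Marginal totals per category and overall number of pairable values.
    n_c = Counter()
    for (c, _k), weight in coincidences.items():
        n_c[c] += weight
    n = sum(n_c.values())
    categories = sorted(n_c)

    def delta_squared(c, k):
        if metric == "nominal":
            return 0.0 if c == k else 1.0
        # Ordinal metric: based on the marginal mass lying between the two ranks.
        lo, hi = sorted((c, k))
        between = sum(n_c[g] for g in categories if lo <= g <= hi)
        return (between - (n_c[c] + n_c[k]) / 2.0) ** 2

    observed = sum(w * delta_squared(c, k) for (c, k), w in coincidences.items())
    expected = sum(
        n_c[c] * n_c[k] * delta_squared(c, k) for c in categories for k in categories
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1.0 - observed / expected
```

With three coders and 30 events, `units` would hold 30 lists of three values each. The ordinal metric penalizes a disagreement between adjacent effort levels less than one between distant levels, which is why it is the more appropriate choice for the 0–10 effort scale.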
Results
The stacked bar chart of Figure 2a visualizes the frequency of co-appearance between multiple categorical movement and melodic variables. Figure 2b illustrates (mean) effort values for each of these movement-melody pairs of co-occurring categorical variables.

Afzal Hussain: (a) Stacked bar chart displaying associations between a number of melodic movement classes vs. gesture classes. (b) Heatmap displaying the association between gesture classes, melodic intention, melodic movement classes, and effort levels, with colors indicating the level of effort associated with each gesture–sound pairing, scaled using the min-max range of effort values and with gray representing missing data (NaN). (c) Stacked bar chart, displaying associations between gesture classes and the intention to rise to the steady tonic (boolean: yes/no), after each annotated gesture. (d) Scatterplot of effort levels vs. pitch interval (in cents).
Afzal Hussain
The following summarizes observed trends that occurred regularly in effort–gesture–sound associations for Afzal Hussain in raga Jaunpuri.
Gesture–Melody Associations
The cross-tabulation in Table 6 in the Appendix displays the number of co-occurrences between gesture classes and melodic movement classes for Afzal Hussain. As can also be seen in the stacked bar chart of Figure 2a, which is derived from this table, interactions with elastic objects (stretching and pushing-to-compress) are more likely to be linked with melodic activity in the upper part of the octave (such as …/b7\b6 and 5/2’\b7\b6), while interactions with rigid objects (pulling, collecting) can occur in both octave parts (with a higher effort in the upper part). Interactions with elastic objects are primarily used with double pitch glides in the upper part of the octave (Figure 2a), without an intention to ascend to the tonic, as can be deduced from Figure 2c. Interactions with rigid objects (pulling and collecting) are mostly associated with double pitch glides (Figure 2a) which, contrary to those accompanying elastic interactions, tend to lead to higher pitches (Figure 2c), either reaching the tonic in the upper octave or ascending into the upper part of the octave when the pitch glide occurs in the lower part of the octave. Collecting gestures are employed during monotonic ascending melodic movements towards the tonic without any particular emphasis in their rendering (Figure 2a). Both grip types (grappolo and kite-flying) serve to hold and prolong a note (Figure 2a). Throwing gestures appear to be linked to other, non-classified types of melodic movements.
Effort–Melody Associations
The effort levels that appear to be exerted by the vocalist are associated with:
the pitch range within each octave, indicating higher effort for melodic activity (double pitch glides) in the upper part of each octave, as illustrated in the heat map plotted in Figure 2b;
the melodic intention of ascending to or towards the tonic in what follows the annotated melodic movement, with lower effort observed when ascending, as in Figure 2c;
the size (interval) of the (ascending part of the) melodic movement, exhibiting higher effort levels for larger pitch glides, as depicted in Figure 2d;
the morphology of melodic glides, where the ascending vs. descending parts of a pitch glide are associated with an intensification vs. abatement in effort, respectively (conclusion derived through iterative observation of the audio-video material and supported by earlier interview-based research (Paschalidou, 2024)).
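Pitch intervals in cents, the unit used in Figure 2d, follow the standard logarithmic definition of 1200 cents per octave. The sketch below assumes that the start and end frequencies of a glide are available in Hz; the function name and the example frequencies are hypothetical and are not taken from the recordings.

```python
import math


def interval_in_cents(f_start_hz, f_end_hz):
    """Signed interval between two frequencies in cents (1200 per octave)."""
    if f_start_hz <= 0 or f_end_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(f_end_hz / f_start_hz)


# Example frequencies chosen for illustration only.
octave = interval_in_cents(220.0, 440.0)  # ascending octave (ratio 2:1)
fifth = interval_in_cents(220.0, 330.0)   # ascending just fifth (ratio 3:2)
print(round(octave), round(fifth))
```

Dividing the result by 100 gives the interval in equal-tempered semitones. A descending glide yields a negative value.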
Effort–Gesture Associations
Interactions with elastic objects (stretching or pushing-to-compress) tend to be more effortful in this performance than interactions with rigid objects (pulling gestures: medium levels, throwing: low, and the two types of grips: effortless), as illustrated in Figure 2b.
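The heatmap of Figure 2b scales its colors by the min-max range of effort values and shows missing data (NaN) in gray. A minimal sketch of such a scaling is given below, assuming a flat list of effort values with NaN marking empty cells. The function name and the toy values are hypothetical, and the plotting step is omitted.

```python
import math


def min_max_scale(values):
    """Rescale values to [0, 1] by their min-max range, keeping NaN as NaN."""
    present = [v for v in values if not math.isnan(v)]
    if not present:
        return list(values)
    lo, hi = min(present), max(present)
    span = hi - lo
    scaled = []
    for v in values:
        if math.isnan(v):
            scaled.append(v)      # missing cell, left for the gray color
        elif span == 0:
            scaled.append(0.0)    # all present values are identical
        else:
            scaled.append((v - lo) / span)
    return scaled


# Toy effort values, invented for illustration (not data from the study).
toy_effort = [2.0, 6.0, float("nan"), 10.0]
scaled = min_max_scale(toy_effort)
print(scaled)
```

With this scaling, colors are comparable only within one performer's heatmap, since the minimum and maximum are taken from that performer's own effort values.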
Effort–Gesture–Melody Associations
Interactions with elastic objects (stretching and pushing-to-compress) tend to be the most effortful and most frequently used gesture types. They occur exclusively when ascending into the upper part of each octave, particularly to the 7th degree through the …/b7\b6 pitch glides that are common in raga Jaunpuri, when no immediate rise to the tonic follows. Drawing a parallel with manipulating a deformable elastic object, the ascent to the 7th degree, which surpasses the goal pitch (the 6th), resembles overshooting a target through excessive stretching effort, while the pitch resolution to the steady 5th mirrors the subsequent recoil to rest, driven by an opposing retraction force. The microtonal fluctuation of the highest pitch reflects the performer's varying force, governed by the imagined object's stiffness and displacement. The stronger recoil force experienced with larger displacement may explain the failure of …/b7\b6 pitch glides to reach the higher tonic, aligning with Jaunpuri's descending character and the instability of the 7th degree, which evokes insufficient effort and a return of the hands to rest position. Stretching can thus be viewed as an active process of controlling time and pitch accentuation, akin to an acceleration or gradient of the pitch slope.
The 7th degree is approached with slightly lower (moderate to high) effort through pulling gestures involving rigid objects, when the intention is to subsequently ascend to the tonic, where tension is released. Even less effortful (moderate) interactions with rigid objects (pulling or collecting) are associated with melodic movements in the lower part of the octave, ascending over a five-semitone interval to the 2nd or 4th degree, followed by a rise towards the tonic or upper octave, momentarily releasing melodic tension. Collecting involves a straight movement toward the tonic without any particular emphasis in its execution. The kite-flying grip—using a closed fist—serves to firmly sustain a note without fluctuations and is typically employed at the performance's outset and when establishing the tonic. Likewise, the grappolo grip—using the fingers alone—functions to sustain a note with a straight airflow as the hands move apart, yielding a softer vocal quality compared to the kite-flying gesture.
Figure 3 captures the overall trends of effort in relation to gesture–sound associations observed in Afzal Hussain's performance:

Afzal Hussain: Overview of associations among octave pitch range, gesture classes, melodic intention, and effort. Colors illustrate effort levels (blue for low, red for high). Arrows depict melodic intention. Numbers correspond to scale degrees.
The annotated gesture types and effort levels added to the transcription of a short excerpt from Afzal Hussain's Jaunpuri alap performance in Figure 4 aim to illustrate these points.

Transcription of a brief segment of raga Jaunpuri performed by Afzal Hussain, illustrating gesture types, and effort levels. Red shading indicates interactions with elastic objects (E), green represents interactions with rigid objects (R), and blue denotes ambiguous cases (A). The accompanying numbers indicate the effort level, and the width of each shaded region corresponds to the duration of the MIIO.
In summary, the results indicate that bodily effort and gestures are systematically related to their melodic counterparts based on the following factors:
The melodic organization of the raga pitch space, specifically:
the melodic tension of particular degrees of the scale according to the specific raga, with higher effort required in approaching unstable (7th) rather than more stable notes (5th and 6th);
the melodic intention in ascending to or towards more stable notes, such as the tonic or the 5th in what follows the melodic movement, with higher effort for melodic movements failing to ascend and finally retracting to a lower, more stable note, such as the tonic or the 5th.
Mechanical aspects related to vocal production, specifically:
the pitch interval size of the (ascending part of the) melodic movement, with larger pitch glides requiring greater bodily effort;
the overall pitch range of the octave, with higher effort levels accompanying larger gestures and larger melodic movements that start from the lower part and ascend to a higher note in the upper part of the octave.
In conclusion, findings for Hussain in raga Jaunpuri indicate that MIIOs are not randomly paired to the melody, and that bodily effort is not uniform over the entire pitch range. Instead, they imply a systematic association that goes beyond the mechanical demands of vocalization (associated with absolute pitch height, such as the increased strain in the extreme pitches of the vocalist's comfortable pitch range). This association additionally aligns with raga-specific aspects, such as distinct areas of melodic activity, melodic treatment of individual notes (morphology and pitch interval of characteristic melodic movements, usually glides) as well as melodic context, i.e., the intention to move towards stable notes. This reveals that effort not only reflects the mechanical requirements of vocalization, but is also tied to the rules and melodic organization of the raga, confirming findings from interview testimony (Paschalidou, 2024).
Lakhan Lal Sahu
The following summarizes observed trends that occurred regularly in effort–gesture–sound associations for Lakhan Lal Sahu in raga Malkauns, considering only the clear-cut cases (unambiguous gesture class annotations by the main annotator).
Gesture–Melody Associations
As with Hussain, the cross-tabulation of Table 7 in the Appendix displays the number of co-occurrences between gesture classes and melodic movement classes for Lakhan Lal Sahu. Many melodic phrases consist of chained double pitch glides (a/b\c, with b higher than a and c), likely paired with similarly chained bi-directional gestures. However, ambiguously classified gestures caused by coarticulation (Godøy et al., 2016)—such as seamless transitions between stretching and pulling—were excluded to maintain analytical clarity. From the remaining non-ambiguous gestures, the following trends can be deduced based on Figure 5a and attentive video observations:

Lakhan Lal Sahu: (a) Stacked bar chart displaying associations between a number of melodic movement classes vs. gesture classes. (b) Heatmap displaying the association between gesture classes, melodic intention, melodic movement classes, and effort levels, with colors indicating the level of effort associated with each gesture–sound pairing, scaled using the min-max range of effort values and with gray representing missing data (NaN).
The small number of isolated (not chained in a group) stretching gestures are more likely associated with double pitch glides. Gamaks are clearly associated with compressing an elastic object between the hands, with the first part—that of object compression—being performed with pitch descent and the second part—that of object release—aligning with pitch ascent. Single (monotonic) ascents are rare and are exclusively performed with pulling gestures. Single (monotonic) descents are performed equally with either pulling or collecting gestures. Steady notes are mostly associated with pulling and collecting gestures, but a few cases of pushing-away and stretching gestures can also be identified.
Overall, the results indicate a clear and consistent association between melodic phrases and coded gesture classes across the entire improvisation performance, rooted in analogous cross-domain morphologies. Notably, melodic movements with two slopes (double pitch glides and gamaks) align mostly with gestures featuring two opposing phases (stretching, pushing-to-compress), such as intensification vs. abatement or a change of direction in physical space (away vs. closer). Similarly, monotonic melodic movements are linked to interactions with rigid objects (comprising only one phase in a single direction).
Effort–Melody Associations
The analysis revealed a notable association between gamaks and higher levels of effort. However, other effort–melody associations (double pitch glides and straight ascending pitch glides with medium levels of effort; descending pitch glides and steady notes with less effortful gestures) appeared less consistent.
Nevertheless, the scatter plots of Figure 6 display a positive dependence of effort levels on four acoustic features (extracted from the audio recordings in Praat), namely:
elapsed time of the alap performance;
mean frequency of each melodic movement;
maximum frequency (and thus also pitch interval) of each melodic movement;
pitch interval of a melodic glide.

Lakhan Lal Sahu: Scatterplots of effort levels vs. (a) pitch interval (in semitones), (b) elapsed time (event start time) per gesture class, (c) mean pitch (logarithmic scale) per gesture class, (d) maximum pitch (logarithmic scale) per gesture class.
All elements are closely interconnected, mirroring the typical ascent of melody towards its climax in a Dhrupad alap performance.
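A positive dependence of this kind can be quantified with a rank correlation. The source does not state which statistic, if any, was computed, so the sketch below swaps in Spearman's rank correlation as an assumption, on the premise that annotated effort levels behave as ordinal ratings. All function names and the toy values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values, invented for illustration (not data from the study).
toy_time_s = [10, 50, 120, 200, 310]   # event start times in seconds
toy_effort = [2, 3, 3, 6, 8]           # annotated effort levels
rho = spearman(toy_time_s, toy_effort)
print(round(rho, 3))
```

The same function could be applied to effort against mean pitch, maximum pitch, or pitch interval. Because these features rise together over the course of an alap, a high coefficient for one of them does not isolate its individual contribution.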
Effort–Gesture Associations
As illustrated in Figure 6b–d, interactions with elastic objects (red), such as in stretching or compressing a malleable object, tend to be more effortful than those with rigid objects (blue), as in moving a solid item in space. Indeed, stretching and pushing-to-compress gestures (both with elastic objects) display the highest effort values among all gesture types in Figure 5b.
Effort–Gesture–Melody Associations
The findings of the analysis point to some level of consistency in the way effort, gesture, and sound are associated with each other for the performance of Lakhan Lal Sahu.
Gesture–sound associations appear to rely predominantly on cross-modal morphological analogies, in particular the asymmetry between increase vs. decrease, with e.g., intensification vs. abatement in effort associated with ascent vs. descent in pitch as well as approach vs. retraction of the hands, respectively.
Gamaks, followed by double pitch glides (albeit to a lesser degree), are the most effort-demanding melodic movements, both tied to the notion of elasticity: gamaks to the compression (with the hands moving away from the body) and double pitch glides to the expansion (with the hands moving closer to the body) of an elastic object. This confirms interview testimony by Lakhan Lal Sahu, according to whom a meend feels like “pulling a rubber band” and a gamak feels like “applying pressure” (interview, Palaspe, India, January 6, 2011). The infrequent monotonic ascents are executed with moderate effort through pulling gestures, followed by even less demanding monotonic descents through either pulling or collecting gestures, which suggests interactions with rigid objects. Steady notes are also mostly performed with pulling and collecting gestures of even lower effort levels.
Effort seems to be independent of the specific raga. Instead, it appears to be linked with the macro-organization of the improvisation; as the performance progresses, increased effort is required, particularly for larger melodic pitch glides that ascend to higher maximum pitches. It remains, however, unclear whether effort is linked to the mechanics of voice production in generating progressively more demanding, higher pitches or if it merely functions as a time indicator associated with the progressive intensification of the alap development.
In conclusion, neither effort levels nor MIIO classes are combined arbitrarily with their melodic counterparts; instead, the findings suggest consistent embodied strategies. However, unlike in the case of Hussain, Sahu's bodily activity during MIIOs does not seem to be associated with the conceptualization of the raga as a pitch space with regions of particular interest and potential activity. Instead, the observed effort-related gesture–melody correspondences appear to be grounded in the mechanics of voice production and the macro-level structure of the alap improvisation.
Gundecha Brothers
The following summarizes the results of the analysis on consistently observed effort–gesture–sound associations for the Gundecha brothers in raga Bhupali. As the brothers form a special case, having identical musical backgrounds and singing as a duet in the same performance, and hence in the same raga, it is also compelling to draw links between the two. Therefore, findings are presented together for each type of association discussed.
Gesture–Melody Associations
Umakant Gundecha
Umakant Gundecha's gesture–sound associations are less clear-cut than those of other performers. Still, a discernible trend shows IwEO mostly linked to double pitch glides and IwRO with monotonic melodic glides and prolonged individual notes. As shown in Figure 7a and the cross-tabulation of melodic movement versus gesture classes in Table 8 of the Appendix, the opposition to a varying force linked to the concept of elasticity (as in stretching or pushing-to-compress) appears to be again associated with double-sloped melodic movements (double pitch glides and weight & release phrases), pointing to cross-modal morphological analogies. However, these trends are less clear-cut compared to those observed with the other vocalists analyzed in this study, with large ascending pitch glides (spanning an octave or more), straight ascents, and even sustained, unmodulated notes also associated with the notion of elasticity. The opposition to a constant force (pulling, toward one's body, or pushing-away) is associated with either straight ascents or the prolonging of a single note, with a few straight descents also appearing. Collecting gestures and holding gestures are mostly paired with ascending melodic movements and prolonged single notes respectively. A few throwing gestures are also linked to single notes (not connected through glides).

Umakant Gundecha: (a) Stacked bar chart displaying associations between a number of melodic movement classes vs. gesture classes. (b) Heatmap displaying the association between gesture classes, melodic intention, melodic movement classes, and effort levels, with colors indicating the level of effort associated with each gesture–sound pairing, scaled using the min-max range of effort values and with gray representing missing data (NaN). (c) Scatterplot of effort levels vs. elapsed time per gesture class.
Ramakant Gundecha
Similarly, Figure 8a illustrates gesture class–melodic phrase associations for Ramakant Gundecha, which are also presented in the cross-tabulation of Table 9 in the Appendix. Ramakant most frequently employs stretching gestures with double-sloped melodic movements (double pitch glides and weight & release melodic phrases), with straight ascents being only scarce. No pushing-to-compress gestures were observed. As with Umakant, pulling gestures by Ramakant appear paired with monotonic ascending pitch glides, but also with descending glides and the holding of prolonged notes. Collecting gestures are used exclusively with descending melodic glides. Finally, the steady-hold gestures are employed when singing a steady note, yet this is also connected to the previously mentioned concept of pulling.

Ramakant Gundecha: (a) Stacked bar chart displaying associations between a number of melodic movement classes vs. gesture classes. (b) Heatmap displaying the association between gesture classes, melodic intention, melodic movement classes, and effort levels, with colors indicating the level of effort associated with each gesture–sound pairing, scaled using the min-max range of effort values and with gray representing missing data (NaN).
Effort–Melody Associations
Umakant Gundecha
Based on Figure 7b and attentive visual observation, double pitch glides are linked with the most effortful gestures, followed by straight ascending glides and the maintaining of a steady note. Contrary to expectations, “weight & release” phrases and large ascending pitch glides spanning over an octave were not performed with high levels of bodily effort. As anticipated, straight descents were associated with less effort. Furthermore, effort levels show a positive linear correlation with elapsed time, more evident in interactions with elastic than with rigid objects, as can be observed in Figure 7c. This trend indicates a continuous rise in effort, aligning with the progressive expansion of the alap improvisation and the gradual build-up of tension with the progressive increase of both pace and pitch. While this might suggest a connection with the mechanical strain of voice production, a rise in effort for both extremely high and extremely low pitches would then also be anticipated, which is not observed.
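The comparison of time trends between elastic and rigid interactions can be illustrated by fitting a separate least-squares line to each group. This is a sketch under stated assumptions rather than the analysis behind Figure 7c: the event format, the group labels, the function names, and the toy values are all hypothetical.

```python
from collections import defaultdict


def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def slopes_by_group(events):
    """Fit one slope per group from (group, elapsed_time, effort) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for group, t, effort in events:
        grouped[group][0].append(t)
        grouped[group][1].append(effort)
    return {g: least_squares_slope(ts, es) for g, (ts, es) in grouped.items()}


# Toy events, invented for illustration (not data from the study):
# (object type, elapsed time in seconds, annotated effort level).
toy_events = [
    ("elastic", 0.0, 2.0), ("elastic", 100.0, 4.0), ("elastic", 200.0, 6.0),
    ("rigid", 0.0, 2.0), ("rigid", 100.0, 2.5), ("rigid", 200.0, 3.0),
]

slopes = slopes_by_group(toy_events)
print(slopes)
```

A steeper slope for one group than for the other would correspond to a trend that is "more evident" in that group, although with few events per group such a comparison remains descriptive.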
Ramakant Gundecha
According to Figure 8b, with respect to their melodic counterparts, double pitch glides and (the ascending part of) weight & release phrases are linked to the most effortful gestures. Conversely, straight descents and the maintaining of a steady note are associated with the least effortful gestures. These associations may signify the mechanical requirements of voice production or images of instrumental gestures. Playing the rudra vina
Effort–Gesture Associations
Umakant Gundecha
The most effort-demanding gestures—in stretching—are the ones that involve simulating the manipulation of an elastic object by counteracting a variable force. Following closely are the pulling and pushing-away gestures, which require less effort, involving a consistent force that acts against the execution of the movement. Throwing and collecting gestures, as well as the steady-hold gesture involve relatively low levels of effort.
Ramakant Gundecha
Ramakant's most effortful gestures are stretching and weight & release, followed by pulling gestures. Collecting and holding gestures register on the lower end of the effort level scale.
Effort–Gesture–Melody Associations
Umakant Gundecha
In general, imitations of stretching an elastic object tend to be the most effortful gestures. Stretching gestures align with double-sloped melodic movements (double pitch glides and weight & release phrase types), but also with straight ascending glides or large ascending pitch glides spanning an octave or more, and even the singing of a single note. Following closely in effort levels are pulling and pushing-away gestures, where a steady force is opposed. They are mostly associated with straight ascents and single notes, but a few straight descents have also been noted. On the lower end of the effort scale are throwing, collecting, and steady-holding gestures, commonly linked with melodic descents and single notes.
Ramakant Gundecha
The most effortful gestures involve stretching, particularly associated with double-sloped melodic movements (double pitch glides and weight & release phrases), followed by pulling gestures, mostly associated with straight melodic ascents and single notes, while collecting and steady-holding gestures are situated at the lower end of the effort level scale.
Umakant vs. Ramakant Gundecha
The analysis of annotated material for the Gundecha brothers reveals that MIIOs are not performed arbitrarily, but are systematically associated with the melody. Unlike in Hussain's case, there is no evidence of an association with the raga-specific pitch space organization. Each brother maintains a rather consistent gestural repertoire throughout the improvisation, yet, despite their shared background, the two brothers showcase both common and distinct gestural traits in performing raga Bhupali. Notably, stretching is the most effortful gesture, a notion shared by both and, in fact, by all performers in this study. It aligns with double pitch glides and weight-and-release phrases, reflecting a key trend of cross-modal morphological analogy where double-sloped pitch glides (ascent and descent, or vice versa) are typically linked to dual-phase stretching gestures (intensification and abatement).
Interestingly, the specific melodic phrases accompanying these gestures vary between the two brothers. In Umakant's case, some deviations from this cross-modal pattern occur, with dual-directional stretching gestures aligning at times with single-sloped melodic movements, likely due to spatial constraints in extending the hands away from the body during large, slowly ascending pitch glides that necessitate a return. Also, Umakant progressively intensifies his movements as the improvisation progresses, unlike Ramakant. As another example, although both singers devote time to presenting and establishing the tonic through various ascending glides, their gestures differ in form (e.g., spatial trajectories), direction, grip type, and the musician's active or passive attitude to apparent resistance. These observations highlight the performers' gestural idiosyncrasy, encompassing both unconscious personal style and conscious, deliberate interpretive choices.
Hence, despite their gestural idiosyncrasies, the brothers share underlying similarities in the way they interact with objects and in matching effort levels to their melodic counterparts, likely reflecting shared musical foundations.
Discussion
This study examined the intricate relationship between effort, gesture, and sound in Dhrupad vocal improvisation, with a particular focus on MIIOs. Here, we revisit the objectives and findings of the study and briefly address its limitations.
Identification of Specific Effort-Related Gesture–Melody Associations in MIIOs
The analysis presented compelling evidence of consistent cross-modal relationship trends between gesture classes, melodic aspects, and perceived bodily effort, underscoring the embodied nature of musical expressiveness through MIIOs. The findings signal a clear link between bodily effort and musical or vocal tension. This association is particularly evident in relation to raga-specific pitch areas—for example moving toward unstable scale degrees in contrast to resolving onto stable notes—or in navigating specific tonal regions within the general pitch space, such as an octave. It further manifests in the gradual build-up toward the melodic climax over the course of the improvisation, as well as in both the magnitude (pitch interval size) and direction (ascending versus descending) of pitch movement.
Across all performers, interactions involving elastic objects—such as stretching or compressing—consistently register as the most effortful, exceeding those involving rigid objects. These gestures reflect a shared embodied interpretation of dynamic melodic movements that unfold in two opposing directional phases, with effort peaking during the stretching phase of double pitch glides and the compression phase of gamaks. This dual-phase structure resonates morphologically with the intensification-abatement pattern of these gestures, indicating a perceptual and kinesthetic alignment between the temporal shape of melodic movement and gestural effort contours. All performers demonstrate these cross-modal morphological analogies, particularly evident in the directional asymmetry between increase and decrease: for example, double-sloped melodic movements (e.g., double-sloped pitch glides or gamaks) align with bi-directional gestures (e.g., stretching or compressing of elastic objects), and monotonic movements correspond to single-phase interactions (with rigid objects). This directional asymmetry—manifesting as intensification versus relaxation in effort, expansion versus contraction in (e.g., a stretching) gesture, pitch ascent versus descent or melodic tension versus release in melody—suggests a shared energetic structure between effort, gesture, and melody. This structure appears to arise from the interplay between the physical properties of imagined objects and the energy input of the force agent, supporting the notion of embodied negotiation with imagined forces.
Such patterns of directional asymmetry are shaped by the energetic profiles of archetypal interactions with the imagined objects, conceptualized as sequential event units—or “motion bells” (Camurri et al., 2003)—that peak at moments of maximal effort, coinciding with gestural emphases (e.g., full hand extension in a stretch) and heightened melodic or vocal tension. This phenomenon resonates with Daniel Stern's concept of vitality dynamics (Stern, 2010)—patterns of peaks and decays or impulses and rebounds—arising from our inherent ability to navigate between approaches and withdrawals. Similarly, it aligns with the conception of melody as curves of “significant energetic shaping through time” (Hatten, 2004), characterized by phases of intensification and relaxation that arise from internal patterns of energetic tension and release and result in a “play of psychological tensions” (Kurth, 1922), and finally with the conceptualization of musical experience as a dynamic ebb and flow of tension that evokes affective responses (Lerdahl & Krumhansl, 2007; Vines et al., 2004). Collectively, these perspectives support the view that MIIOs operate not merely as expressive artifacts but as structured interactions that share a common energetic structure with the melody.
Performer-specific trends are further revealed through the analysis. For Hussain, beyond the general patterns discussed earlier, bodily activity during MIIOs—whether involving opposition to elastic (IwEO) or rigid (IwRO) objects—shows pronounced and distinct links to raga-specific zones of heightened melodic interest and tension, alongside raga-independent features of the general pitch space, such as direction and magnitude. In the raga-specific domain, bodily effort appears linked to the approach of unstable scale degrees versus melodic resolution to stable notes. In the more general pitch space, it appears linked to octave pitch range (upper/lower) and the size of ascending pitch intervals. These associations suggest a pitch-sensitive interaction between gesture and sound, in which bodily effort is finely attuned both to the hierarchical structure of the raga and to a broader, nuanced embodied awareness of pitch navigation. The findings suggest that Hussain not only embodies the tonal framework and expressive grammar of the raga, but also responds to structural functions of pitch space through effortful gestural engagement.
Consequently, with respect to the raga-specific associations identified above and as illustrated in Figures 3 and 4, a raga does not appear as a homogeneous pitch space of bodily activation, but rather as a differentiated landscape of pitch-effort regions. This insight, particularly with respect to bodily effort, offers a novel contribution not previously explored with such specificity through video observation analysis. Hindustani musicians often refer to a raga as a melodic pitch “space” (Fatone et al., 2011) or “jagah” (Neuman, 2004, 2012) within which melodic action may unfold as a trajectory in a continuum between scale steps (Battey, 2004); however, this space is not uniform. Rather, it is “colored” at different points, reflecting the grammatical rules of the melodic mode (the raga's “topography”; Rahaim, 2012) and the aesthetic preferences of the musical lineage, implying that not all melodic paths are permissible, nor are all points reachable in the same manner. The findings for Hussain further substantiate this concept by demonstrating how bodily effort and gesture-class distribution vary across scale degrees, revealing a structured, context-dependent interaction between the physical and melodic elements of the raga.
Unlike in Hussain's case, Sahu's exertion of effort during MIIOs does not appear to reflect a hierarchical conceptualization of the raga's pitch space but rather to align with two distinct dimensions: the large-scale progression of the alap improvisation (a gradual intensification over time, in tandem with the progressive ascent toward its climax) and the local articulation of individual melodic movements (marked by brief surges of exertion shaped by cross-modal morphological analogies, particularly for ascending pitch intervals). Among these, morphological analogies emerge as the most salient factor. These findings suggest that Sahu's bodily effort and observed cross-modal correspondences are coupled with the fine-grained melodic phrasing, with the large-scale structural progression of the improvisation, and with the physical demands of voice production. However, it is uncertain whether the latter reflects the mechanical effort required in producing higher pitches or the gradual intensification typical of improvisational development over time.
For the Gundecha brothers too, unlike Hussain, no strong links emerged between gesture or effort and the tonal hierarchy or pitch space organization of the raga, suggesting that in their approach, gestural-melodic associations are driven more by phrase shape and movement dynamics than by pitch-centric semantic organization. For Umakant, bodily effort and cross-modal correspondences appear to be coupled both with local cross-modal morphological analogies—reflecting the shape and phrasing of individual melodic movements, as is also the case for Ramakant—and with the gradual intensification across the performance—corresponding to the large-scale structural progression of the improvisation toward its climax. This time-dependent progressive intensification in effort was not observed in Ramakant's performance, suggesting different strategies of expressive embodiment or alternative prioritization of vocal or gestural cues. Broadly, even when performers engage in similar melodic functions during their improvisations, the associated gestures can vary in form, spatial directionality, grip configuration, expressive nuance of gestures linked to similar melodic structures, and in the degree of active or passive bodily engagement in response to perceived resistance. This points to a layered embodiment of musical material, where common morphodynamic patterns coexist with idiosyncratic and situational displays.
Findings About the Extent of Cross-Performer/Performance Consistency and Agreement vs. Idiosyncrasy
While there is a certain degree of cross-modal consistency among performers, a notable level of flexibility in how each singer employs hand gestures to illustrate these relationships is also observed, reflecting the idiosyncratic character and the gestural habits of the individual, and possibly also the peculiar characteristics of the raga. For instance, while interactions with elastic objects (stretching out or pushing-compressing) tend to require higher levels of effort than those with rigid objects (moving a heavy object in space) across all performers, the exact types of MIIOs and melodic phrases are not always consistent among different performers and cannot be easily generalized. The analysis of the Gundecha brothers is particularly noteworthy, revealing that while MIIOs are systematically connected to melody and effort for each performer, the two brothers display both common and unique gestural styles in the same raga, despite their shared background. For instance, stretching is the most effortful MIIO for both brothers and tends to be associated with double pitch glides. However, Umakant—unlike Ramakant—intensifies his movements as the improvisation progresses. Additionally, while both focus on establishing the tonic through ascending glides, their gestures differ in form and execution, reflecting personal idiosyncrasies. Despite these differences, both share underlying cross-modal structures in their gestures, particularly in matching effort levels to melodic features. In sum, the findings affirm that cross-modal associations in MIIOs are both structurally grounded and flexible, modulated by personal gestural styles and expressive strategies, as well as situational factors.
The findings on MIIOs align with the account of the “paramparic body” in Hindustani vocal practice (Rahaim, 2012), which frames singing-related movement as shaped by both unconscious inheritance and personal volition. On the one hand, frequent gestural mirroring between students and teachers (Paschalidou, 2024) suggests that movement–sound associations are transmitted through oral pedagogy. On the other hand, observable divergences in individual gesturing styles—such as the distinct MIIOs employed by the Gundecha brothers within the same raga, as highlighted in this study—indicate idiosyncratic bodily tendencies shaped by personal gestural habits and expressive preferences. These observations support the view that gesture–voice relationships are grounded both in the performer's rudimentary ecological knowledge of the imagined object affordances (Gibson, 1979) and in enculturated motor patterns developed through prolonged training, involving direct demonstration, emulation, and repetition. They may reflect cognitive structures developed for various reasons: recurrent, slightly varied sensorimotor experiences with real objects (Varela et al., 1993), the tacit transmission of bodily dispositions through visual engagement with the teacher (Rahaim, 2012), and the iterative reinforcement of “correct” gesture–sound associations under a teacher's guidance (Rodger et al., 2007).
Findings About Whether Effort Is Mostly Linked to Mechanics of Vocalization or to Melodic Aspects
The analysis revealed that effort in MIIOs cannot be attributed to biomechanical aspects alone; it also involves conceptual considerations. Biomechanical aspects refer to pitch-related physical demands of vocal production—observed in all performers, though less so with Ramakant Gundecha—such as the interval of ascending pitch glides (most prominent for Hussain and Sahu), the lowest and highest pitches within a melodic movement (examined only for Hussain and Sahu), and the average pitch height (for Sahu). Conceptual considerations are reflected in the structural rules of the raga regarding pitch space organization (Hussain), the macro-structure of alap improvisation (Sahu and Umakant Gundecha), the melodic context or intention and anticipation of the performer in ascending to or towards a stable note, such as the tonic or the fifth (all performers apart from Ramakant Gundecha), and in analogous cross-modal morphologies that express the fundamental concept of directional asymmetry between ascent and descent, such as withdrawal versus approach in hand gestures, ascent versus descent in melody, and intensification versus abatement in effort—which applies across performers.
Limitations and Future Directions
A key achievement of this study is its high ecological validity in relying on real performances. Yet the limited and variable occurrence of MIIO gestures in such contexts resulted in a small dataset, limiting the reliability of intra- and inter-performer consistency assessments and restricting broad generalizations about the entire Dagar style of Dhrupad. To sustain this ethnographic emphasis over controlled experiments, future research should expand the dataset to include multiple performances of the same raga per performer, enabling more robust intra- and inter-performer comparisons, and incorporate a second-person perspective on gestural intent and embodied strategies (Leman & Godøy, 2009) through performers’ self-annotations. In addition, the perceptual basis of effort attribution warrants systematic investigation: dedicated experiments should disentangle the relative contribution of visual and auditory cues in observers’ assessments of effort. Finally, integrating biosensing technologies—such as electromyography (EMG) or force sensors—could provide physiological indices of effort exertion, offering complementary insights into the embodied mechanisms underpinning MIIOs (Dahl, 2011; Gibet, 2010).
Conclusions and Future Considerations
The current paper emphasized the role of bodily effort in musical expressiveness and aimed to understand the functionality of manual interactions with imaginary objects (MIIOs) in Hindustani Dhrupad singing. Specifically, it sought to assess perceived interaction effort levels and force-related concepts, explore their intrinsic relationships with MIIO types and their melodic counterparts, and determine whether such relationships, where they exist, are unique to individual performers or shared across performers, and whether they reflect predominantly mechanical or mental aspects. Utilizing video observation analysis in four case studies across three ragas, the study employed a third-person perspective for identifying and classifying MIIOs, leading to an action-oriented MIIO ontology—broadly distinguishing between interaction possibilities with malleable versus rigid objects—and a basic typology of prominent melodic movements for each performance.
According to the findings, effort-related gesture features used by Hindustani vocalists during MIIOs—though enacted with objects that are only imagined—are neither symbolic nor incidental. Instead, they reflect systematic embodied strategies of structured interactions: they are defined by a combination of musical training, idiosyncratic elements of expressivity, and ecological knowledge. Despite the limited generalizability of the findings due to the small dataset, the analysis revealed recurrent effort-related cross-modal associations linking gestures and melodic qualities in Dhrupad vocal improvisation—particularly involving opposing forces and variations in exerted effort across pitch ranges. However, the way these associations manifest in performers’ hand movements appears more individualized, less evident, and potentially less consistent. Notably, MIIOs seem to serve a dual role: they may mechanically support vocal production or mentally emphasize particular pitch regions of activity within the raga space or the melodic structure of the improvisation, thus serving the potential requirements of melodic expression.
Hence, it could be argued that musicians’ ability to imagine musical sound is enhanced by retrieving motor programs and image schemata from familiar interactions with real objects, which may explain why imaginary objects are so spontaneously utilized. The imagined objects can be conceptualized as carriers of patterns and potential behaviors (Camurri et al., 2001), governed by their imagined properties—such as size, shape, or material—that imply specific sonic outcomes. Our findings highlight the significance of prior sensorimotor experience in action-based knowledge or “know-how” and the vital role of bodily effort, central to enactive theories and ecological psychology (Noë, 2006; O’Regan & Noë, 2001; Varela et al., 1993). In line with previous work (Godøy, 2006; Krueger, 2014; Menin & Schiavio, 2012; Tanaka et al., 2012; Warren & Verbrugge, 1984; Zbikowski, 2002), movement–sound associations reflect not just mechanical links but profound and ubiquitous cognitive schemata shaped by recurrent patterns of interactions with materials of the real world. These associations are integral to musical intention and expression, with gestures driven by acoustic goal-points of melodic expression, like peaks or accents (Dahl, 2006; Godøy et al., 2012). In MIIOs, such goal-points fuse physical (“pragmatic-like,” reflecting the imitation of real-object interactions), acoustic, and imagined elements: spatial targets or object states (position, size, condition), salient pitch features, and quasi-perceptual multimodal sensory imagery (kinesthetic, tactile, and proprioceptive).
Effort emerges as a key conduit in dynamically coordinating movement and sound and encoding raga-specific melodic features and intentions. Findings indicate that hand movements and the voice in MIIOs are intricately connected in our cognition with the interactions and effort possibilities that the implicated objects can afford. In the absence of actual physical resistance, singers appear to employ imagined objects to invoke proprioceptive sensations of resistance, which serve to materialize and shape melodic movement—particularly smooth pitch glides. The resulting gesture–sound pairings, while lacking physical referents, are temporally synchronized and perceptually salient in terms of their dynamics, indicating that imagined material properties mediate expressive coordination. The study underscores how imagined materiality and perceived effort exertion are integral to the articulation of melodic movements.
Despite its limitations, the paper advances our understanding of how musical meaning is physically enacted beyond the auditory domain, particularly within oral music traditions. In line with Luciani (2007), its findings suggest that effort is inherently tied to artistic expressivity in gesture–sound associations—a perspective potentially applicable across music genres and performance traditions beyond Hindustani music. By foregrounding effort as a compelling cross-modal—or even amodal—dimension of musical expressivity that integrates physical movement, mental impulse, and musical intentionality, this paper proposes a significant shift and a novel analytical perspective in gesture–sound research: one that shifts focus from geometrical or topological gestural features (e.g., shape or trajectory) to effort as a force-related, temporally evolving property through which performers organize.
Acknowledgments
The authors extend their heartfelt gratitude to all participating musicians for their contributions.
Action Editor
Youn Kim, The University of Hong Kong, Department of Music
Peer Review
Two anonymous reviewers
Author Contributions
SP conceived the study, researched the literature, recruited participants, carried out part of the data collection (videos of Afzal Hussain and Lakhan Lal Sahu), conducted the data analysis, and wrote the manuscript.
MK carried out part of the data collection, namely the recording of the video featuring the Gundecha brothers.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical Approval
All procedures involving research participants were approved prior to fieldwork by Durham University for the recordings of Afzal Hussain and Lakhan Lal Sahu in 2010, and by the Open University for the recording of the Gundecha brothers in 2007. Written informed consent was obtained from all participants involved in the study.
Funding
Financial support for the recording of the Gundecha brothers was received by the second author from the Arts and Humanities Research Council (AHRC), grant reference MRG-AN6186/APN19244. The authors received no financial support for the remainder of the research, for the authorship, and/or publication of this article.
Data Availability Statement
The datasets (videos) analyzed during the current study are available in the Durham University repository (Paschalidou, 2011a; Paschalidou, 2011b; Clayton et al., 2007). Hyperlinks to the original recorded video material are available and included in
. These videos are not part of the current publication. Any reuse or redistribution requires the consent of the authors.
Appendix
Ramakant Gundecha: Cross-tabulation displaying the number of co-occurrences between gesture classes and melodic movement classes.
| Ramakant Gundecha melodic movement class/ gesture class | melodic movement class | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| double pitch glide | straight ascent | straight descent | 1 octave+ pitch glide | steady single note | cadence | gamak | weight & release | total per gesture class | |
| stretching | 11 | 1 | 4 | 16 | |||||
| pushing-to-compress | |||||||||
| pushing-away | |||||||||
| pulling | 2 | 2 | 3 | 7 | |||||
| throwing | |||||||||
| collecting | 2 | 2 | |||||||
| grappolo (hold steady) | 8 | 8 | |||||||
| kite-flying (hold steady) | |||||||||
| total per melodic movement class | 11 | 3 | 4 | 11 | 4 | 33 | |||
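
As an illustration of how a cross-tabulation of this kind can be compiled, the following minimal Python sketch counts co-occurrences between gesture classes and melodic movement classes and derives the marginal totals. The function name `cross_tabulate` and the four sample annotations are hypothetical placeholders chosen for illustration; they are not the study's data, nor a description of the tooling actually used by the authors.

```python
from collections import Counter


def cross_tabulate(events):
    """Count (gesture_class, melodic_class) co-occurrences and marginal totals.

    `events` is an iterable of (gesture_class, melodic_class) pairs,
    one pair per annotated gesture event.
    """
    cells = Counter(events)      # count per (gesture, melodic) cell
    row_totals = Counter()       # total per gesture class
    col_totals = Counter()       # total per melodic movement class
    for (gesture, melodic), n in cells.items():
        row_totals[gesture] += n
        col_totals[melodic] += n
    return cells, row_totals, col_totals


# Hypothetical annotations for illustration only (NOT the study's data).
events = [
    ("stretching", "double pitch glide"),
    ("stretching", "double pitch glide"),
    ("pulling", "straight ascent"),
    ("grappolo (hold steady)", "steady single note"),
]

cells, row_totals, col_totals = cross_tabulate(events)
grand_total = sum(cells.values())

# Print one line per non-empty cell, then the marginal totals.
for (gesture, melodic), n in sorted(cells.items()):
    print(f"{gesture} x {melodic}: {n}")
print("per gesture class:", dict(row_totals))
print("per melodic movement class:", dict(col_totals))
print("grand total:", grand_total)
```

In a table built this way, the totals per gesture class and the totals per melodic movement class each sum to the same grand total, which offers a quick consistency check on the annotations.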
