Abstract
We contrast two related hypotheses of the evolution of dance:
Introduction
Dance may have been common to human societies throughout history (Hauser & McDermott, 2003; Laland, Wilkins, & Clayton, 2016; Richter & Ostovar, 2016). Although a common and seemingly unproblematic behavior, dancing is a complex act, involving correspondence between auditory input and motor output, and coordination of movements with those of partners (Fitch, 2016; Laland et al., 2016). Rhythm is an important component of both dance and music (Honing & Ploeger, 2012; Thaut, Trimarchi, & Parsons, 2014). The capacity for beat anticipation and rhythm perception is a requirement in dance, at least in Western cultures (Honing, 2012; Phillips-Silver & Trainor, 2005). Newborns and infants are able to extract and anticipate a rhythmic pulse and to move spontaneously when exposed to an external rhythm (Hannon & Johnson, 2005; Ostovar, 2016; Phillips-Silver & Trainor, 2005), and a cross-cultural survey across three continents observed that children consistently exhibit dance-like behaviors when exposed to a rhythm (Ostovar, 2016). Some basic forms and functions of music are recognised by listeners in dissimilar cultures worldwide (Mehr, Singh, York, Glowacki, & Krasnow, 2018). In contrast, nonhuman primates largely lack musical abilities, and our closest living relatives, the chimpanzee Pan troglodytes and the bonobo Pan paniscus (E. O. Wilson, 2011), essentially lack music or dance-like behavior (though see work reviewed in e.g. Ravignani, 2019a; Ravignani, 2019b; Ravignani et al., 2013; Lameira et al., 2019). Several hypotheses have tried to account for this difference in musical rhythmic capacities between humans and other animals (Fitch, 2010). Here, we explore two hypotheses of the evolution of dance and music, both connected with human gait.
Finally, we speculate about the contribution of these two “gait-related” hypotheses to the development of dance and music.
Possible Benefits of Synchronized Movements
Walking in pace represents one form of synchronized behavior. Our article will only explore a small selection of synchronized animal behavior. A review by Wilson and Cook (2016) explores, among other topics, possible benefits of synchronized movements across species in more detail. Investigating the evolution of (human) behavior is inherently difficult, because most behaviors leave little to no fossil evidence. All species are related to different degrees. Based on these phylogenies, one can compare human behaviors with those of other species, in particular our closest living relatives, the nonhuman primates.
Focusing on a branch of the phylogenetic tree further away from humans, Larsson (2012b) proposed that synchronized locomotion in fish could be adaptive. Fish swimming together produce overlapping and confusing acoustical signals likely resulting in confusion of the lateral line of predatory fish (Larsson, 2009). Schooling fish have the capacity to interrupt movements simultaneously providing quiet intervals, hence improving predator detection (Larsson, 2009). Animals of a similar size moving almost concurrently produce similar sounds, facilitating auditory grouping (Larsson, 2012a; Popper & Fay, 1993). Accordingly, the vertebrate brain may have become pre-programmed to develop synchronized behavior in appropriate ecological niches (Larsson & Abbott, 2018).
Wilson and Cook (2016) stated other possible advantages of moving in a synchronized manner: for example, animals swimming in choppy water may synchronize their limb strokes to the frequency of the waves, and a group of individuals coordinating movements may be more efficient in capturing prey and frightening a potential predator (Merker, 2000). Positive social interactions due to synchronization have been observed in macaques (Nagasaka, Chao, Hasegawa, Notoya, & Fujii, 2013). Synchronous movement has emotional impact on humans (McNeill, 1995): synchronization of movements in a group may be a potent way of creating and sustaining community and communication. A recent study (Cirelli, Einarson, & Trainor, 2014) has shown that 14-month-old children who were bounced to the beat of music played by an adult were more likely to retrieve an object dropped by the adult than were children bounced without keeping time to the beat (see also Cirelli, Wan, & Trainor, 2014; Cirelli, Wan, & Trainor, 2016; Trainor & Cirelli, 2015). The fact that synchronised movements may facilitate communication is, in principle, consistent with both the “bipedal experience in utero” and the “acoustical advantage” hypotheses.
Locomotion in Primates
Chimpanzees and bonobos, the genetically closest species to humans, essentially lack the capacity for complex vocal learning and are unable or almost unable to tap in synchrony with other individuals (but see Large & Gray, 2015). Larsson (2014) hypothesized that the transition to bipedal locomotion may have stimulated the evolution of rhythm and vocal learning, which motivates a brief review of primate locomotion and the associated sound. First, a few words about the physics. Walking sound can be defined as a sequence of isolated impact sounds generated by a temporally limited interaction of two objects (Visell et al., 2009). The foot and ground exert an equal and opposite force on one another, the ground reaction force (GRF) (Novacheck, 1998), which is associated with the movement of the center of mass of the individual (Galbraith & Barton, 1970). The GRF usually produces sounds of frequencies lower than approximately 300 Hz (Ekimov & Sabatier, 2006, 2008).
Primate Locomotion Patterns
Human locomotion is peculiarly specialized compared to that of other primates; in fact, most nonhuman primates retained locomotion mechanisms that are more flexible than those of humans.
Many primates are arboreal, and terrestrial species such as baboons (genus Papio) and patas monkeys (Erythrocebus patas) spend time in trees. Primates moving in trees usually strive to maintain contact with at least one limb, resulting in little or no aerial phase (O’Neill, 2012; Schmitt, Cartmill, Griffin, Hanna, & Lemelin, 2006) and reducing the GRF. The distance between tree limbs and their degree of flexibility are likely to vary, implying that arboreal primates will typically lack regular appendage movements (Thorpe, Holder, & Crompton, 2009). Primates moving at intermediate speeds exhibit a form of “ambling,” unlike movement in most mammals, which notably lacks an aerial phase. Lemurs and tarsiers mostly climb, cling, and leap. New World monkeys exhibit a range of locomotor patterns, mostly arboreal. Old World monkeys likely descend from a terrestrial common ancestor, but several species have returned to a partially arboreal habitat (Schmitt, 1998). Those that are terrestrial mainly move by quadrupedal walking or running. A large variation in locomotion patterns can be found among apes. Orangutans often swing from branch to branch (Thorpe et al., 2009). In chimpanzees, gorillas, and bonobos, quadrupedal gait often takes the form of knuckle-walking (Dainton & Macho, 1999). Leaping is not seen in great apes but is common in humans.
Bipedal gait is not exclusive to humans, but humans are the only habitually bipedal primate. Chimpanzees, bonobos, gibbons, capuchins, spider monkeys and several other species of New World monkeys have all been occasionally observed walking bipedally (Mittermeier, 1978). Although gibbons primarily brachiate, they are the most proficient bipeds among nonhuman primates (Vereecke, D’Aout, & Aerts, 2006).
The primary differences between the bipedalism of humans and that of other primates lie in two empirically measurable quantities: force patterns and frequency of use. The GRF curves differ sharply. Since human walkers first put down the heel, and very soon after that the forefoot, human walking generates a two-peaked GRF curve, or at slow speeds, a trapezoidal curve (Alexander & Jayes, 1978; Schmitt, 2003). In contrast, bipedal walking in nonhuman primates is characterized by a single-peaked GRF curve, with the peak close to body weight (Kimura, Okada, & Ishida, 1979; Schmitt, 2003). The GRF in capuchin monkeys is greater in bipedal gait than in quadrupedal locomotion (Demes & O’Neill, 2013). During locomotion on the ground, stride length and walking speed of chacma baboons were reported to vary significantly (Sueur, 2011) in contrast to the strict regularity of human unconstrained walking. Many nonhuman primates use bipedal gait opportunistically, usually moving on flexed limbs, bending at the hip and knee (Demes & O’Neill, 2013), which is likely to reduce the GRF.
Only a minor proportion of locomotion time in nonhuman primates is bipedal. Data from bearded capuchin monkeys Sapajus libidinosus and adult African apes (Subfamily: Homininae) indicate that the average proportion of bipedal gait is no more than 1–2% of total locomotion (Duarte, Hanna, Sanches, Liu, & Fragaszy, 2012), and the proportion is not significantly greater in the gibbon, the most proficient bipedal walker (Vereecke et al., 2006). Human walking and nonhuman primate bipedal gait differ along several dimensions (Demes & O’Neill, 2013).
Data on the features of nonhuman primate locomotion sounds are lacking, and it is unclear how a fetus may perceive its mother’s GRF. A crucial difference between human and nonhuman perception may be that humans are exposed to a single roughly isochronous pattern (Strogatz, 2003; Strogatz & Stewart, 1993) in utero (due to the mother’s bipedal gait), while other species are exposed to one or more patterns generated by complex limb alternation. This key distinction might make humans a mostly “isochrony-focused” species (Ravignani & Madison, 2017), while other primate species could be primed to different, non-isochronous patterns derived from their fetal environment. Notably, walking is largely controlled by the spinal cord, which executes rhythmical and sequential activation of muscles in locomotion. However, the mechanisms by which locomotor rhythm is generated are not identified at present (Dougherty & Ha, 2019). The central pattern generator (CPG) delivers the fundamental locomotor rhythm and integrates commands from various sources to meet the requirements of the environment. Therefore, it would be of interest to investigate how locomotor sound—the individual’s as well as that from nearby individuals—may interact with rhythm generating neurons in the CNS.
Human Walking Sound
Human walking displays long-term constancies (Dingwell & Cusumano, 2010; Hausdorff et al., 1996). In unhindered, over-ground walking, regularities can be found in stride time, pace length, and speed (Terrier, Turner, & Schutz, 2005). Walking and running are periodic activities, with a single period known as the gait cycle. By definition, the gait cycle begins when one foot comes into contact with the ground and ends when the same foot again contacts the ground; it comprises stance and swing phases (Novacheck, 1998). Human walking rates are generally in the range of 75 to 125 steps per minute (Sabatier & Ekimov, 2008). In laboratory studies as well as during long periods of unconstrained locomotor activity, the preferred cadence of walking is around 120 steps per minute (MacDougall & Moore, 2005).
In walking, the two initial portions of the stance phase, the initial contact and the loading response, normally produce more sound energy than other stance phase segments, although their combined duration is less than 10% of the gait cycle (Novacheck, 1998). In other words, the major amplitude component of the step is distinctly produced, and in that sense, it may resemble a beat. Footfall is likely to produce substantial self-perception of sound due to bone conduction (Moore, 2003), especially when running. During barefoot running at 4 m/s on a hard surface, the magnitude of the peak of the GRF is between 1.5 and 2.5 times the body weight. This sends a shock wave up the body that can be measured in the head within about 10 ms (Lieberman et al., 2010).
Walking sound of others conveys information to listeners about the sound source, and listeners learn to draw conclusions based on the features of the sound (Visell et al., 2009) about properties of the ground surface (Giordano et al., 2012), the gender (Li, Logan, & Pastore, 1991), and the posture of the walker (Pastore, Flint, Gaston, & Solomon, 2008).
Human locomotion may influence, and interact with, emotions, and locomotion sounds may be influenced by the emotion of the walker (Giordano & Bresin, 2006). Sounds produced on a firm surface lead to more aggressive walking patterns (Bresin, de Witt, Papetti, Civolani, & Fontana, 2010). Runners were shown to alter step length and, thereby, speed when presented with music of different emotional character (“relaxing” or “activating”), while retaining a pace of 130 beats per minute (BPM) (Leman et al., 2013).
Walking and running produce rhythms in the range of 75–190 BPM. Humans can synchronize walking movements with music over a broad spectrum of tempos, but synchronization is optimal in a narrow range around 120 BPM (Styns, van Noorden, Moelants, & Leman, 2007). Music is often played at a tempo similar to that of walking (Changizi, 2011). The tempo of popular dance music peaks at around 120–130 BPM (Leman et al., 2013). In healthy individuals attempting to walk in time to a metronome at 120 BPM, the average pace was 119.52 ± 3.12 steps per minute, suggesting a good potential to synchronize steps with rhythmic auditory sounds (Bilney, Morris, Churchyard, Chiu, & Georgiou-Karistianis, 2005). In brief, we have seen that humans skilfully extract information from footsteps, that there is an emotional impact associated with footsteps, and that the optimal synchronization to a metronome is around 120 BPM; all this seems consistent with both the “bipedal experience in utero” and the “acoustical advantage” hypotheses.
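The tempo figures above can be put on a common time scale with a short calculation. The sketch below is illustrative only: the function names are our own, and the only inputs are the tempos and the metronome result quoted in the text. It converts a tempo into the interval between successive beats or steps (120 BPM corresponds to 500 ms) and expresses the reported mean pace of 119.52 steps per minute as a relative deviation from the 120 BPM target.

```python
def beat_period_ms(beats_per_minute):
    """Interval between successive beats (or steps) in milliseconds."""
    return 60_000 / beats_per_minute


def relative_deviation(observed, target):
    """Signed deviation of an observed rate from a target rate, as a fraction."""
    return (observed - target) / target


# Tempos quoted in the text: lower walking bound, preferred cadence, upper running bound.
for tempo in (75, 120, 190):
    print(f"{tempo} BPM: {beat_period_ms(tempo):.0f} ms between beats")

# Mean pace reported for walking to a 120 BPM metronome (Bilney et al., 2005).
deviation = relative_deviation(119.52, 120)
print(f"Mean pace deviates from the metronome by {deviation:.1%}")
```

On these numbers the mean pace falls within half a percent of the metronome tempo, which is the sense in which the cited result suggests good synchronization.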
Walking Together
Walking side by side, people often unconsciously synchronize steps, suggesting that the perception of one’s partner directly influences gait in the absence of conscious effort or intent (Nessler & Gilliland, 2009; Zivotofsky & Hausdorff, 2007). When two individuals stroll on neighboring treadmills, their walking patterns tend to be substantially coordinated (Nessler & Gilliland, 2009). Each person makes fine adjustments to locomotion kinematics in order to adapt to their partner’s behavior (Nessler et al., 2013). In paired walking, participants can be phase-locked with a phase difference close to 0° (in phase), or they can be phase-locked with a phase difference close to 180° (antiphase or antisynchrony) with walkers contacting the ground simultaneously with opposite-side feet (Nessler & Gilliland, 2009; Nessler, Gonzales, Rhoden, Steinbrick, & De Leone, 2011). Walking in phase or antiphase is likely to produce a similar overall acoustical pattern and rhythm. These synchronous and antisynchronous patterns seem particularly common in humans, together with a few phylogenetically distant species (Ravignani, 2015). Similar leg length is significantly related to locking of step, and the level of frequency locking was not shown to significantly differ with variation in visual and auditory information, suggesting that minimal sensory information such as mechanical vibrations caused by the partner’s steps may induce unintentional synchronization (Nessler & Gilliland, 2009). Data on synchronization in runners are lacking.
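The distinction between in-phase and antiphase walking can be made concrete with a small calculation. The following is a minimal sketch under stated assumptions, not the method of the cited studies: the input is a hypothetical list of right-heel-strike times for each walker, the relative phase is the position of walker B's heel strike within walker A's stride cycle, and the 30° tolerance used to label a pattern is an arbitrary illustrative choice.

```python
import math


def mean_relative_phase_deg(strikes_a, strikes_b):
    """Circular mean (degrees, 0-360) of B's heel strikes within A's stride cycle.

    strikes_a, strikes_b: sorted right-heel-strike times in seconds.
    """
    phases = []
    for t in strikes_b:
        # Locate the stride of walker A that contains this heel strike of walker B.
        for start, end in zip(strikes_a, strikes_a[1:]):
            if start <= t < end:
                phases.append(2 * math.pi * (t - start) / (end - start))
                break
    if not phases:
        raise ValueError("no heel strike of B falls within a stride of A")
    # A circular mean avoids averaging errors around the 0/360 degree wrap-around.
    mean_sin = sum(math.sin(p) for p in phases)
    mean_cos = sum(math.cos(p) for p in phases)
    return math.degrees(math.atan2(mean_sin, mean_cos)) % 360


def classify_locking(phase_deg, tolerance_deg=30.0):
    """Label a mean relative phase; the tolerance is an illustrative assumption."""
    phase = phase_deg % 360
    if min(phase, 360 - phase) <= tolerance_deg:
        return "in-phase"
    if abs(phase - 180) <= tolerance_deg:
        return "antiphase"
    return "intermediate"


# Hypothetical data: walker A has a 1 s stride; walker B lands half a stride later.
walker_a = [0.0, 1.0, 2.0, 3.0, 4.0]
walker_b = [0.5, 1.5, 2.5, 3.5]
print(classify_locking(mean_relative_phase_deg(walker_a, walker_b)))
```

In the antiphase example, walker B's right foot lands when walker A's left foot does, so the footfalls of the pair still coincide in time. This is why the two locking patterns are expected to sound alike.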
Synchronization in the Evolution of Musical Rhythm?
Human dance and music are closely associated with moving in synchrony. This raises the question: did the human tendency and ability to move in synchrony evolve before or after the evolution of music? If the human preference to move in synchrony evolved first, what possible benefits may early humans have achieved by moving in synchrony? A related question is why apes, our closest relatives, have not developed a similar tendency to move in synchrony (with synchrony intended in its more restrictive sense, e.g., Ravignani, 2017, rather than loose coordination). Some have hypothesized that groups of hominids walking in synchrony may have confused and frightened their enemies through mimicry of a large animal (Merker, 2000). The “acoustic advantages due to bipedal walking” hypothesis suggests that two or more humans walking in synchrony likely achieve a short period of silence in the middle of the step cycle, improving auditory awareness of their environment (Larsson, 2014). Larsson, Ekstrom, and Ranjbar (2015) showed that the masking potential of two individuals’ footsteps was reduced when walking in pace compared with not in pace, although the decibel level was identical. While walking in such a manner, the ability to perceive differences in pitch, rhythm, and harmonies could help the hominid brain to distinguish sound sources and facilitate synchronization of movements. Such attentive listening in nature in association with rhythmic group locomotion may have resulted in reinforcement, possibly through dopamine release (Larsson, 2014). A primarily survival-based behavior may eventually have attained similarities to dance and music (Meehan, Abbott, & Larsson, 2017).
Associations Between Walking and Music
There may be an association between human locomotion and music/dance (Larsson, 2014). For example, passive listening to music, or imagining it, activates areas of the brain associated with motor behavior (Chen, Zatorre, & Penhune, 2006; Grahn & Brett, 2007). The connection between auditory and motor systems is important for the execution of rhythmic movements in humans, and music as structured auditory input has a remarkable ability to drive rhythmic, metrically organized, motor behavior such as dance (Zatorre, Chen, & Penhune, 2007). Meehan et al. (2017) proposed that music listening may mimic the sense of walking with other people, which might contribute to the positive effects of music and dance therapy in Parkinson’s disease (see also Pereira et al., 2019).
Both dance and music are extremely complex behaviors. Therefore, rather than thinking about one particular advantage from which they originated as a whole, it may be better to think about how adaptive their constituent components are. Particular traits of music and music-like behavior might have been advantageous and have had a more influential role under specific circumstances. For example, early mother-infant interaction (Dissanayake, 2000), motherese (Falk, 2004), and coalition signaling (Hagen & Bryant, 2003) have been discussed.
Dance
Dance is defined as body movements coordinated to a basic rhythm. For discussions on rhythm, see, e.g., Fitch (2011), Kotz, Ravignani, and Fitch (2018), and Richter and Ostovar (2016).
Leg Movements in Dance
Perception of music, particularly with regard to rhythm, can be studied through, among other approaches, the theoretical framework of embodied cognition (Grahn & Brett, 2007; Penhune & Zatorre, 2019; Su, 2016a, 2016b). Even in non-embodied frameworks, rhythm perception and movement are tightly linked neurobiologically (Grahn & Brett, 2007; Penhune & Zatorre, 2019).
Given this auditory-motor coupling, rhythm information derived from music may be represented and retained in the brain as information about body movements (Konoike et al., 2012). Sensorimotor mechanisms similar to musical rhythm perception have been shown when human observers perceive dance movement (Su, 2016a, 2016b). Observers extract a visual beat from regular movement patterns of body parts, with leg movements most often chosen as the primary beat and vertical trunk movement occupying the space between the beats. When the four limbs move in tandem, visual beat perception in dance relies mainly on the pattern of leg movement (Su & Salazar-Lopez, 2016). In practice, across dance genres it is the footwork that is most often performed in time to the musical beat (Y. H. Su, personal communication, 2 April 2018). There are many dances, among them many African dances, German Schuhplattler, tap dance, Zapateado Peruano, and Flamenco, in which footwork is used to provide percussion.
Down-on-the-Beat Movements in Dance
Internal noise such as vibrations in the skeleton produced by walking, dance, and other movements has scarcely been investigated. However, it is likely that down-on-the-beat movements appear more salient because they cause more internal sound due to vibrations in the skeleton (Moore, 2003). A study of auditory-motor entrainment in street dancers (Miura, Kudo, & Nakazawa, 2013) showed that in rhythmic knee bending to the beat, up-on-the-beat (knee extension on the beat) was unintentionally replaced by down-on-the-beat (knee flexion on the beat) at high movement frequency. It may be that, in terms of motor control, knee flexion is biomechanically easier to perform than knee extension. Thus, synchronizing the former to an auditory target is a more economical and more natural form of movement. This action preference seems to be mapped in perception, as observers also perceive the downward trajectory of the knee-bending movement as more congruent with an auditory downbeat than upward movement (Su, 2014). Through training, dancers acquire the skill to perform the more demanding up-on-the-beat movement, which may require internal synchronizing to the beat in an antiphase manner. Footsteps in locomotion tend to be accompanied by moving the body downward, which largely generates the sound. Thus, we hypothesize that the downward movement is more coupled to the beat than is the upward movement, both in action and in perception.
Rhythm In Utero
Sensory Experience In Utero
Perception of sound and rhythm in utero has been suggested to influence the individual’s development of musical abilities (Parncutt, 1993; Parncutt & Chuckrow, 2017). Brain development is largely shaped by early sensory experience (Figure 1) (Webb, Heller, Benson, & Lahav, 2015). For instance, prenatal sensory experiences may influence taste preference of offspring in humans (Schaal, Marlier, & Soussignan, 2000) and other animals. Fetuses are exposed to scents of the mother’s diet, which influences taste preferences after birth (Schaal et al., 2000). Neonates seem to remember the smell of amniotic fluid, which attracts them more than other cues (Tyzio et al., 2006). Full-term infants indicate orientation of sound by turning the head towards the source. If they are shown an object at the same time, they will move their gaze to the sound, implying that hearing is more mature than vision at birth (Lagercrantz, 2014; Wilkinson & Jiang, 2006). Visual perception is unlikely to be experienced in utero, while sound is a primary source of varied and consistent stimulus to the developing brain (Teie, 2016).

Figure 1. A pregnant woman.
Sound Perception In Utero
The human fetus experiences approximately four months of audible sound exposure prior to birth (Birnholz & Benacerraf, 1983). Intrauterine recordings taken in humans and animals have shown that the sounds of the mother’s vocalizations, breathing, heartbeat, body movements, footfalls, and digestion are all audible to the fetus (Parncutt, 1993, 2009). The cochlea is structurally developed from approximately the eighteenth gestational week (Lim & Brichta, 2016). After the 26th week, brainstem-evoked responses may be recorded (Wilkinson & Jiang, 2006). Cortical activation to sound has been observed in the fetus from the 33rd week (Jardri et al., 2008).
Newborns show reaction to sounds, melodies, and rhythmic poems to which they have been exposed during gestation (Hepper, 1996). Soon after birth, infants show preference for their mother’s voice over the voice of another female and their mother’s language over a foreign language (Decasper & Fifer, 1980). Exposure to speech in utero affects vowel perception after birth (Moon, Lagercrantz, & Kuhl, 2013). Neonates react more strongly to passages spoken by the mother each day of the final six weeks of pregnancy than to novel passages (Decasper & Fifer, 1980). Neonate cry melody is formed by their native language (Mampe, Friederici, Christophe, & Wermke, 2009), and neonates demonstrate the ability to discriminate between languages of different rhythmic families (Nazzi, Bertoncini, & Mehler, 1998). Older infants discriminate between synchronous and asynchronous audiovisual musical displays (Hannon, Schachner, & Nave-Blodgett, 2017). Despite the immaturity of the auditory pathways in preterm babies, the auditory cortex is more adaptive to maternal sounds than to environmental noise, and three hours of daily exposure to the mother’s voice and heartbeat sound can yield structural changes in the developing auditory cortex (Webb et al., 2015). The structures of the limbic system are almost completely formed at birth. The trajectories of limbic fibers, the cingulum and the fornix, two dominant tracts in the fetal brain are developed at 19 gestational weeks (Huang et al., 2006). Thus, brain structures responsible for emotions are well developed at birth and may store, and later respond to, sounds that resemble those of the fetal environment (Teie, 2016).
The Fetal Acoustic Environment and Associations With Musical Elements
It has been proposed that features of music correspond to sounds that are present in the womb, and that the fetal acoustic environment may provide the basis for the fundamental musical elements found in the music of all cultures (Parncutt, 1987, 1993; Parncutt & Chuckrow, 2017; Teie, 2016). Although the role of footfalls in rhythm development has been previously discussed (Parncutt, 1987, 2009), heartbeat and pulse have been more often considered in this context (Teie, 2016; Ullal-Gupta, Vanden Bosch der Nederlanden, Tichko, Lahav, & Hannon, 2013).
Fetuses of all mammals perceive maternal heartbeat to some extent; thus, perception of heartbeat in utero has little explanatory value with respect to musical ability as a strictly human phenomenon. In other words, chimpanzee and human fetuses will hear a similar sound from the mother’s heart. Fetal experience of heartbeat therefore does not help to explain the difference between humans and nonhuman primates with regard to musical and rhythmical skill. Instead, we argue for a relatively more influential role of maternal footfall.
Sound In Utero
Respiration-Locomotor Coupling
Human gait is usually in the range of 100–120 BPM (Nessler et al., 2011) and breathing is 12 to 20 cycles per minute (cpm) (Barbosa Pereira et al., 2017). Since locomotion and respiration are frequently coupled (Bramble & Carrier, 1983; Funk et al., 1992) this may result in coupling of footfalls, breathing sounds, and passive tactile stimulation. Thus, a coupling of vestibular-tactile-somatosensory and auditory signals may now and then take place during fetal life. The mother’s respiration and walking will produce audible rhythmic movements that are associated with movement of the fetus. Therefore, bouncing to the rhythmic movements produced by maternal walking/breathing is likely to be the human brain’s first experience of isochrony.
Fetus and Newborn Reaction to Rhythmic Stimuli
Studies have shown changes in fetal and newborn heart rate with external rhythmic stimulation (Provasi, Anderson, & Barbu-Roth, 2014). However, since breathing influences the heart rate (Dick et al., 2014), the changes might be secondary to change in the fetal breathing patterns (or may be due to simple arousal). In premature infants, rhythmic stimuli affect the respiratory rate (Sammon & Darnall, 1994). These authors recorded respiration in 18 pre-term infants being manually rocked at rates ranging from 30 to 60 cpm. Coherence spectra were estimated between the respiratory and rocker signals, and their magnitudes were evaluated at the rocking frequency, with coherence values > 0.85 indicative of strong entrainment to rocking. At least one occasion of entrainment was seen in 15 of the infants, with a 2:1 ratio of breath:rocker cycles at rocking frequencies of 30 to 40 cpm (8 of 18 subjects) and 1:1 entrainment at rates of 42 to 50 cpm (5 of 18 subjects). More complex synchronization was observed in three infants. Since the rocking movements were experienced passively, it is unlikely that rocking influenced the breathing of the fetus as a consequence of change in metabolic activity (Sammon & Darnall, 1994). The capacity of pre-term infants to adapt breathing rhythm to the frequency of linear displacement of their body suggests some capacities for motor synchronization to external auditory stimuli (Provasi et al., 2014).
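The entrainment criteria described above can be illustrated with a short sketch. It is not the analysis pipeline of Sammon and Darnall (1994): the function names and the example breathing rates are hypothetical, and the coherence value is taken as a given input rather than estimated from recorded signals. The sketch applies the 0.85 coherence threshold reported in the text and then finds the nearest small-integer breath:rocker ratio.

```python
from fractions import Fraction

COHERENCE_THRESHOLD = 0.85  # threshold for strong entrainment quoted in the text


def entrainment_ratio(breaths_per_min, rocks_per_min, max_rocker_cycles=3):
    """Nearest small-integer breath:rocker ratio, returned as (breaths, rocker cycles)."""
    ratio = Fraction(breaths_per_min / rocks_per_min).limit_denominator(max_rocker_cycles)
    return ratio.numerator, ratio.denominator


def describe_entrainment(breaths_per_min, rocks_per_min, coherence):
    """Apply the coherence criterion, then report the breath:rocker ratio."""
    if coherence <= COHERENCE_THRESHOLD:
        return "no strong entrainment"
    breaths, rocks = entrainment_ratio(breaths_per_min, rocks_per_min)
    return f"{breaths}:{rocks} entrainment"


# Hypothetical infants: rocking rates lie in the ranges quoted in the text,
# breathing rates and coherence values are invented for illustration.
print(describe_entrainment(breaths_per_min=70, rocks_per_min=35, coherence=0.90))
print(describe_entrainment(breaths_per_min=46, rocks_per_min=46, coherence=0.92))
print(describe_entrainment(breaths_per_min=55, rocks_per_min=35, coherence=0.40))
```

Checking coherence first reflects the logic of the reported criterion: a frequency ratio that happens to be close to 2:1 or 1:1 is only meaningful if the two signals are also strongly coupled at the rocking frequency.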
Active voluntary movements such as rhythmic leg swinging produce repetitive endogenous stimuli that can be compared with repetitive exogenous stimuli such as rocking of a baby (Soussignan & Koch, 1985). Notably, the rocking of a fetus due to maternal walking and breathing would provide similarly repetitive exogenous stimuli. Caregivers from three different continents were asked whether they would choose rocking or singing to soothe their baby. The great majority opted for rocking (Ostovar, 2016).
Sleep state and the regularity of quiet sleep respiration were investigated in pre-term infants provided with a “breathing” teddy bear that produced rhythmic stimulation reflecting the breathing rate of the individual infant. The infants given the breathing teddy bear eventually showed slower and more regular respiration during quiet sleep and a correlation between respiratory regulation and the amount of quiet sleep, suggesting that preterm infants may entrain to acoustic stimuli that reflect their own biological rhythms (Ingersoll & Thoman, 1994).
In late pregnancy, the fetus exhibits different heart rate patterns during maternal walking from those seen when the mother is resting (Cito et al., 2005). The fetal heart rate may be used as an index of responsiveness to displacement (Provasi et al., 2014). When the mother-to-be is walking, the fetus will be moving rhythmically and simultaneously experience the associated auditory signals generated by the mother’s feet against the walking surface. Fetal reactions to vibro-acoustic stimulation have been monitored by recording fetal heart rate and movements (Kisilevsky & Hains, 2011). Fetal reaction has been reported to increase with rhythm presented both acoustically and through vibrations (Provasi et al., 2014). In general, studies that have used both vibratory and acoustic stimuli report higher response rates compared with those that have used only one modality (Provasi et al., 2014). In children, presentation of a rhythmic pattern in two modalities increases the ability to identify and respond in synchrony with the pattern compared to stimulus in only one modality (Bahrick & Lickliter, 2000; Provasi et al., 2014). Neonates recognize different vestibular-tactile-somatosensory rhythms and alter behavior in response to these rhythms (Provasi et al., 2014). The perception of auditory rhythms in human infants is stimulated by passive movement; moving passively produces somatosensory, vestibular, and tactile stimulation (Phillips-Silver & Trainor, 2005, 2007). Phillips-Silver and Trainor (2005) demonstrated a strong multisensory connection between body movement and auditory rhythm processing in infants. When the same rhythm is presented acoustically and in the somatosensory modality, the fetal capacity to process simultaneous cross-modal sensory input is improved (Lecanuet & Schaal, 1996).
The tendency to tap or move in rhythm to music is rare during the first year of human life but steadily increases until the age of puberty (Drake, 1997), a timetable that shows some analogies with the child’s increasing capacity to walk. The preferred tempo of music decreases with age and leg length (Drake, Jones, & Baruch, 2000). But since tempo biases and the detection of a regular pulse in an auditory signal can be demonstrated in children from birth, even before walking (Winkler, Haden, Ladinig, Sziller, & Honing, 2009), it has been proposed that movements of the mother may influence such rhythmic behavior more than do the child’s own movements (Ullal-Gupta et al., 2013). Newborns distinguish regular features in the acoustic environment despite alteration, and they have the spectral as well as temporal processing prerequisites of music perception (Winkler et al., 2009). Visual/auditory cross-modal synchronous signals are unlikely to be of importance in utero. This might explain why the acoustic modality is more efficient than vision in processing rhythms, especially for durations around 100 ms (Fujisaki & Nishida, 2009).
Acoustic and Vestibular-Tactile-Somatosensory Perception in Adults
In adults, the combination of acoustic and vestibular-tactile-somatosensory perception increases rhythmic perception more than visuo-tactile or audio-visual combinations (Fujisaki & Nishida, 2009). Vestibular-tactile-somatosensory rhythms may have a role in the development of movement similar to the way in which auditory rhythms influence speech production (Phillips-Silver & Trainor, 2005; Provasi et al., 2014). Phillips-Silver and Trainor (2007) found that movements of the body influenced adults’ auditory encoding of an ambiguous musical rhythm. This disambiguation could also be achieved by direct galvanic stimulation of the vestibular nerve. Simulating head movements in either of two different tempos, without any actual movement, strongly biased adults’ perception of the beat, implying that the vestibular system has a crucial role in the perception of musical rhythm and the performance of dance (Trainor, Gao, Lei, Lehtovaara, & Harris, 2009).
Discussion
The transition to bipedal gait may be related to the evolution of human rhythmic and musical abilities via two mechanisms. First, bipedal gait resulted in predictable and rhythmic incidental sounds of locomotion, which in turn may have stimulated the evolution of human rhythmic and musical abilities. Second, maternal bipedal walking is likely to influence the fetal environment, increasing the exposure to rhythmic motion and auditory cues in early brain development. The human brain is substantially exposed to isochronous sound and movement, both in utero and during the years of being carried, and may therefore be better prepared to perceive and enjoy the similar rhythmic stimulation of music and dance. This may in turn have stimulated the cultural evolution of music and dance. Above, we have discussed the different strands of empirical evidence that are, in principle, compatible with both hypotheses.
Both the “bipedal experience in utero” and the “acoustical advantage” hypotheses are compatible with the fact that nonhuman primates display scarce bipedal locomotion and essentially lack musical and rhythmic abilities. Also, the salience of leg movement in dance observation (Su & Salazar-Lopez, 2016) and the ubiquitous connections between the beat in music and downward body movements seem consistent with the proposal that the evolutionary origin of dance is linked to human gait. So far, maternal heartbeat sound has been discussed more than gait in research on the musical/rhythmic abilities of offspring (Teie, 2016). However, the characteristics of footfalls might be even more interesting in this regard. Music is often played at tempos similar to walking (Changizi, 2011). The sound of footfalls is accompanied by passive rhythmic movements of the fetus. Heartbeats are heard 24 hours a day, whereas footfalls are heard intermittently over limited periods, which is more like the timing of music/dance activities. Unlike heartbeats, the perception of footfalls is strongly reinforced by the simultaneous passive movement in utero. Thus, only maternal footfalls create a combination of acoustic and vestibular-tactile-somatosensory perception, a significant component in dance. The locomotion sound of the human mother differs substantially from that of nonhuman primates, while heartbeats heard in utero are likely to have a similar character across species. Thus, heartbeats do not explain why flexible rhythmic abilities have developed solely in humans and a few other vertebrates (Kotz et al., 2018), but footfalls might.
Testable Hypotheses
Useful hypotheses need to be testable. Although the hypotheses presented here are preliminary, we believe it is important to suggest empirical avenues to support or refute our speculations. One possibility would be to investigate whether children of non-walking mothers show differences in musical and general rhythmic abilities (H1). Do offspring of mothers who experienced severe, incapacitating girdle pain show selectively reduced rhythmic abilities but intact musical skills in melody, harmony, and timbre (H1)? Does this even extend to grammatical linguistic (dis)advantages (Gordon et al., 2015)? Across geographical areas, does average human leg length correlate with preferred tempo, as potentially predicted by our hypothesis? How do the recently found genetic correlations between rhythm and walking or breathing connect with the literature we review here (Niarchou et al., 2019)? Although it is an everyday experience that dyads of similar leg lengths tend to walk in step, scientific studies of human paced walking are scarce. For example, we were unable to find even a single empirical study of the perceptual mechanisms of military personnel marching in step. A future empirical question is how military (vs. non-military) walkers use their senses to achieve a common pace. Is it hearing, vision, or a combination of both that is used (H2)? Future zoological research may map the number of limbs to locomotion patterns, and hence to the sounds produced, across species. Systematically mapping the biophysics of animal gaits and the rhythmic patterns they produce will allow testing our hypothesis in a comparative framework (H1 and H2). Applied to other species, our hypothesis (H2) predicts that pronking animals (pronking is a gait of quadrupeds that involves jumping high into the air by lifting all four feet off the ground simultaneously) should be more sensitive to, or prefer, isochronous patterns, while horses should be more sensitive to quaternary patterns (H1 and H2).
It would also be of interest to study the role of incidental sound of locomotion in synchronized animal groups, such as the flapping of wings in bird and bat flocks, or the sound of body movements and breathing in moving cetaceans (H2).
Conclusion
These two “bipedal” hypotheses do not necessarily compete with or contradict one another. Bipedal stimuli in utero may primarily boost the ontogenetic development of, and interest in, music and dance (H1), while the acoustical advantage hypothesis (H2) proposes a mechanism in the phylogenetic development of musical abilities.
The question of their relative importance in the development of dance/music is relevant but possibly unanswerable at present. It seems more fruitful to explore interaction mechanisms in the phylogenetic and ontogenetic development of music and dance. This may be relevant for both biological and cultural development.
Footnotes
Acknowledgements
We are grateful to Yi-Huang Su for many insightful comments, David Teie for information and ideas about the fetal acoustic milieu, and Akito Miura for valuable information about street-dance.
Author contribution
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Action Editor
Dr. Tecumseh Fitch, University of Vienna, Department of Cognitive Biology, Vienna.
Peer review
Richard Parncutt, University of Graz, Centre for Systematic Musicology.
Two anonymous reviewers.
