Abstract
While many researchers working in spoken languages have used modality to distinguish language and gesture, this is not possible for sign language researchers. We argue that co-sign gestures must be considered alongside co-speech gestures in theories of language acquisition. Focusing on how the same functions are served in embodied communication in speech and in sign promotes a truly multimodal view of language acquisition. An embodied, multi-articulatory, multimodal framework is needed to make broader claims about how language is acquired.
In their review article, Karadöller et al. (2025) argue that a “unified multimodal language view” needs to be adopted in theories of language acquisition. We agree that language from childhood through adulthood is a multimodal communicative system, and Karadöller et al. demonstrate this through their review of the relationship between speech and gesture in spoken language acquisition. However, in their discussion of future directions, they claim that speech and gesture “may exchange different loads during language development in spoken language,” which they claim is not possible for sign language because it uses only the visual modality. We argue that this view of sign language development is as limited as considering speech alone would be for spoken language development.
While many researchers working in spoken languages have used modality to distinguish language and gesture (speech being considered language and all visually perceived hand, body, or facial motions being considered gesture), this is not possible for sign language researchers. Sign languages utilize what has been considered the same modality as gesture, leading some researchers to see gesture everywhere in sign language and others to view gesture as occurring alongside sign language; this split has produced a variety of ideas and theories on the presence and role of co-sign gesture.
Early work on co-sign gestures acknowledged that they exist, but their form was debated. Some believed that co-sign gestures were not manual (on the hands) but instead occurred on the body and face (Emmorey, 1999), specifically including the mouth (Sandler, 2009). Others argued that, in addition to these non-manual gestures, signers also use manual gestures with the hands, including non-emblem gestures interleaved with signs and gesturally modified signs that are idiosyncratic and non-morphemic, analogous to vowel-length modifications in speech (Duncan, 2005). Co-sign gestures have also been investigated in signing children to determine whether they can predict learning (Goldin-Meadow et al., 2012).
It remains difficult to distinguish manual signs from manual gestures, though there is some evidence that the brain may distinguish the two in processing (Campbell & Woll, 2017). This difficulty has prompted some to assert that the line between sign and gesture is blurry and should stay that way (e.g., Hall, 2017). In a similar vein, others have proposed that gesture and language exist together on a spectrum with no inherent boundaries (e.g., Occhino & Wilcox, 2017), and that heuristics distinguishing language from gesture along dimensions such as discreteness/gradience or holistic/reducible cannot be easily applied because these dimensions occur in both language and gesture (e.g., Coppola & Senghas, 2017). Some work even proposes that the term and concept of “gesture” are misused and should be replaced (e.g., Kendon, 2008, 2017).
While these views present a complicated and contentious picture, we believe they show that our understanding of the role of gesture in language is incomplete and lacks crucial insight when co-sign gesture is not accounted for. A complete understanding of human language requires this wider framework. Specifically, if the binary view of gesture and sign is questionable, then perhaps the binary view of gesture and speech is as well. Modality appears to be less of a factor in gesture (and language) than some might think: auditory gestures occur in spoken language (Laing, 2025), just as manual gestures occur in sign languages. Gestural forms in either modality can have linguistic features, and language forms can have gestural features. If we want to account for all important aspects of language acquisition, we cannot limit ourselves by modality. Instead, we advocate for an embodied, multi-articulatory, multimodal framework for language acquisition. Below we outline some important areas of consideration.
First, we believe it is helpful to separate the modality of production from the modality of perception. Sign language can be perceived visually or through tactile means. The articulation of sign languages is a complex corporeal production (of muscles and joints) that can recruit many body parts (e.g., hands, arms, shoulders, head, face, and torso), which is also true of gestures. Thus, a broadened perspective on multiple embodied articulators instead of modality would allow us to more thoroughly break down the rich set of components and skills to be mastered during language acquisition. Karadöller et al. focus on the “affordances of the visual modality,” leading them to explore how pointing, iconicity, and simultaneity interact with sign language acquisition in different domains (i.e., vocabulary and morphosyntax). What can we learn about language acquisition by investigating instead the combinatorial and expressive possibilities of every articulator, which can be both gestural and linguistic (Goldin-Meadow & Brentari, 2017; Kendon, 2017; Vermeerbergen et al., 2007)?
For example, raised eyebrows can serve multiple functions. In American Sign Language (ASL), eyebrow raises can function grammatically as a non-manual marker: accompanying a sign (or phrase) and a downward head tilt to indicate a polar question, marking topicalization or a conditional clause, or accompanying a mouth adverbial and manual repetition to indicate an “ongoing” event. Affectively, the same eyebrow movement can indicate skepticism, doubt, surprise, or being impressed. Pragmatically, an eyebrow flash (a quick raise) can signal a greeting, recognition, flirtation, or a request for turn-taking or turn-giving. In Turkish mainstream hearing culture, and some others, raising one’s eyebrows is on its own an emblem gesture indicating disagreement or “no,” similar to a headshake (left and right). Eyebrow-raising is also documented as a sign of interest and visual searching in children and is discussed as a general marker of intention and searching behavior in humans, as the widening of the eyes can support vision (Darwin, 1965; Jones & Konner, 1970).
Another example is pointing, which Karadöller et al. discuss at some length. In terms of articulators, pointing in sign language can be done by orienting the index finger alone, the whole hand, the chin/head, eye gaze, or even the foot if the person is seated or does not have arms. The choice of articulator is shaped by multiple factors, just as it is when pointing accompanies speech: the corporeal affordances of interlocutors (maturation, disability, body position, interaction with objects, etc.), the co-existence of marked tokens in the available gestural communication systems and languages, the discourse context (register, discreteness, etc.), and social norms (culture, religion, taboos, etc.). Pragmatically, pointing may also be used to manage joint attention, which is of paramount importance in early language acquisition. In small village and family sign languages, where interlocutors share common landmarks, geography, and knowledge, pointing has also been observed to serve in place of lexical items (de Vos & Pfau, 2015). This co-sign pointing, or spatial indexing, occurs independently of the intricate system of ASL referential pronouns (Frederiksen & Mayberry, 2021, 2022), despite previous proposals that ASL pronouns are primarily gestural (Liddell, 2011).
Second, rather than simply accepting the idea that sign languages cannot have gestures because they exist entirely in the visual modality, or that modality is the decisive factor, the perspective we suggest requires us to examine more closely the functions of gesture in multimodal communication and language development. What functions do gestural and linguistic expressive parts and combinations serve, and what do they reveal about acquisition and cognition? Gestures can replace lexical items within or between sentences, occupy contrastive spatial locations to indicate relationships, serve pragmatically for conversational management and backchannel feedback, and perform other functions. As a starting point, we assume that the co-speech gesture functions essential for language communication also exist in sign language but remain undescribed because sign has multiple possible articulators (e.g., the eyebrows, eye width, eye gaze, nose crinkles, mouth movements, tongue gestures, head tilts and movements, and shoulder shrugs, to name just a few). These non-manual articulators, as well as the hands, can perform the functions of co-speech gestures for signers. A more complete picture of how sign languages package both kinds of information, linguistic and co-linguistic, within one modality requires an understanding of the communicative functions of gesture.
Focusing on how the same function is served in embodied communication in speech and in sign will allow us to develop a truly multimodal view of language acquisition. Such a framework is needed to study development in younger and older signing children and to make broader claims about language acquisition.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The writing of this response was supported by the National Institute on Deafness and Other Communication Disorders of the National Institutes of Health under Award Numbers F31DC021854 and F31DC021625. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
