Hand Signs for Lip-syncing: The Emergence of a Gestural Language on Musical.ly as a Video-Based Equivalent to Emoji

Abstract

Video-based communication is increasingly common online. This article looks at the hand signs that are used in lip-syncing videos on the app musical.ly and argues that they constitute a codified, non-verbal language of pictograms that is equivalent to emoji in text-based communication. Seeing the lip-syncing videos as performances allows us to situate the hand signs as the latest development in a long history of formalized gestures for theatrical performance. By recognizing this emergent gestural language as a performative element of online communication, we can reinterpret emoji and other forms of visual supplement to writing as part of a larger emerging system of communication where visual signs are becoming integrated into verbal language, not only in written language but also in video-based communication.

Keywords

emoji, gestures, lip-syncing, multimodality, video-sharing

Gestures are an important, non-verbal element in face-to-face communication, and as online communication is becoming more video-based, gestures are becoming common online. However, gestures are being used in a more codified manner online than in face-to-face communication. In this article, I argue that this is equivalent to the use of emoji as a codified way of expressing affect in text-based online communication, and that these non-verbal sign systems can be understood as developments in the history of codified gestures for performance. Specifically, I look at hand signs as they are used on musical.ly, a video-sharing app that launched in 2014 and is popular among preteens and teens. Its users, or musers as they are called, share 15-second videos where they select a soundtrack and lip-sync or mime a performance.¹ The performances frequently include a rapid succession of hand signs that interpret and enhance the lip-syncing and that have evolved as a collective system of codes. Most of the hand signs enact or mimic the concept they reference, just as emoji tend to visually resemble the object or concept they refer to. For instance, a tilted head with a hand under one cheek signifies sleep, night or tonight, and a hand held in the shape of a gun signifies kill, death, or die.²

I argue that hand signs on musical.ly are pictograms: visual representations of objects that signify a closely related concept,³ and that they have developed as an augmentation of video-based online communication similarly to the way emoji developed as an augmentation of text-based online communication. The emergence of a gestural language on musical.ly may be an early example of a resurgence of structured gestural communication that will become more common with the rapid growth of video-based communication in social media.

While text-based messaging and social media hide the body and its gestures, visual social media such as musical.ly can make the user’s body visible to viewers. But although lip-syncers on musical.ly can use gestures and body language, they cannot use their own voice, and the app won’t let them add text to the video. I propose that when online communication genres limit the available modes of communication (for instance, to text-only with no images, or video-only with no sound or text), users will tend to compensate by developing shared conventions to codify elements that cannot otherwise easily be expressed in the limited modes available. This has led to the development of emojis and animated gifs as well as to hand signs on musical.ly.

A second argument in this article is that musical.ly lip-syncing videos are performances, not simply in Goffman’s generalized sense of all self-presentation being a form of performance (1959), but in the theatrical and musical sense of the word. Using the history of performance to understand performance on musical.ly reveals a historical lineage for the hand signs in 18th-century theater and classical oration, both of which also developed a codified system of hand signs to express emotion. Using this historical context to understand emoji allows us to expand Goffman’s notion of performance in a way that helps explain a key phenomenon in contemporary self-presentation in social media: the use of emoji and other pictograms.

My approach is twofold. First, I discuss how hand signs are currently used on musical.ly, supplementing the description and analysis of a typical example of a lip-syncing video with a discussion of YouTube tutorials created by and for musers explaining how to use hand signs on musical.ly. Then, I trace the historical precursors to the hand signs through two lineages. First, I discuss the remediation of gestures and affect that led to informal punctuation, emoticons, and emoji. Next, I discuss the history of communicative gestures, including informal everyday gestures, formalized gestural languages like semaphore, Plains Indian Sign Language and sign languages for the deaf, and most importantly for my argument about hand signs as elements in a performance, the history of codified gestures in oration and theater.

What is Musical.ly?

Musical.ly is an app where users can share 15-second lip-syncing videos and other short videos. Users can select soundtracks from the extensive music database available on musical.ly, use songs they have on their phone, record live sound, or use sound from other musers’ videos and skits. They can add visual filters and emojis to their videos, as on other video- and image-sharing services, as well as apply less common time-based effects, such as a time loop, where you select a second of the video that is repeated a few times. Adding text is not an option.

The app was launched in Shanghai in October 2014 by Alex Zhu and Luyu Yang and has rapidly grown to become one of the most popular apps globally for tweens and young teenagers. On an average day in mid-2017, 13 million videos were uploaded to the app, which has grown from 10 million users in 2015 to over 200 million users in September 2017 (Herrman, 2016; Perez, 2017; Robehmed, 2017). The rapid growth of the app is all the more notable for its very specific demographics: it is extremely popular with tweens and young teenagers, but many adults have not heard of it—unless they have young people in their life who have shown it to them.

Ideally, this article would include videos showing how the hand signs are used, or at least provide references allowing readers to view the videos I discuss in full. I have not done this for practical, legal, and ethical reasons. It would be practically difficult, as musical.ly is a mobile-only platform with no permanent URLs for videos; there are legal issues involved, as recording videos is against musical.ly’s Terms of Service; and it would be ethically problematic, as many if not most musers are minors who, given the mobile-only platform, reasonably expect a limited audience. I would suggest readers either search the musical.ly app themselves or look at compilations posted on YouTube. One example of a popular muser who uses hand signs in lip-syncing videos is Marco Cellucci, a French-Italian 14-year-old boy who has over half a million fans on musical.ly and has compiled a video of some of his best lip-syncing videos (Cellucci, 2016). Nina Schotpoort is an 11-year-old Dutch girl who also has more than half a million followers on musical.ly and runs her own YouTube channel. Some of her YouTube videos show her in musical.ly lip-syncing duets with her fans (Schotpoort, 2017a) or showing users how the interface works (Schotpoort, 2017b). In the next section, I will analyze one anonymized lip-syncing video in detail and then discuss two hand sign tutorials posted to YouTube (Baby Ariel, 2015; Nigerias Blessing, 2015).

Hand Signs on Musical.ly

A typical example of how hand signs are used on musical.ly is shown in Figure 1, which shows selected frames from a lip-syncing video of a 15-second sample from the song “Too Good” by Drake and Rihanna. It was made by a young girl in Norway in August 2016, and I found it by scrolling through the most recently posted videos from the town I was in at the time.⁴ Hand signs on musical.ly usually only use a single hand (as the other hand holds the camera) and they almost always replicate or enact selected words in the lyrics, as you see in the video shown in Figure 1. For the first phrase in the lyrics, “I’m way too good to you,” the muser first holds up two fingers to signify “too,” then quickly shifts to a thumbs up for “good,” back to the two fingers up again to signify “to,” then points at the camera for “you.” The whole sequence takes less than a second.

Figure 1.

Sketches showing a sequence of hand signs used in a lip-syncing performance of the lyrics “I’m way too good to you.”

In the second part of the video, shown in Figure 2, the muser doesn’t use hand gestures for every word, as in Figure 1. The lyrics for this part of the video are a little more abstract: “You take my love for granted/I just don’t understand it.” In the first line, the muser chooses to sign “you” (index finger pointing at camera) and “love” (hand held in the shape of half a heart). The last line is shown in just one sign: the finger points to the side of the head.

Figure 2.

These sketches show the hand motions used in the second part of the musical.ly video. The lyrics for this sequence are, “You take my love for granted/I just don’t understand it.”

Hand motions on musical.ly are almost always direct interpretations of words in the lyrics. Sometimes, as with the use of two fingers for “too” and “to,” the signs refer to homophones of the word used, although we could also interpret the two-fingered sign as signifying a “2” as often used in texting, rather than signifying “too” and “to”: “I’m way 2 good 2 you.”

A musical.ly lip-syncing video is a performance. The lip-syncer uses the soundtrack to perform the sound in much the same way as a musician reads sheet music to play a tune. The muser listens to the music and the words, practices singing along, and chooses which words to represent visually through hand motions. Although most musers tend to use the same set of signs (some of which are listed in the appendix), there is some leeway for personal interpretation. In Figures 1 and 2, this is evident in the selection of which words to sign and which not to sign, but also in the conclusion of the video, which is without words. After the part of the video shown in Figure 2, the muser puts her hand in front of the camera, making the screen go black. Then the video cuts to a slow-motion sequence of her first looking down, then smiling, her hand moving in toward her chin as though she is going to lean on it. She appears to be about to blow a kiss to the viewer (or to her own image on the front-facing screen of her mobile phone camera) when the video ends, or rather, loops back to the beginning.

Most hand signs used on musical.ly are directly representational: they refer to a specific word in the lyrics (see the appendix for a list of some common hand signs). There are also some signs that are non-representational, or that function more as a dance, or to emphasize the beat of the song. For instance, when moving the camera (which is done almost as a form of dance move), many users hold their left hand at the bottom of the frame, palm up, and little finger slightly raised as they move the hand from right to left. In a tutorial posted to YouTube about musical.ly hand motions, muser Nigerias Blessing explains the importance of camera motions (Nigerias Blessing, 2015). Camera motions certainly enhance the hand signs, and the ability to use camera motions effectively is a marker of real skill and prestige. In her video, Nigerias Blessing explains how the camera often should move in the opposite direction of the hand sign. Certain hand signs, in Nigerias Blessing’s view, need to be combined with appropriate camera motions. For instance, the sign for “drop” involves holding your hand with fingers down, tight together, as though holding something, then opening the fingers as though dropping that thing. Blessing suggests holding the hand quite high, and keeping it high throughout the sign, while moving the camera down. She embellishes the “shoot” sign by shaking her camera a little bit, mimicking the rat-tat-tat jolts of a series of bullets being shot. For hitting, she makes a fist, aims it at the camera, and moves the fist and camera toward each other as though they are about to collide. The integration of camera movements and gestures is a fascinating difference between these hand signs and hand signs used in face-to-face communication or in stage performances.

Nigerias Blessing’s video is the most thorough tutorial I have found on musical.ly hand signs. Other popular tutorials tend to be far less specific. By mid-2017, Nigerias Blessing’s tutorial had been watched almost 300,000 times on YouTube, although she is not a musical.ly superstar, with fewer than 2,000 followers on musical.ly. Baby Ariel, on the other hand, is one of the most popular users on musical.ly, with over 20 million followers (Sherman, 2017). By the age of 15, she had established herself as a professional social media influencer based on her success on musical.ly. Several of her YouTube videos give users advice on how to make good musical.ly videos. Baby Ariel’s (2015) advice is less specific than Nigerias Blessing’s advice:

I shake my camera a lot. I don’t know how I do it—I . . . shake it [looks into her mobile camera while gently shaking it] . . . like, really softly, but it looks like, bah! [Looks into camera again, moving camera sharply]—and it turns out good!

In another video, Baby Ariel (2016) teaches her mother to make a musical.ly video. Here, she explains in detail how to move the camera either in the opposite direction of the hand or following the hand.

Camera movements are, as these tutorials show, an important part of the musical.ly aesthetic, but appear not to signify directly on their own, as many of the hand motions do. Instead, the camera movements “look good,” as the tutorial makers say, often visibly searching for more specific words as they speak but ending up with “good.”

This ability to make it “look good” clearly involves a fair bit of skill and experience. It is not a skill that can be learnt simply from watching other videos or tutorials, or by your friend showing you. You need to practice a lot in order to succeed. Baby Ariel’s video of her mother learning to use the movements demonstrates this by showing how inept her mother’s first attempts are, and by showing that although the mother improves with practice, she still lags far behind her daughter’s skill level.

The skill required to understand hand signs on musical.ly is not only on the part of the performer (the lip-syncer) but also on the part of the viewer of the lip-syncing video, who has the pleasure of interpreting the hand signs. In this sense, the hand signs can be seen as a puzzle, or more directly, a rebus, as journalist Clive Thompson (2013) has argued of emoji. Thompson notes that the uninitiated often find emoji to be annoying and argues that this annoyance or disdain also applies to rebuses, citing linguist Michael J. Preston, who in a paper on rebuses and graphic riddles wrote that “Just as a pun is conventionally met with a groan, the rebus is often acknowledged by a statement of disdain, unless, of course, one knows a rebus or two and can respond in kind” (Preston, 1982, p. 119). In their paper on emoji, Stark and Crawford (2015) reiterate Thompson’s point. This disdain is similar to the reception lip-syncing videos receive when spread beyond specialized sites like musical.ly. Luckily for lip-syncers, those who would disdain the puzzles of hand signs do not tend to spend much time on musical.ly, though no doubt, as lip-syncing becomes more mainstream, it will be mocked more, following the pattern we already know from blogs, selfies, and other forms of self-representational digital culture (Burns, 2015; Rettberg, 2014, pp. 17–19).

The History of Pictograms in Online Communication

The term remediation comes from Jay Bolter and Richard Grusin’s book of the same name (Bolter & Grusin, 1999). They discuss how every new medium remediates older media by paying homage to, rivaling, and refashioning them. Following this, I am using the term to argue that new modes of communication remediate old modes of communication. Personal letters remediated oral communication, email remediates personal letters, and so on. Non-standard punctuation and drawings were typical features of personal correspondence and informal letters and notes to friends and were used as a way of “encoding affect in written code” (Kataoka, 1997, p. 105). Italics, capitalization, repetitions, and ellipses are also examples of how writers encode affect and the immediacy of an oral communication situation into writing (Lakoff, 1982). In this way, informal writing remediated oral conversation. Email writers used the same techniques and developed new ones, adapting typography to create emoticons like the smiley face :-) and other combinations of punctuation marks (Rezabek & Cochenour, 1998). Toward the end of the first decade of the 21st century, emoji were introduced, and have largely, though not entirely, supplanted emoticons. Emoji began as proprietary symbols on specific Japanese platforms, but in 2010, emoji were added to version 6.0 of the Unicode Standard, meaning that they work the same as letters in the alphabet or punctuation marks, and can be represented in many different fonts and on different platforms. Many emoji remediate emoticons, like the smiley face ☺, but there are also many new emoji, some referring directly to a physical object, like the coffee cup, and others gaining different meanings in different communities (Miller et al., 2016).

Around the same time, animated gifs became an increasingly popular addition to textual communication. The use of animated gifs mixed into verbal communication is a phenomenon that can be seen as a bridge between emoji and hand signs in musical.ly videos. Response gifs are often gestural and are used as supplements to and comments on verbal communication, for instance, in messaging or on Tumblr, and they add a layer of affect to the verbal text (Miltner & Highfield, 2017; Highfield & Leaver, 2016; Tolins & Samermit, 2016). Although new response gifs are constantly being created, response gifs are also becoming more codified as platforms allow users to search for gifs by typing in the emotion they want to express.

Until recently, online communication was primarily written. Images and other visual aspects were secondary; they complemented, responded to, or expanded upon the written text. Platforms like YouTube, Snapchat, Instagram, and musical.ly invert this relationship between text as dominant and image as complementary by making the visual primary. Snapchat is interesting in this context, because it is not primarily verbal, but emoji are still heavily used as overlays to images and videos, or as part of a written caption superimposed on the image. Figure 3 shows two examples of how emoji are used on Snapchat. On the left, we see how the social media celebrity DJ Khaled uses emoji in his snaps much as he does in his tweets. He also often posts images with a caption containing no text, only one or more emoji. He rarely uses emoji to augment the image directly, as the Norwegian social media influencer Supermarie does in the snap on the right. Here, she uses several different techniques to augment her image. A selfie lens gives her face an animal nose and ears. She has added a caption, stating (in Norwegian), “I AM TOO YOUNG TO DIE” all in capital letters, but followed by three laughing emoji (“face with tears of joy”). She has also used emoji outside of the caption field: there are five “fire” emoji placed to the upper left of her face. Taken together, this is a complex multimodal message suggesting perhaps that she is exhausted, or that something difficult has happened, but also that she has a sense of humor about it. Notice too that both DJ Khaled and Supermarie have included emoji in their user names.

Figure 3.

On the left, a snap by the celebrity DJ Khaled (@DJkhaled) shows a photograph of a necklace, with a caption including two key emoji and two praying hands emoji. On the right, a snap by Norwegian @supermarie showing her with closed eyes and a selfie lens filter on her face, a caption stating “I am too young to die” with three “face with tears of joy” emoji, and five fire emoji in varying sizes positioned above her face. Screenshots from Snapchat 9 December 2016.

Musical.ly videos do not include writing, but there is verbal content: the words sung that the muser lip-syncs to. Lip-syncing is of course not only a digital activity. In addition to being a long-established amateur pastime, lip-syncing has a strong tradition in drag and queer culture (Kaminski & Taylor, 2008). While gestures and facial expressions have of course always been integral to lip-syncing and miming, they have not typically been used as on musical.ly, where hand signs signify specific meanings that are recognized by a wide community. However, gestures are common in oral, face-to-face communication, and there are many gestures that have fixed meanings that can be codified. While sign language for the deaf is the most obvious example, it differs substantially from the hand signs used in musical.ly because it is an independent language where gestures are the primary mode. To understand hand signs in musical.ly as pictograms that support and expand upon another dominant mode of communication, it is more helpful to consider them in relation to informal conversational gestures and to the codification of gestures in oration and theater.

Communicative Gestures

Gestures are fundamental to human communication. Most humans can manage some basic communication without words, for instance, when trying to buy an item in a shop in a country where we do not know the language.

Gestures are often made unconsciously and may still communicate. We may slump when we are feeling dejected or raise our eyebrows slightly upon seeing somebody we like. In his book Bodytalk, Desmond Morris catalogues many gestures, both those made almost unconsciously and those made deliberately, noting their meanings and the localities where they are used. For instance, Morris lists standing arms akimbo (“hands on hips so the elbows protrude from the body”) and explains that this gesture is used globally to signify “Keep away from me!” and in Malaysia and the Philippines can also be used to signify seething rage (Morris, 1994). Some gestures are always made deliberately, like the “rude finger” or digitus impudicus, which is also an example of how a gestural sign can have extraordinary longevity: Aristophanes punned upon it in The Clouds, written in 423 BC. There is documentation of its use in Roman times, and it is still common today (Robbins, 2007).

Sign languages can also be complete languages, with morphologies and grammatical rules of their own. Before developing spoken language, early Homo sapiens used gestures to communicate, possibly with quite complex syntax (Corballis, 2002). We know that structured sign languages existed well before modern times to complement or replace speech, such as the Plains Indian Sign Language, which the Arapaho, Cheyenne, Lakota, Blackfoot, Comanche, Paiute, and Crow peoples in North America used to communicate across different language communities and for ritual purposes (Carayon, 2016). The most common formalized sign languages today are sign languages for the deaf, which began to be formalized in the 18th century (Knowlson, 1965) and are now used all over the world as national sign languages with complete grammars and morphologies (Stokoe, 2005, original publication 1960). Semaphore (which can be used with hands as well as flags) and Hindu dance gestures are other examples of specialized gestural languages.

On musical.ly, lip-syncers generally use signs that are already common in everyday life, such as holding up fingers to signify a number, or signs that mime a meaning, much as we do when trying to communicate without words in a face-to-face situation. For instance, if we were trying to express that we were thirsty or to offer somebody a drink, without using words, we might curl the fingers as though holding a glass or bottle, and move this in a “drinking” gesture near the mouth to mime the motion of drinking. If we were struggling to hear somebody in a crowded nightclub, we might cup a hand to our ear and lean forward. Signs like these are equivalent to pictograms in writing: they visually mimic that which they represent, much as emoji do.

Multimodality: Words, Images, and Gestures on a Screen

With touch screens, we have become increasingly accustomed to using gestures as codes or signs that have very clear meanings. We shake our phones to undo a mistake and swipe down when watching a story on Snapchat to go back to the overview. Sometimes these gestures of interaction go beyond mere navigation and can be seen as what Ian Bogost (2007) calls procedural rhetoric: “persuading through processes in general and computational processes in particular” (p. 3), as has been argued of the swipe left to dismiss an unpleasing suitor on Tinder (David & Cambre, 2016). Cultural practice is evolving to incorporate these new modalities, both in vernacular communication and artistic genres, but we still lack a sophisticated rhetorical understanding of gestures of interaction (Verhulsdonck & Morie, 2009).

In her analysis of kinetic poetry and other forms of electronic literature that use the affordances of digital media, Alexandra Saemmer (2013) proposes that we consider combinations of gestures and words or images to be pluricode couplings, which “involve two different semiotic systems, a text and an icon, within the same active support of the sign.” In the works of electronic literature that Saemmer discusses, the gestures can either be movements the reader must make in order to access the poem (touching the screen, clicking a link, dragging a word) or movements made by the words and letters on the screen. Semioticians have used the term multimodal for some time to describe a text that uses several modalities, for instance, an ad in a magazine that uses both text and image, or a movie, where moving images are combined with sound (Kress, 2010).

In a musical.ly lip-syncing video, I propose that we think of the hand signs as an independent modality. Thus, we can identify at least four distinct modalities: the music, the lyrics (the words sung being a linguistic modality), the moving image, and the hand signs.

The image in a magazine ad illustrates the text and expands the meanings and connotations of the ad as a whole. The relationship between the lyrics and the hand signs in most musical.ly lip-syncing videos is even closer than the relationship between image and text in an ad. Rather than expanding the meaning, the two modes of words and signs repeat each other. In Alexandra Saemmer’s words, there is a coupling between word and hand motion. The word “love” and the hand held in the shape of half a heart against the chest are a pluricode coupling. In a sense, then, the hand motions are redundant. They simply repeat the meaning that is already conveyed in the lyrics.

Lip-Syncing as Performance

If we instead think of the hand signs as an enactment or a performance of the lyrics rather than as a representation in itself, the apparent redundancy makes more sense. Young people have listened to and performed the popular music they love for generations. Pre-teen girls in particular perform, adapt, and share the music they love privately and with friends (Baker, 2001). The mobile phone is a perfect companion to such bedroom culture (Kearney, 2007), as it is both an intimate device and a communication device. Pre-teen culture can also affect adult culture, as Kyra Gaunt (2012) has shown in her study of how pre-adolescent girls’ game-songs have influenced hip hop music.

What has changed with apps like musical.ly is (1) the scale of the audience with which young people can share their performances (potentially to the world, not just their classmates and neighbors); (2) the scale at which they can access other versions of performances of the same music; and (3) the fixity of the media they use, which allows them to record and edit their performances far more easily than earlier generations could. It is not surprising that a global platform for sharing the kind of performances of popular music that tweens and teens for so long have engaged in would be hugely popular.

A lip-syncing video with hand movements can be viewed as a performance of a set text, much as a musician plays a tune from sheet music, an actor acts from a script, or a parent sings a popular lullaby. Certainly, interpretation and individual choices are involved, but these performances can generally be recognized as performances of an already established text. The musical.ly app encourages this by allowing users to see many performances of the same song. By tapping the small circle in the lower right-hand corner when watching a video, we arrive at a page that shows information about the song that was performed, with thumbnail videos of other performances of the same song (see Figure 4).

Figure 4.

The page for an individual song includes moving thumbnail videos of its most popular or most recent lip-syncs beneath. The top three shown here are musical.ly celebrities @lisaandlena, @babyariel, and @mackenzieziegler, who all have several million followers.

This availability and even celebration of multiple interpretations of a set text is an established pattern of the internet today. The set text need not be an image or a song. Knitting is another creative, everyday activity that has been revolutionized by the Internet. Like lip-syncers, who used to perform in their bedrooms or on the playground, knitters used to knit at home or with close friends. With platforms like Ravelry.com, knitters can share their projects with a far greater audience and connect images and descriptions of each sock or sweater they knit to the patterns the items are based on. This creates a database where you can often see several hundred different ways in which a single pattern has been worked up, and you can compare how various knitters have adjusted the pattern to their liking by using different colors, different yarns, or changing the pattern. Before the Internet, in the age of mass media, knitters usually only had access to a few local friends who knitted, and to the official patterns published by yarn companies and sold in yarn shops. With the Internet, creativity blossomed in knitting, not least because of the possibility of seeing hundreds of different ways that a single pattern could be knitted. Memes are another creative area where we see many interpretations of an original, and where databases, such as Knowyourmeme.com, allow users to browse through catalogues of variations on a shared original. These modes of creativity encourage a way of thinking in which the original pattern or meme is not something to be revered, but something to be developed and made one’s own.

On musical.ly, creativity is similarly encouraged by the constant invitation to create your own version. When you watch somebody else’s video, you have the option to “start duet now!” which means making your own recording of the same song, which musical.ly will edit together with the first one you saw to create a duet. Or you can click the circle in the lower-right corner to see information about the song, other examples of videos made for it, and a tempting, bright yellow button labeled “shoot now!” inviting you to create your own version. This is an environment designed for remaking and adapting other people’s content, and thus an ideal environment for the rapid development of a shared gestural language.

Chironomia: Codified Gestures in Performance

Gestures have been studied for their role in supporting verbal communication since the ancient Romans, at least, when Quintilian and others described how orators used gestures and bodies to strengthen their message. Cicero coined the term chironomia in his De Oratore (55 BC) for the study of non-verbal communication through hand and arm gestures that accompany speech (Verhulsdonck & Morie, 2009).

In the 17th and 18th centuries, scholars studied and analyzed gestures of actors and produced dictionaries and systems for understanding these gestural languages, describing the field as chironomia or chirologia (Austin, 1806; Bulwer, 1644). These codified gestural languages for theater are in many ways similar to the hand signs on musical.ly. Some of the same gestures are used in musical.ly videos, such as the first gesture shown in Figure 5, a finger held up to the lips to signify silence, documented in Andrea de Jorio’s (1832) book on everyday Neapolitan gestures. The theatrical gestures described by Austin and Bulwer were specifically for actors to use to accompany spoken words in a performance. They are an encoding of affect, similar to emoji, emoticons, response gifs, and hand signs on musical.ly.

Figure 5.

These Neapolitan gestures are from a 19th-century book (de Jorio, 1832, p. 427) and signify the following: 1. silence; 2. no; 3. beauty; 4. hunger; 5. to mock; 6. weariness; 7. stupid; 8. squint; 9. to deceive; 10. cunning.

Such formalized gestures fell out of fashion in theater with the realism of the 19th century. Giorgio Agamben argues in his “Notes on Gesture” (Agamben, 2000) that it was photography that killed gestures, not simply in theater but in society in general, by freezing them in time. “By the end of the nineteenth century,” Agamben writes, “the Western bourgeoisie had definitely lost its gestures” (p. 49), and in silent movies and a few other genres, “humanity tried for the last time to evoke what was slipping through its fingers forever” (p. 53). Agamben sees photography as killing gestures by locking them into still slices of time: “images are the reification and obliteration of a gesture (it is the imago as death mask or as symbol)” (p. 54). Cinema, on the other hand, has its center in the gesture, Agamben writes. While this is a seductive line of argument, I must point out that a freezing of gestures was achieved by the detailed drawings of gestures in books on chironomics (as shown in Figures 5 and 6) well before photography was invented. The urge to freeze gestures and categorize them may be more related to the Enlightenment desire for categorizing the world with dictionaries and encyclopedias than to the 19th-century technology of photography, although it certainly became more ubiquitous with photography. Perhaps it is also significant that photography froze the actual gestures of individual people, whereas the drawings of earlier years were more generalized and abstract.

Figure 6.

A chart from John Bulwer’s (1644) book Chirologia, showing different hand gestures to be used in oration, with their meanings.

Different communication technologies emphasize or obscure gestural communication. Photography and drawings freeze gestures. The predominantly written communication of the early years of the Internet hides gestures altogether, although people tried to reinscribe gestures into their writing through emoticons, non-standard punctuation and spellings, and verbal descriptions. Smartphones with cameras and increased broadband access made images central to social media from around 2008 and onwards (Rettberg, 2014, p. 3), but following Agamben, these still images would only have frozen gestures into death-masks. However, video-based communication like musical.ly makes gestures central again.

Agamben sees cinema as a technology that rekindled gestures in our culture. Silent cinema in particular required large and expressive gestures, as it had to do without sound, and verbal language was limited to short captions (not unlike Snapchat). Now, when video is gaining importance in online communication, it makes sense that gestures are also regaining prominence. Importantly, gestures on musical.ly are codified, deliberate representations rather than the largely unintended, or at least unplanned, gestures of face-to-face communication. Like emoji and response gifs, and like the codified and exaggerated gestures of oration, pre-realistic theater and silent cinema, they are codified versions of the gestures we use in face-to-face communication. Hand signs on musical.ly are not a language intended to replace verbal communication, like sign languages for the deaf or semaphore. Hand signs on musical.ly augment verbal communication, lending emphasis to the words that are sung in the music, providing anchorage as Roland Barthes (1977) might have said (pp. 35–37). They are intended to enhance performance, just as the hand gestures catalogued by Bulwer were intended to be used by actors and orators performing to an audience.

Conclusion

Hand signs, emoji, and other pictograms are the current instantiations of a tendency in human communication that can be traced back to the orators in Ancient Rome. When ancient orators or 18th-century actors stood on a stage, and were only seen at a distance by their audience, they compensated by developing a set of codified gestures to amplify and interpret the words they spoke. In musical.ly lip-syncing videos, we see the muser up close but on a tiny screen, and we do not hear their voice. They compensate by developing a set of codified hand signs to amplify and interpret the words sung in the song they are lip-syncing. When people write online, their tone of voice and body language are not available to their readers. They compensate by developing a set of codified emoji to amplify and interpret the words they type.

Until now, we have thought of emoji and other forms of non-standard punctuation that are used to enhance writing in digital media as a way of adding, to a text-only medium, the aspects of non-verbal speech that are necessary to conversation. But if, as I have argued, hand signs on musical.ly are analogous to emoji in texts, we need to rethink the way we understand not only hand signs but also emoji. Emoji are simply one of many possible kinds of pictogram that have developed and will be developed as part of online communication. If emoji developed to augment textual communication, and hand signs are developing to augment video-based communication, we could speculate that other codified sign systems will develop when other modalities become dominant, at least if communication is limited, as in text or on musical.ly, so that only some modalities are available. Both emoji and hand signs are remediations (Bolter & Grusin, 1999) of the gestural communication that is so fundamental to human conversation. As video and other visual forms of communication become more common online, we will continue to see how the human need for gestures leaks into digital communication.

Appendix

List of Selected Hand Signs Commonly Used on Musical.ly With Their Meanings

Bed/sleep/night	Tilt head, leaning against hand, palm away from head
Cold	Wrap hand around self, shake camera in shivering motion
Come here	Beckon with index finger
Cry/tear	Touch finger to cheek, pull down as though a tear is running down cheek
Die/death	Make a gun from thumb and two pointed fingers and point it at your head, or hold hand horizontal and slash across throat
Drink	Hold hand to mouth and move as though drinking from cup or bottle
Drive	Hand on imagined steering wheel, move back and forward
Face	Hand moves down against side of face, palm toward face
Half/middle	Hold hand flat and vertical, 90° to face, thumb toward face, little finger to camera, and (optionally) move up or down, moving camera in opposite direction
I/me	Thumb points to self
Lie	Fingers up, palm toward face, move hand down in front of face while wiggling fingers as though in intricate pattern
Look	Hand held horizontally above eyes
Love	Make the shape of half a heart with your fingers and thumb
Money	Rub fingers together as though holding paper money between them
No	Shake head or hold index finger up and shake back and forward
Numbers	Fingers held up, palm facing self
Peace	Peace sign (two fingers held up, palm facing camera)
Pray/hope/miracle	Same as “half/middle,” although here the reference is to two hands held palm to palm as though in prayer
Run/leave/go	Move fingers as though running
Sing/talk/say	Hold hand in front of mouth and move thumb against fingers to mimic a mouth opening and closing to speak
Stressed out	Hands against top of head, upset expression on face
Take	Begin with outstretched open-palmed hand, move toward self while closing hand
Think/wonder	Tap one or more fingers lightly against side of head
Time	Hold forearm horizontal and look at wrist as though looking at a watch
You	Index finger points to camera

Acknowledgements

The author thanks Aurora Goga for her hand-drawn renditions of the lip-syncing video shown in Figures 1 and . An early version of the paper was presented at the workshop “Social Media as Semiotic Technology”, organised by Søren Vigild Poulsen and Gunhild Kvåle at the University of Southern Denmark. The author is grateful for feedback and encouragement from participants in the symposium, and for feedback from Social Media + Society’s peer reviewers.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research for this paper was done as part of the author’s job at the University of Bergen, for which she received a salary. The final revisions were completed while the author was a visiting scholar at MIT, with a stipend from the Meltzer Foundation.

Author Biography

Jill Walker Rettberg (PhD, University of Bergen) is professor of Digital Culture at the University of Bergen, and a Visiting Scholar at Massachusetts Institute of Technology in 2017. Her research interests include digital art and narrative, self-representation in social media, and the cultural effects of machine vision.

References

Agamben (Ed.). (2000). Notes on gesture. In Binetti & Casarino (Trans.), Means without end: Notes on politics (pp. 49-62). Minneapolis: University of Minnesota Press.

Austin (1806). Chironomia; or, a treatise on rhetorical delivery (Printed for T. Cadell and W. Davies). Retrieved from http://archive.org/details/chironomiaoratr00austgoog

Baby Ariel. (2015). Musical.ly tutorial [YouTube video]. Retrieved from https://www.youtube.com/watch?v=EQq0FLlMuQE

Baby Ariel. (2016). Teaching my mom how to make a musical.ly. Retrieved from https://www.youtube.com/watch?v=X0FjhRqW8iU

Baker (2001). “Rock on, baby!”: Pre-teen girls and popular music. Continuum, 15, 359-371. doi:10.1080/10304310120086830

Barthes (1977). Image music text (Heath, Trans.). London, England: Fontana Press.

Bogost (2007). Persuasive games: The expressive power of videogames. Cambridge, MA: MIT Press.

Bolter, J. D., & Grusin (1999). Remediation: Understanding new media. Cambridge, MA: MIT Press.

Bulwer (1644). Chirologia, or, The naturall language of the hand composed of the speaking motions, and discoursing gestures thereof: Whereunto is added Chironomia, or, The art of manuall rhetoricke, consisting of the naturall expressions, digested by art in the hand, as the chiefest instrument of eloquence, by historicall manifesto’s exemplified out of the authentique registers of common life and civill conversation: With types, or chyrograms, a long-wish’d for illustration of this argument. London, England: Tho. Harper. Retrieved from https://archive.org/details/gu_chirologianat00gent

Burns (2015). Self(ie)-discipline: Social regulation as enacted through the discussion of photographic practice. International Journal of Communication, 9, 1716-1733.

Carayon (2016). “The Gesture speech of mankind”: Old and new entanglements in the histories of American Indian and European sign languages. The American Historical Review, 121, 461-491. doi:10.1093/ahr/121.2.461

Cellucci (2016). MUSICAL.LY marco cellucci [YouTube video]. Retrieved from https://www.youtube.com/watch?v=UkGHexIL7y8

Chandler & Munday (2011). Pictogram. In Chandler & Munday (Eds.), A dictionary of media and communication. Oxford, UK: Oxford University Press. Retrieved from http://www.oxfordreference.com/view/10.1093/acref/9780199568758.001.0001/acref-9780199568758-e-2045

Corballis, M. C. (2002). From hand to mouth: The origins of language. Princeton, NJ: Princeton University Press.

David & Cambre (2016). Screened intimacies: Tinder and the swipe logic. Social Media + Society. doi:10.1177/2056305116641976

De Jorio (1832). La mimica degli antichi investigata nel gestire napoletano (Gesture in Naples and Gesture in Classical Antiquity). Napoli, Italy: Dalla stamperia e cartiera del Fibreno.

Gaunt, K. D. (2012). Girls’ game-songs and hip-hop: Music between the sexes. Parcours Anthropologiques, 8, 97-128. doi:10.4000/pa.116

Goffman (1959). The presentation of self in everyday life. New York, NY: Anchor Books.

Herrman (2016, September 16). Who’s too young for an app? Musical.ly tests the limits. The New York Times. Retrieved from http://www.nytimes.com/2016/09/17/business/media/a-social-network-frequented-by-children-tests-the-limits-of-online-regulation.html

Highfield & Leaver (2016). Instagrammatics and digital methods: Studying visual social media, from selfies and GIFs to memes and emoji. Communication Research and Practice, 2, 47-62. doi:10.1080/22041451.2016.1155332

Kaminski & Taylor (2008). We’re not just lip-synching up here: Music and collective identity in drag performances. In Reger, Myers, D. J., & Einwohner, R. L. (Eds.), Identity work in social movements (pp. 47-75). Minneapolis: University of Minnesota Press.

Kataoka (1997). Affect and letter-writing: Unconventional conventions in casual writing by young Japanese women. Language in Society, 26, 103-136.

Kearney, M. C. (2007). Productive spaces: Girls’ bedrooms as sites of cultural production. Journal of Children and Media, 1, 126-141. doi:10.1080/17482790701339126

Knowlson, J. R. (1965). The idea of gesture as a Universal Language in the XVIIth and XVIIIth Centuries. Journal of the History of Ideas, 26, 495-508. doi:10.2307/2708496

Kress, G. R. (2010). Multimodality: A social semiotic approach to contemporary communication. Abingdon, UK: Taylor & Francis.

Lakoff, R. T. (1982). Some of my favorite writers are literate: The mingling of oral and literate strategies in written communication. In Tannen (Ed.), Spoken and written language: Exploring orality and literacy (pp. 239-260). Norwood, NJ: Ablex Publishing Corporation.

Miller, Thebault-Spieker, Chang, Johnson, Terveen, & Hecht (2016). “Blissfully happy” or “ready to fight”: Varying interpretations of emoji. Paper presented at International Conference on Web and Social Media (ICWSM), Cologne, Germany. Retrieved from http://www-users.cs.umn.edu/~bhecht/publications/ICWSM2016_emoji.pdf

Miltner, K. M., & Highfield (2017). Never Gonna GIF You Up: Analyzing the Cultural Significance of the Animated GIF. Social Media + Society, 3. doi:10.1177/2056305117725223

Morris (1994). Bodytalk: The meaning of human gestures. New York, NY: Crown Trade Paperbacks.

Nigerias Blessing. (2015). Musical.ly tutorial 30+ HAND MOTIONS & EXPLANATIONS [YouTube video]. Retrieved from https://www.youtube.com/watch?v=2ICGnFAHePQ

Perez (2017, August 23). Musical.ly’s redesign adds video recommendations, new user profiles. Retrieved from http://social.techcrunch.com/2017/08/23/musical-lys-redesign-adds-video-recommendations-new-user-profiles/

Preston, M. J. (1982). The English literal rebus and the graphic riddle tradition. Western Folklore, 41, 104-138. doi:10.2307/1499785

Rettberg, J. W. (2014). Seeing ourselves through technology: How we use selfies, blogs and wearable devices to see and shape ourselves. Basingstoke, UK: Palgrave Macmillan.

Rezabek & Cochenour (1998). Visual cues in computer-mediated communication: Supplementing text with emoticons. Journal of Visual Literacy, 18, 201-215. doi:10.1080/23796529.1998.11674539

Robbins, I. P. (2007). Digitus impudicus: The middle finger and the law (SSRN Scholarly Paper No. ID 982405). Rochester, NY: Social Science Research Network. Retrieved from https://papers.ssrn.com/abstract=982405

Robehmed (2017). From musers to money: Inside video app musical.ly’s coming of age. Retrieved from https://www.forbes.com/sites/natalierobehmed/2017/05/11/from-musers-to-money-inside-video-app-musical-lys-coming-of-age/

Saemmer (2013). Some reflections on the iconicity of digital texts. Language & Communication, 33, 1-7. doi:10.1016/j.langcom.2012.10.001

Schotpoort (2017a). Musical.ly na doen van fans: Musical.ly #3—nina schotpoort. Retrieved from https://www.youtube.com/watch?v=a_gg98UMUWw

Schotpoort (2017b). Musical.ly uitleg: musical.ly #1—nina schotpoort. Retrieved from https://www.youtube.com/watch?v=q5nqBg5rVxE

Sherman (2017, July 18). Musical.ly sensation Baby Ariel on handling sudden fame and her favorite artist to lip sync. Retrieved from http://www.papermag.com/musical-ly-star-baby-ariel-on-handling-sudden-fame-and-her-favorite-ar-2461469565.html

Stark & Crawford (2015). The conservatism of emoji: Work, affect, and communication. Social Media + Society. doi:10.1177/2056305115604853

Stokoe, W. C. (2005). Sign language structure: An outline of the visual communication systems of the American deaf. The Journal of Deaf Studies and Deaf Education, 10, 3-37. doi:10.1093/deafed/eni001

Thompson (2013). The prehistory of emoji. Womanzine. Retrieved from https://issuu.com/lindseyweber5/docs/emoji_by_womanzine

Tolins & Samermit (2016). GIFs as embodied enactments in text-mediated conversation. Research on Language and Social Interaction, 49, 75-91. doi:10.1080/08351813.2016.1164391

Verhulsdonck & Morie, J. F. (2009). Virtual chironomia: Developing standards for non-verbal communication in virtual worlds. Journal for Virtual Worlds Research, 2, 1-10. doi: https://doi.org/10.4101/jvwr.v2i3.657