Abstract
This article argues for the conceptualization of fAIce communication as the modus operandi of facial recognition. From apps that claim to determine a person’s trustworthiness, recruiting technology that analyses candidates’ job fitness, through to banks using iris scanning to replace debit cards, facial recognition is increasingly used to communicate information about a person’s identity and personality. Faces communicate and have increased value. Knowing more about how their communicative capacity is effectuated and materialized in contemporary machine culture is thus of heightened importance. The article asks how we might come to think of the communicative capacities of faces in applications of AI, and how their role in current biometric systems may contribute to reconfiguring our understanding of what communication is. In an age of algorithmic and automated systems that are not primarily driven by overt messages purposefully crafted by humans but by machines reassembling data traces into forms of meaningfulness, faces are no longer (if they ever were) meaningful only for humans. This article ultimately makes the case for conceptualizing the communicative potential of faces in machine culture in terms of what I term algorithmic face-work, or more colloquially, fAIce communication.
Keywords
Introduction
From apps that claim to determine a person’s trustworthiness, recruiting technology that analyses candidates’ job fitness, through to banks using iris scanning to replace debit cards, facial recognition is increasingly used to communicate information about a person’s identity and personality. Faces communicate and have increased value. Knowing more about how their communicative capacity is effectuated and materialized in contemporary machine culture is thus of heightened importance. In communication theory, the face is often seen as the originary mode of communication. How the face communicates within machine culture, however, is yet to be fully understood. This is not to say that the face has not been thoroughly studied or theorized. Quite the contrary.
Faces figure prominently in a wide range of disciplines, from art history to photography, film studies, security studies, criminology, robotics and psychology. Herein, conceptualizations of the communicative role of faces have largely been construed within the context of human face-to-face communication. Building on these existing conceptualizations, the article makes a case for the need to understand how faces communicate beyond the framework of human face-to-face communication. In an age of algorithmic and automated systems that are not primarily driven by overt messages purposefully crafted by humans but by machines reassembling data traces into forms of meaningfulness, faces are no longer (if they ever were) meaningful only for humans. In order to understand the role and importance of faces within machine culture, facial recognition and mediated communication, this article argues for a renewed conceptualization of the face as part of human–machine communication.
What is of interest in this article is how we might theorize the place of the face within algorithmic machine culture, in light of its techno-cultural history. The aim is twofold: On the one hand, the article seeks to locate a notion of communicative faces in a much longer history of theorizing the representational, expressive and relational capacities of faces, both within and beyond media and communication. On the other hand, this article asks how, given the background of this history, we might come to think of the communicative capacities of faces in applications of AI, and how these new uses in current biometric systems may contribute to reconfiguring our understanding of what communication is.
The article is structured as follows: First, it revisits how faces have previously been theorized within the discipline of media and communications. The purpose is to show how faces – something we take more or less for granted – are assigned a powerful and meaningful role in the shaping of interpersonal and mediated forms of human communication. The article then proceeds to consider the face within machine vision specifically. In the second part of the article, I consider faces within a longer techno-cultural history of facial recognition systems, starting with the role of photography and physiognomy, through to contemporary machine learning systems. In the final part of the article, the argument is made that human and machinic ways of understanding faces are not as separate as they first may seem. When considering the machinic assemblages of facial recognition today, what becomes evident is how the social, representational and affective conceptions of the face and the mathematical and computational notion of the face need to be understood as part of the same operational system. In a time when our digital photographs, profile pictures and selfies are no longer merely representational tokens in the shaping of interpersonal relations, but serve a performative and productive function in the feedback loops of machine learning, there is a need to rethink how, why and for whom faces communicate. To do this, this article ultimately makes the case for conceptualizing the communicative potential of faces in machine culture in terms of what I term algorithmic face-work, or more colloquially, fAIce communication.
Finding faces in communication theory
Facial expressions
While communication theory has traditionally privileged speech and writing over other modalities of communication, faces play a particularly important role in research on nonverbal communication. Broadly conceived, nonverbal communication is any form of communication that uses means other than language (Manusov and Patterson, 2006). Scholarship on nonverbal communication looks at, amongst other things, the communicative potentials of body movements (kinesics), touch (haptics) and facial expressions (a form of kinesics) (Ekman and Friesen, 1975; Mehrabian, 2017). Nonverbal communication also encompasses the communicative potentials of material things such as clothing and textile (physical appearance), or the communicative role of absences and silence (part of what is known as vocalics). In terms of machine culture, nonverbal communication plays an important role in the field of human–robot interaction (HRI). Here, the appearance and expressive potentials of a robot’s face are an active area of study. What is typically of interest is the robot’s capacity to ‘read’ human faces and vice versa (Admoni and Scassellati, 2017; Song and Luximon, 2020).
In both human–human and human–robot interaction, affective facial expression is a key feature of nonverbal communication. Much of the existing work in these fields draws on the influential theories of psychologist Paul Ekman and colleagues, developed during the 1970s. According to these theories, human emotions are universal and biologically determined, with the human face capable of displaying a set of a few basic emotions (Ekman, 1970; Ekman and Friesen, 1975). Ekman and Friesen’s (1978) facial action coding system (FACS) is still a widely used coding scheme for facial expressions, not least for the development of facial recognition systems (Gates, 2011). Studies that link facial expressions, gestures, posture and changes in body position to personality traits and attitudes are now widely used to legitimize automated decision-making processes in everything from human relations to insurance and financial markets.
The recruiting-technology company HireVue, for example, has come under much public scrutiny for claiming to use an AI face-scanning algorithm to find the best candidate for a job. Based on the assumption that facial expressions reveal an inner truth about a person, HireVue deploys facial detection software to rank candidates against other applicants, ‘based on an automatically generated “employability” score’ (Harwell, 2019). Interestingly, in a company blog post, HireVue explicitly acknowledges Ekman and Friesen’s taxonomy as foundational to its business idea (https://www.hirevue.com/blog/hiring/nonverbal-communication-in-interview-assessments). While HireVue announced in early 2021 that it would stop using visual analysis in new recruiting assessments, in part due to the prolonged criticism that Ekman’s universality thesis has faced from cultural anthropologists and others, the legacy of Ekman’s work still looms large in much scholarly work on robotics and facial recognition.
Mediating faces
In film and television studies, faces are above all endowed with representational and symbolic value. For Gunning (1997), the expressive human face ‘transformed cinema from a mere means of reproduction into a unique art form’. As exemplified in studies of the cinematic close-up, the mediating potentials of faces seem immense. As Doane (2003) notes, the cinematic face can be used to signify everything from individuality to specific social types, or be used to convey intersubjectivity. Not everything, however, is written on the face.
The idea that one can read faces like a book, both literally and between the lines, has long been a common trope. Television scholar Frosh (2009) draws on this apparent textuality when conceptualizing faces as both an indication of ‘a character’s feelings’ and ‘expressive of relations with the viewers themselves’ (p. 90). Taking the ‘talking heads’ of television as a case in point, Frosh considers how the human face functions as a paradoxical communicative structure. News anchors and actors in advertisements directly address the spectator through gestures and verbal address, maintaining the illusion of a parasocial interaction between the televised face and its viewers (Frosh, 2009). By way of sensitizing viewers to a plethora of different faces, Frosh endows television with the capacity to mediate between people, and to condition how we come to relate to one another on a more global scale.
In this sense, we might consider faces a form of interface or screen. As Fedorova (2020) writes, ‘the interface is a place of connection between a human and a digital system that allows them to communicate with one another in order to generate and exchange information’ (p. 3). As a point of connection between humans and a machine, faces can be considered screens of data display. Unlike the graphical user interface of the computer, which acts as a mutual exchange point of information between human and machine, the facial interface is largely one-directional. In the case of facial recognition, faces generate and provide information yet do not receive much in return. This can most readily be seen in the case of selfies.
Ever since Oxford Dictionaries named ‘selfie’ its Word of the Year in 2013, this form of digital self-portraiture has attracted the attention of researchers across a wide range of disciplines. Defined as a digital phenomenon concerned with the ‘circulation of self-generated digital photographic portraiture, spread primarily via social media’ (Senft and Baym, 2015: 1588), selfies exemplify the mediating capacities of faces. On the one hand, selfies constitute a primary mode of self-presentation in digital media. On the other hand, selfies need to be understood as a form of social practice. The role of human faces is not just confined to the obvious fact that most selfies are depictions of the face. The selfie face is also productive.
In current machine culture, selfies serve multiple functions, including: training machine learning algorithms, depositing checks via smartphones, allowing systems like Facebook to automatically recognize and tag people in photos, helping law enforcement personnel identify criminal suspects, and keeping self-driving cars from hitting people. Importantly for our discussion on machine culture, selfies are more than just a means of representing the self. As Zhao and Zappavigna (2018) point out, ‘many of the technologies for selfie augmentation rely on facial recognition technology that renders the face “readable” by machines’ (p. 675). As such, selfies fuel machine vision and increasingly work to ‘normalize biometrics and automated image manipulation’ (Rettberg, 2017).
Facing self and others
We all use our bodies, faces and physical appearances to communicate, whether intentionally or unintentionally. In Goffman’s (1959) work on the presentation of self in everyday life, he described how individuals and groups maintain and manage impressions of self in social interactions. For Goffman, social interactions are inherently uncertain. To mitigate this uncertainty, people will look for social cues, expressive gestures and status markers as part of an ongoing process of ‘impression management’ (Goffman, 1959). Yet not all impressions can be controlled. Goffman distinguishes between the expressions a person intentionally gives and those they unintentionally give off.
However, the idea of face and face-work as linked to self-image in social relationships did not emerge with Goffmanian symbolic interactionism. As Goffman (1967) acknowledges in a footnote, his conceptualization of the face is indebted to what he refers to as ‘the Chinese conception of face’ (p. 6). In a classic anthropological account of the face, Hu (1944) distinguishes between two meanings of the face in Chinese – mien-tzu, the prestige a person accrues through success and social standing, and lien, the respect of the group for a person of good moral character.
Importantly, these conceptions of face should not be understood too literally. It is not the same kind of face as studied in the linguistic and psychological studies on nonverbal communication discussed above. Rather, these sociological faces are inherently social and symbolic. As Goffman (1967) puts it: ‘the person’s face clearly is something that is not lodged in or on his body, but rather something that is diffusely located in the flow of events in the encounter and becomes manifest only when these events are read and interpreted for the appraisals expressed in them’ (p. 7).
For Goffman (1955), the face is explicitly about self-image and the ‘social value a person effectively claims for himself by the line others assume he has taken during a particular contact’ (p. 213), while the Chinese conception foregrounds the relational and networked aspect of the face.
This emphasis on the face as an encounter with others and as a relational entity is also reminiscent of Deleuze and Guattari’s (2004) notion of faciality.
Faciality can be understood as a kind of signifying power emanating from particular faces, for example, the power of the face of political leaders, or the power that the celebrity face lends to film. Again, we can see the screenic element of faces, understood as places where identity and communication are mobilized. As Chesher and Andreallo (2021) explain, ‘[f]aciality is an “abstract machine” that comes into play whenever anyone or anything encounters something that functions as a face, creating both meaning and identity, emotion and subjectivity, surface and depth, significance and subjectification’ (p. 85). The face is not a unity, despite what using the singular form might otherwise suggest, but rather a distributed accomplishment. In the next section, we will move away from the abstract machines of faciality and the ways in which faces have been theorized in media and communication research broadly construed, towards more concrete technical and historical face-aggregating and recognizing technologies.
Facing AI in automated facial recognition
Today’s social media are awash with faces. Facebook is literally built around a social directory of faces. Visually oriented platforms such as Instagram or TikTok are filled with selfies. Tinder encourages its users to choose potential love interests based on a superficial scanning of a stranger’s face. Snapchat offers face filters called ‘lenses’. Beautification apps abound. All of these applications incorporate some form of facial recognition technology. Our digital devices work as face detectors. Smartphones, such as Samsung’s premium Galaxy phones, offer both facial recognition and a fingerprint sensor to unlock the phone. Apple’s Face ID is a biometric mapping technology that allegedly turns a user’s face into a secure authentication device, which can be used to secure iPhones and authorize payments through Apple Pay.
While facial recognition technology is often linked to the realm of state control, contemporary consumer technology shows how much it has become an integral part of our everyday lives. It is precisely in the ‘“benign” and playful consumer gadget’ (Ellerbrok, 2011: 530) that the power and value of communicative faces can most readily be found today. Digital faces and the technologies used to capture, store and process them ‘play a role in altering the conditions of personal visibility, public intimacy and relationships’ (McCosker and Wilken, 2020: 37). This is to say that faces communicate.
Historical legacies
In many ways, the history of face-aggregating technologies starts in the 19th century with the linkage of the invention of photography and the scientific racism of physiognomy. A popular scientific classification tool for many of the most prominent scientists at the time, physiognomy ‘held that people’s faces bore the signs of their essential qualities and could be visually analysed as a means of measuring moral worth’ (Gates, 2011: 19). By linking the study of facial features to innate personality traits and human worth, physiognomy became a useful tool for the identification, recording and controlling of people in the name of scientific racism (Edkins, 2015). Photography played a key role in enabling these new pseudoscientific practices. With the advent of photography, faces could be concretized and made discrete. They could be catalogued, aggregated, overlaid and compared. Many of the scientists at the time developed their own photographic techniques for documenting people’s facial features and perceived abnormalities. The eugenicist Francis Galton, for example, developed a technique he called ‘composite photographs’. By superimposing photographs, Galton claimed to have revealed patterns of generic types of people: ‘poor, imbeciles, criminals, deviants’ (Edkins, 2015: 103).
Photography also became instrumental in the creation of what is sometimes referred to as the first biometric identification system. Invented by the Paris police official Alphonse Bertillon, one of Galton’s contemporaries, the Bertillonage technique consisted of a combination of anthropometric measurements and a new standard for forensic photography – better known today as the mug shot. The Bertillonage method was widely adopted by the police and judicial system as ‘a bipartite system, positioning a “microscopic” individual record within a “macroscopic” aggregate’ (Sekula, 1986: 18). This meant that the act of identification did not just depend on the individual criminal suspect, but more importantly, on cross-referencing and comparison to the archived aggregate of possible suspects. Though the Bertillonage technique was eventually replaced by the uptake of fingerprints as the dominant system of forensic identification, the principle of translating the face into numerical data for the purpose of identification and pattern recognition still remains an underlying logic of how facial recognition systems operate to this day.
The development of automated facial recognition, then, is deeply founded on a kind of biological essentialism that worked to arbitrarily divide and sort human populations based on their facial features (Stark, 2019). As Simone Browne reminds us, biometric systems have served as a means of visually classifying people, particularly in terms of imposing a category of race. For Browne (2010), these systems enact a process of ‘digital epidermalization’ that enables bodies to be encoded as data and serve as evidence and control. This encoding process is far from neutral or equal insofar as some faces communicate more clearly or more forcefully, for better or worse, to the machine.
As the numerous cases of gender and racial mis/identifications in facial recognition systems attest, some faces are consistently left out or made more or less visible compared to others (Buolamwini and Gebru, 2018). For Stark (2019), this makes facial recognition systems socially toxic, as they are ‘grounded in finding numerical reasons for construing some groups as subordinate’ (p. 53). Similar to Rettberg’s (2017) argument about selfies sensitizing people to biometric control, Stark’s (2019) work points to the often racist and discriminatory effects of seemingly cute Snapchat lenses and smartphone emojis.
Technological underpinnings
Automated facial recognition systems follow roughly the same sequence of steps. The first step is to detect a face in an image. This often happens by overlaying the face with grids, drawing boxes around it, or otherwise distinguishing a face from its background and neighbouring elements. Then, features have to be extracted and classified to create a mathematical representation of the face. These features allow the machine to match the face against corresponding images in a database or face template. Only the last step of the process entails recognizing the face, usually for the purpose of identification or verification. In essence, these kinds of biometric systems ‘see’ the face ‘as a particular configuration of surface measurements to be processed analytically and statistically’ (Pinchevski, 2016: 195).
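Schematically, the extract-and-match steps just described can be sketched in a few lines of code. The sketch below is an illustrative toy of my own, not any vendor's actual system: the 'faces' are random arrays, and the 'feature extraction' is a bare stand-in for the landmark geometries or learned embeddings real systems use.

```python
# Toy sketch of the generic pipeline: extract a numerical feature
# vector from a face image, then match it against stored templates.
# All data and function names here are hypothetical illustrations.
import numpy as np

def extract_features(face_image: np.ndarray) -> np.ndarray:
    """Reduce a face image to a normalised feature vector.

    Real systems use landmark measurements or deep embeddings; here we
    simply flatten and normalise the pixels as a stand-in.
    """
    v = face_image.astype(float).ravel()
    return v / (np.linalg.norm(v) + 1e-12)

def match(face_image: np.ndarray, templates: dict) -> str:
    """Return the identity whose stored template is closest to the probe."""
    probe = extract_features(face_image)
    return min(templates, key=lambda name: np.linalg.norm(probe - templates[name]))

# A toy 'database' of two 4x4 faces
rng = np.random.default_rng(0)
alice = rng.random((4, 4))
bob = rng.random((4, 4))
db = {"alice": extract_features(alice), "bob": extract_features(bob)}

print(match(alice + 0.01 * rng.random((4, 4)), db))  # a slightly noisy probe
```

The point of the sketch is the underlying logic noted above: identification happens not on the face itself but through comparison against an archived aggregate of templates.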
The first attempts at automating facial recognition using computers were made in the 1960s by Woodrow Wilson Bledsoe, one of the earliest pioneers of artificial intelligence research. In line with much of early computing, this research was largely funded by US intelligence services (Lee-Morrison, 2019). While the goal was to teach a computer to recognize 10 faces, in its first year of research, the team only succeeded in recognizing one face (Raviv, 2020). As Bledsoe’s research progressed, he eventually settled on a digitized system inspired by the Bertillon system discussed earlier, largely based on a data set containing all white male faces of various ages (Lee-Morrison, 2019). Technically, Bledsoe’s first attempts were modest by today’s standards and relied entirely on a human operator entering the data into the system. The first successful attempt at fully automating the process of facial recognition was made in 1973 by the Japanese computer scientist Takeo Kanade. Based on a data set of 850 digitized photographs, a rare commodity at the time, Kanade developed a face matching system based on feature recognition, without the need for any human intervention (Raviv, 2020). Yet it was not until the early 1990s that new methods were developed that did not rely on photographic portraits but were able to recognize faces in images containing other objects as well.
The break from earlier conceptions of facial recognition came with a paper by Turk and Pentland (1991), which described the eigenface method based on principal components analysis (PCA). Rather than modelling facial recognition on individual facial features and the relationships between them, as earlier research had done, the eigenface method set out to read the face holistically. That is, eigenfaces can be described as abstractions of faces broken down into the faces’ principal components. The goal with this approach was to create a minimum number of eigenfaces that could adequately represent the entire training set, and thus reduce the amount of data that had to be processed in order to detect a face. Moreover, as Lee-Morrison (2019) explains, what set the eigenface algorithmic approach apart from earlier approaches was that it produced an image as part of its algorithmic process, allowing for the first time a machine-produced vision of the face, a kind of modern-day version of Galton’s composite photographs.
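The eigenface procedure can be made concrete in a short numerical sketch. This is a minimal toy of my own, using random arrays in place of real face images: each face is flattened into a vector, the principal components of the training set are computed, and any face is then encoded as a handful of weights on those components.

```python
# Minimal sketch of the eigenface idea (after Turk and Pentland, 1991):
# faces as vectors, eigenfaces as principal components, any face as a
# small set of weights. Random data stands in for real face images.
import numpy as np

rng = np.random.default_rng(1)
faces = rng.random((20, 64))       # 20 training 'faces', each an 8x8 image flattened

mean_face = faces.mean(axis=0)
centred = faces - mean_face

# Principal components via SVD; the rows of Vt are the 'eigenfaces'
U, S, Vt = np.linalg.svd(centred, full_matrices=False)
k = 5                              # keep only the top 5 eigenfaces
eigenfaces = Vt[:k]

# Any face can now be encoded as just k weights...
weights = eigenfaces @ (faces[0] - mean_face)

# ...and approximately reconstructed from them
approx = mean_face + weights @ eigenfaces
print(weights.shape)               # 64 pixel values reduced to 5 numbers
```

This is precisely the data reduction the method was designed for: a small set of eigenfaces adequately representing the entire training set, with each reconstruction being itself a machine-produced image of a face.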
While much facial recognition today relies on deep learning and neural networks, the process of finding faces in many cases still relies on an approach for data-driven computer vision developed by Paul Viola and Michael Jones in 2001. As Leslie (2020) puts it, the Viola–Jones algorithm was ‘a bellwether in the shift of computer vision techniques from traditional, knowledge-driven and rules-based methods to data-driven machine learning approaches’ (p. 10). Viola and Jones essentially pioneered rapid face detection by proposing a classifier that could scan sub-regions of an image and identify whether each contains a face or not.
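The speed of the Viola–Jones detector rests on a data structure called the integral image: a cumulative-sum table that lets the sum of any rectangular region of pixels be read off in four lookups, so that simple rectangle ('Haar-like') features can be evaluated very quickly across an image. A minimal sketch of that core trick, written from scratch rather than taken from any library:

```python
# Sketch of the integral image underlying Viola-Jones face detection:
# any rectangular pixel sum becomes four table lookups.
import numpy as np

def integral_image(img: np.ndarray) -> np.ndarray:
    """Cumulative-sum table, padded with a zero row/column at the top-left."""
    ii = img.cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def rect_sum(ii: np.ndarray, r0: int, c0: int, r1: int, c1: int):
    """Sum of img[r0:r1, c0:c1], computed from four lookups."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)

# A two-rectangle Haar-like feature: one region's sum minus its neighbour's
feature = rect_sum(ii, 0, 0, 4, 2) - rect_sum(ii, 0, 2, 4, 4)
print(rect_sum(ii, 1, 1, 3, 3))   # equals img[1:3, 1:3].sum()
```

The full detector then cascades thousands of such features, learned from labelled examples, over every sub-window of the image, which is exactly the data-driven shift Leslie describes.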
In terms of image recognition, what is most often used today are convolutional neural networks (CNNs) that work by scanning images in small groups of pixels using filters. Each filter is a small matrix of weights, which the network slides across the image, performing a series of calculations that compare each patch of pixels against its neighbouring fields, according to a specific pattern that the network is looking for. The CNN, like any neural network, works through layers and levels of computations. The CNN knows what to look for by way of large amounts of training data and labelling. In the case of Facebook, for example, the training data and labelling of those images are in large part done by the users themselves. As Paglen (2016) writes, when you put an image on Facebook, you’re feeding its DeepFace algorithm information about ‘how to identify people and how to recognize places and objects, habits and preferences, race, class, and gender identifications, economic statuses, and much more’. This neural network reportedly ‘achieves over 97 percent accuracy at identifying individuals—a percentage comparable to what a human can achieve’ (ibid.). This accuracy rate is only achieved through a process of repeated comparison, with each round of training making the network slightly more accurate. While a CNN does well with still images, for moving images the process needs to be supplemented with a so-called recurrent neural network (RNN) in order to work with temporally sensitive models.
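What a single CNN filter computes can be shown in a few lines: slide a small matrix of weights over the image and take a weighted sum at each position. In a trained network the weights are learned from labelled data; in this toy sketch of mine a fixed edge-detecting filter stands in.

```python
# Sketch of one convolutional filter: a sliding weighted sum over the
# image. A hand-set vertical-edge filter stands in for learned weights.
import numpy as np

def convolve2d(img: np.ndarray, filt: np.ndarray) -> np.ndarray:
    fh, fw = filt.shape
    h, w = img.shape
    out = np.zeros((h - fh + 1, w - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + fh, j:j + fw] * filt)
    return out

# An image that is dark on the left and bright on the right
img = np.array([[0, 0, 1, 1]] * 4, dtype=float)

# A 2x2 filter that responds to vertical edges (right minus left)
edge_filter = np.array([[-1, 1],
                        [-1, 1]], dtype=float)

response = convolve2d(img, edge_filter)
print(response)   # strongest response where the dark/bright boundary sits
```

A CNN stacks many such filters in successive layers, so that early layers respond to edges and later layers to increasingly face-like configurations of those edges.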
What is critical to the working of deep learning models used for facial recognition are large data sets of correctly labelled faces, shown from different angles, shapes and lighting. One of the first and biggest data sets of images is ImageNet. With the help of Amazon Mechanical Turk workers, more than 14 million images have been hand-annotated to date. While ImageNet is not confined to faces, there are now many image databases dedicated to faces only. If deep learning depends on large amounts of accurately labelled images, Crawford and Paglen’s (2019) work on the politics of training data is a reminder that this ideal is not always attained. For example, in studying the image classifications of ImageNet, they found that the actor Sigourney Weaver is labelled a ‘hermaphrodite’, or that a ‘young woman lying on a beach towel is a “kleptomaniac”’ (Crawford and Paglen, 2019). As much research in fairness, accountability and transparency in machine learning has shown, when algorithms discriminate, it is often at the expense of the already marginalized: people of colour, women, LGBTQ+ communities, people with disabilities and the working class (Denton et al., 2020). Such failures, Denton et al. (2020) suggest, are often due to the ‘under-representation of these groups in the data upon which these systems are built or undesirable correlations between certain groups and target labels in a dataset’ (p. 1).
What these discussions highlight above all is the discrepancy between what the machine sees and how it is taught and made to see in particular ways. Between the rectangular shapes of the eigenface method, the convolutions of a neural network, and the politics of historically skewed and unjust data sets, where do we position the communicative face and its meaning in current machine culture? Do faces transmit information, or do they provide a symbolic representation? Do faces communicate by virtue of psychological predispositions, or as part of an intersubjective understanding? Just as there is no unified theory of communication to speak of, we cannot expect the communicative capacities of faces to follow one single line of conception. Instead, as the final section shows, how and when faces are communicating in machine culture needs to be understood as part of a more complex and ongoing process of what I call algorithmic face-work or simply fAIce communication.
Towards an understanding of fAIce communication
Facial recognition systems complicate our understanding of existing communication forms, which have mainly been conceptualized in terms of human processes. While the human face remains one of the primary modalities of communication in human communication, the human face clearly also communicates with machines.
To better understand the communicative capacities of faces in machine culture, I suggest the neologism ‘fAIce communication’ as a potentially fruitful way of conceptualizing the forms of communication characteristic of people’s oftentimes inadvertent interactions with facial recognition technologies, but also the inherent power relations at stake in these interactions. If HMC (human–machine communication) offers a new generative framework for theorizing the machine as a communicative subject, I hope the notion of fAIce communication brings to the fore a possible way of conceptualizing the faciality at play in facial recognition systems. The face, Deleuze and Guattari (2004) suggest, is not a fact but a social assemblage, defined by its relations and attachment to other people, historical settings and social situations. As such, fAIce communication can be understood as a form of algorithmic face-work insofar as it indicates the kind of work and communicative exchange at play when human faces and machines meet. Given the preceding overview of how the meaning of faces is shaped by their historical legacies, social encounters and inscriptions into technical apparatuses, the question is how those attachments come together to form a notion of fAIce communication. If faces communicate differently to humans and machines, then where do we stand with regard to the communicative capacities of faces in applications of AI?
Revealing faces
For both humans and machines, the face functions as a gateway to a more or less hidden aspect of the human. It is in this basic sense that faces communicate, or mediate. Yet for social and aesthetic approaches to faces, this mediating capacity remains at the level of metaphor. For research in nonverbal communication and machine vision, however, facial measures and expressions are treated more literally as factual gateways into a person’s apparent true self. This becomes more evident when looking at some of the current AI applications using facial recognition. Take the Israeli start-up Faception, for example, which uses machine learning to ‘score facial images using personality types like “academic researcher”, “brand promoter”, “terrorist” and “pedophile”’ (Chinoy, 2019). According to the company’s website, a ‘terrorist’ is characterized as someone who ‘suffers from a high level of anxiety and depression. Introverted, lacks emotion, calculated, tends to pessimism, with low self-esteem, low self-image and mood swing’ (Faception, 2021). Faception claims to be able to determine whether someone is a terrorist with an 80% accuracy, based on measuring the distances between different points in the face. Similar to Faception and the aforementioned HireVue, a Tokyo-based company recently pitched an app to investors called DeepScore, which is supposedly able to detect someone’s trustworthiness based on an AI reading of their face (Feathers, 2021). As with many similar claims about the epistemic power of facial analysis, faces are here simply taken at face value. In these applications faces are treated as a gateway to a supposedly true self, without the need for further explanation or contextualization. But do faces really speak for themselves?
Such physiognomic assumptions are not just a matter of peripheral start-ups, however, but part of how facial recognition is imagined and implemented more widely. The case of Cambridge Analytica showed how faces are being heavily politicized and valourized. Drawing on the research of behavioural scientists such as Michal Kosinski (Kosinski et al., 2013), who developed an earlier version of a Facebook-integrated personality app, Cambridge Analytica famously claimed to be able to reveal a person’s political orientations and other attributes, simply by analysing their facial images obtained from Facebook and other sites. More recently, Kosinski’s work on AI and mass persuasion has gained much attention for claiming to be able to predict a person’s sexual orientation based on their facial expressions.
In this study, Wang and Kosinski trained facial recognition software to distinguish between gay and straight people. After downloading over 300,000 profile pictures from dating sites and training the software on a sample of over 35,000 facial images of self-identified gay and straight individuals, Wang and Kosinski claimed the algorithm could correctly distinguish between gay and straight people 91% of the time for men and 83% of the time for women (Wang and Kosinski, 2018). Interestingly, the authors acknowledge in an appendix to the article that their work ‘presents serious risks to the privacy of LGBTQ people’ and that they debated whether or not to publish the results. Yet they reasoned that ‘the safety of gay people and other minorities hinges not on their right to privacy (which can be maliciously invaded), but on the protection of their human rights’, and they decided to go ahead with the publication.
Whether or not facial recognition classifiers can accurately score someone’s sexual or political orientation, what these examples show is how the technology and the meaning of the face can strategically be put to use for purposes other than originally intended. Much in the vein of Galton’s mission to find statistically salient types of persons that could be used to govern populations, the kind of biological essentialism at play in the design, implementation and application of facial recognition technologies may have serious consequences.
Relational faces
How and what faces communicate depends in part on the audience. When humans read faces, they primarily search for social cues. Faces become markers of identity, feelings and relationships. To the machine, if understood in the computational sense, faces are essentially a territory to be mapped and measured. While it depends on the specific algorithmic techniques used in each case, automated facial recognition works by abstraction and constant comparison. In automated facial recognition technologies, faces are not simply decoded as is, but carefully prepared to be read and identified through a number of steps. As Celis Bueno (2020) observes, this means that algorithmic facial recognition is not simply about linking a face to a private individual but using it as ‘a source of pre-personal and supra-personal information’ (p. 84). Faces do not just ‘belong to a private individual, but rather constitute a data bank of face templates and training sets’ (p. 84).
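This logic of abstraction and constant comparison can be sketched in a few lines of code. The sketch is a simplified illustration, not any deployed system: faces are assumed to have already been abstracted into short embedding vectors (real systems use vectors of hundreds of dimensions produced by neural networks), and identification amounts to comparing a probe vector against a bank of stored templates:

```python
import math

# A hypothetical 'data bank' of face templates: each enrolled face
# has been abstracted into a fixed-length embedding vector.
template_bank = {
    "id_001": [0.9, 0.1, 0.2, 0.1],
    "id_002": [0.1, 0.2, 0.9, 0.3],
    "id_003": [0.1, 0.9, 0.3, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def identify(probe, bank, threshold=0.9):
    """Compare a probe embedding against every stored template.

    Recognition here is constant comparison: the probe face means
    nothing on its own; its 'identity' is whichever template it
    resembles closely enough -- or no one, below the threshold.
    """
    best_id, best_score = None, -1.0
    for face_id, template in bank.items():
        score = cosine(probe, template)
        if score > best_score:
            best_id, best_score = face_id, score
    return (best_id, best_score) if best_score >= threshold else (None, best_score)

match, score = identify([0.12, 0.88, 0.31, 0.2], template_bank)
```

The point of the sketch is that ‘recognition’ happens entirely at the level of the template bank: the probe face has no identity of its own, only a degree of resemblance to faces already enrolled.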
We might see this logic most vividly at play in examples of facial misrecognition. When British passport services rejected a Black woman’s passport application because her mouth was allegedly open, when in fact it was closed, the system did not so much misrecognize her as an individual as produce an output on the grounds of its training set and system design. While the picture may depict her, what counts is how that picture is read and understood in relation to archived images of others. There are some interesting connections between the algorithmic statistical modelling of faces and the Chinese conception of facial relationality discussed earlier. In both machine learning and the Chinese notions of face, a face acquires its meaning relationally rather than in isolation.
In both conceptions, however, faces are less about the individual and more about their relations and positionality as part of social structures. This means that someone might lose face not as a result of their own actions but because of the actions of others. If, for Goffman, every social situation comes with the risk of losing face, this seems even more the case for algorithmically shaped digital spaces. With the quantity of facial images floating around in social media, faces are not merely at risk in discrete situations; they are perpetually exposed to the judgement of both human and machine audiences.
Missing faces
As we have seen, the work that goes into determining the system’s set-up will also affect how faces come to communicate with the machine in the first place.
Not all faces are endowed with the same capacity to communicate to and with the machine. Aging faces, dark-skinned faces, twin faces and moving faces all pose distinct challenges to the machines interpreting them. As the discussion on labelling and annotation showed, faces do not automatically communicate but need to be made to communicate in certain ways. This means that the communicative capacities of faces in machine culture depend on a number of different factors, including their prior existence in databases and training sets, the minds and consciousness of the designers making and maintaining the systems, and their differential treatment in the application and use of those systems by concrete institutions and organizations. Thus not all faces exist on equal terms. Sometimes faces have to be purposefully gathered, or their owners even paid to communicate with the machine, as in the case of Google employees targeting dark-skinned homeless people for images of their faces. Reportedly, Google employees were ‘pounding the pavement in a variety of US cities, looking for people willing to sell their facial data for a $5 gift certificate to help improve the Pixel 4’s face unlock systems’ (Hollister, 2019).
As was discussed, facial recognition systems still struggle with the variety and diversity of faces. In the quest for the best – most complete and diverse – data sets, companies go out of their way to find and store faces in order to refine their systems and maximize their value. If darker skin tones remain one particular challenge to these systems, so does age, as we have seen. In 2019, a Facebook challenge simply named the ‘10-year challenge’ went viral, prompting people to post a photo of themselves from 10 years ago alongside one from today. This quickly led people to speculate about Facebook’s real motivations for promoting the challenge, suggesting how it would undoubtedly benefit the company’s facial recognition systems. Even if Facebook claimed not to have been directly involved in creating this particular challenge, the value of such challenges and other gamified modes of playing with faces cannot be denied. Faces may be missing from the system, but they can always be added or strategically replaced, with or without the help of their rightful owners.
Concluding remarks
The ubiquitous nature of faces, both offline and online, demands a renewed attention to their communicative capacities, as this article has been arguing. The key challenge for communication scholars is to examine how the meaning of face is both configured by and configuring current machine culture in ways that do not fall back on an ontological division between human and machine as binary categories. If different disciplines, from psychology to sociology, media studies and computer science, offer distinct ways of theorizing the ontology and epistemology of faces, their combined insights may serve as a testing ground for what I have called fAIce communication.
As Deleuze and Guattari (2004) emphasize with their concept of faciality, what is at stake is the kind of signifying power emanating from particular faces. More importantly, as we discussed with regard to machine learning and relational faces, today there might even be more signifying power emanating from the amassing and assembly of faces than from the individual face. The concept of fAIce communication does not necessarily entail a person communicating with or to the machine, in the sense often conveyed by research in HMC (though this can certainly be one aspect of it). What the neologism ‘fAIce’ hints at is precisely the ways in which faces and AI become entangled. The faciality of facial recognition technologies differs from the power of the face of a political leader or celebrity in cinematic close-ups. Faces are not given, Edkins (2015) reminds us, but rather ‘exist in a particular cultural, geographical, and historical context’ (p. 3). The context of machine culture and facial recognition technologies, driven by convolutional neural networks, suggests a highly performative face. Thus fAIce both protects individuals and gives them away; fAIce is simultaneously playful and risky; and fAIce generates both profit and problems for big corporations.
Yet, as we have seen, this performativity or doing of the face is not something that necessarily lies with the individual whose face it is; it is distributed across the data sets, designers, institutions and machines through which faces circulate.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
