Abstract
Previous research has focused on documenting the perceptual mechanisms of facial expressions of so-called basic emotions; however, little is known about eye movements during the recognition of crying expressions. The present study aimed to clarify the visual pattern and the role of face gender in recognizing smiling and crying expressions. Behavioral responses and fixation durations were recorded, and the proportions of fixation counts and viewing time directed at facial features (eyes, nose, and mouth areas) were calculated. Results indicated that crying expressions were processed and recognized faster than smiling expressions. Across expressions, the eyes and nose areas received more attention than the mouth area, but for smiling facial expressions, participants fixated longer on the mouth area. Proportional gaze allocation to facial features was thus quantitatively modulated by expression, whereas the overall gaze distribution was qualitatively similar across crying and smiling facial expressions. Moreover, eye movements showed that visual attention was modulated by the gender of faces: Participants looked longer at female faces with smiling expressions than at male faces. Findings are discussed in terms of the perceptual mechanisms underlying facial expression recognition and the interaction between gender and expression processing.
Facial expressions are the main source of information when perceiving emotional states in other people (Shields et al., 2012). Reading faces and extracting emotional information quickly is a valuable skill in interpersonal interaction. To capture emotional information in faces accurately, humans have developed a capacity for the categorical perception of facial expressions (Ekman, 1993), moving their eyes to positions that maximize perceptual performance in determining the identity, gender, and emotion of faces (Peterson & Eckstein, 2012). A debated issue that has yet to be elucidated is how emotional facial expressions are identified. On one hand, the majority of research on facial expression recognition has supported the view of holistic face processing. According to this view, faces are processed as a whole unit or Gestalt, especially by Easterners, and the central fixation strategy is related to holistic face processing (Kelly et al., 2010; Pellicano & Rhodes, 2003). Along similar lines, Hsiao and Cottrell (2008) found that optimal recognition is achieved with two fixations, that performance does not improve with additional fixations, and that these fixations are distributed around the center of the nose. On the other hand, it has also been argued that faces may be processed more featurally or analytically (Tanaka & Farah, 1993). In support of this view, recognition of emotion in facial expressions has been shown to rely on individual facial features (Calvo & Nummenmaa, 2008), and the significance of specific areas differs distinctively across expressions. For example, Eisenbarth and Alpers (2011) found that in happy facial expressions the mouth region received more attention, whereas in sad and angry facial expressions the eyes received more fixations. Similarly, Beaudry et al. (2014) showed that variations occurred as a function of the facial expression: The mouth was important for the recognition of happiness and the eye/brow area for that of sadness.
In the recognition of facial expressions, gender information can be extracted without conscious awareness and is effortlessly integrated into face processing (Reddy et al., 2004), which suggests that the role of face gender in facial expression recognition deserves emphasis. A long-dominant model of face recognition holds that dissociable routes process face gender and facial expression (Jaquet et al., 2008). Le Gal and Bruce (2002) investigated the effect of face gender on facial expression using Garner’s selective attention task and found that participants could attend selectively to facial expression without interference from face gender. Neuropsychological research has also provided evidence that face gender is processed independently of facial expression (see Wu et al., 2014, for a review). However, recent studies suggest that the relationship between face gender and facial expression is more complex than simple independence (Le Gal & Bruce, 2002). For example, Becker et al. (2007) found that gender information and facial expressions are intertwined: Participants were faster and more accurate in detecting angry male faces and happy female faces. Moreover, Liu et al. (2017) suggested a symmetrical interaction between gender and expression processing during face perception.
Most of the studies reviewed above focused exclusively on the recognition of expressions of the six basic emotions: happiness, sadness, anger, fear, surprise, and disgust. Little is known about the recognition of crying expressions. Crying is an important attachment behavior with clear evolutionary significance (Hendriks et al., 2007); over the course of human evolution, an efficient response to an infant’s crying and the ensuing caretaking and protective behaviors increased the odds of an offspring’s survival (Hahn-Holbrook et al., 2011). There is “a tearing effect” in crying face recognition: Tears are an important visual cue that adds meaning to the human facial expression and facilitates the recognition of sadness (Balsters et al., 2013). In a recent eye-tracking study of crying face recognition, crying faces were recognized faster and with higher accuracy, and the eye area proved important for the recognition of crying (Sun & Shi, 2017). The present study aimed to replicate and extend that study from the perspective of the interaction between face gender and facial expression. We perceive a difference between a crying man and a crying woman because gender information and facial expression interact (Liu et al., 2017). Previous studies support this idea: Women cry and smile more and show more facial expressiveness overall than men, and participants are usually faster and more accurate in recognizing angry expressions on male faces and happy expressions on female faces (Becker et al., 2007).
Based on the above discussion, the current study used eye tracking to examine how people process smiling and crying faces and to explore the interaction of face gender and facial expression. To do so, we used a variation of the visual oddball paradigm, in which participants were confronted with one frequent stimulus (a neutral face) and two deviant ones (a smiling and a crying face), which they had to detect as quickly as possible. This paradigm provides an easily quantifiable, objective measure of face discrimination ability (Coll et al., 2019). Importantly, a central goal in cognitive psychology is to understand how we perceive real-world scenes (for a review, see Eckstein, 2011). Crying faces are rare in daily life; the oddball design therefore allowed us to examine how facial expressions are perceived in a close-to-real-life context (Sun & Shi, 2017). Moreover, although the oddball task is best known as a paradigm for studying target detection with electroencephalography (EEG; Polich & Kok, 1995), some researchers have applied it to other goals (Neta et al., 2011; Senju et al., 2003).
Materials and Methods
Participants
Forty-two healthy college students participated in this study. Four were excluded due to fixation deviation, leaving 38 valid participants (19 women, 19 men; mean age = 20.4 years; range = 19–23 years). All participants were native Chinese speakers with normal or corrected-to-normal vision and were right-handed.
Stimuli
The materials were those used previously by Sun and Shi (2017) and comprised 320 faces: 240 neutral faces, 40 smiling faces (20 female), and 40 crying faces (20 female). See Lockwood et al. (2013) and Sun et al. (2017) for details of face construction and the rating procedure. Participants performed a two-choice oddball task, with stimuli divided into standard and deviant categories (75% vs. 25%): The 240 neutral faces served as standard stimuli, and the 40 smiling and 40 crying faces served as deviant stimuli. Images measured 350 × 405 pixels and displayed a full face from the shoulders up.
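For concreteness, the 75%/25% split can be illustrated with a short trial-list sketch; the file-name stems below are hypothetical placeholders, not the study’s actual stimulus names.

```python
# A minimal sketch of assembling the 320-trial list (240 neutral standards,
# 40 smiling + 40 crying deviants, i.e., 75% vs. 25%); file names are
# hypothetical placeholders, not the study's actual stimuli.
import random

trials = ([("standard", f"neutral_{i:03d}.png") for i in range(240)]
          + [("deviant", f"smile_{i:02d}.png") for i in range(40)]
          + [("deviant", f"cry_{i:02d}.png") for i in range(40)])
random.shuffle(trials)  # presentation order was randomized
assert len(trials) == 320
```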
Apparatus and Measures
Stimuli were presented on a 17-inch display with a resolution of 1,024 × 768 pixels and a 60-Hz refresh rate. Eye movements were recorded using an EyeLink II eye tracker (SR Research, Mississauga, Ontario, Canada) at a 250-Hz sampling rate and a typical spatial resolution of 0.01°. Three dependent measures were used in the eye-tracking sessions: (a) fixation time on the whole face, (b) fixation time ratio on areas of interest (AOIs), and (c) fixation count ratio on AOIs. Fixation time on the whole face was the summed duration of all fixations on the entire face. The fixation time ratio on an AOI was the fixation time on that AOI divided by the fixation time on the whole face; the fixation count ratio was defined analogously for fixation counts. All three indices reflect the depth of processing of the fixated area and preferences in the allocation of attentional resources (Shimojo et al., 2003).
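As a concrete illustration, the three measures can be computed from a per-trial fixation table; the sketch below assumes a pandas DataFrame with hypothetical columns “duration” (in ms) and “aoi”, which are not named in the original report.

```python
# A minimal sketch of the three dependent measures, assuming fixations are
# stored in a pandas DataFrame with hypothetical columns "duration" (ms)
# and "aoi" (one of "eyes", "nose", "mouth", or "other").
import pandas as pd

def eye_measures(fix: pd.DataFrame) -> pd.DataFrame:
    whole_time = fix["duration"].sum()    # fixation time on the whole face
    whole_count = len(fix)                # fixation count on the whole face
    per_aoi = fix.groupby("aoi").agg(time=("duration", "sum"),
                                     count=("duration", "size"))
    per_aoi["time_ratio"] = per_aoi["time"] / whole_time      # FTR on AOIs
    per_aoi["count_ratio"] = per_aoi["count"] / whole_count   # FCR on AOIs
    return per_aoi
```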
Procedure
Participants were tested individually. A computer screen was positioned centrally 60 cm in front of them, and a chin rest maintained a stable head position. The nine-point method was used to calibrate the eye tracker: Nine points appeared on the screen in random order, and participants looked at each point in turn until it disappeared. Participants then fixated a point in the middle of the screen so that the experimenter could perform a drift correction. The experiment began once accuracy reached a good level, with a calibration deviation of less than 1.5 (Apel et al., 2012); if validation was poor, the nine-point calibration procedure was repeated.
The study was conducted in a university laboratory. After providing informed consent and demographic information, participants completed a two-choice oddball task. They were instructed that they would see two types of stimuli: neutral faces as standard stimuli, and smiling and crying faces as deviant stimuli. On each trial, they would see a face briefly and then have to indicate which type it was. All participants began with a practice session of 10 trials to familiarize themselves with the procedure and then completed a formal session of 320 trials. In each trial, a fixation cross (“+”) was presented for 300 ms, followed by a 500-ms blank screen, and then the facial expression image was displayed for 1,000 ms. Participants were asked to indicate as quickly and as accurately as possible whether the image was a standard or a deviant stimulus (button assignment was counterbalanced across participants; see Figure 1). Presentation order was randomized, and the entire experimental session lasted approximately 30 min. Finally, participants were debriefed and paid.

Figure 1. Overview of trial design.
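The trial sequence described above (300-ms fixation, 500-ms blank, 1,000-ms face with a speeded two-choice response) could be implemented along the following lines. This is a minimal PsychoPy sketch with hypothetical stimulus paths and response keys; it omits the concurrent EyeLink recording and is not the study’s actual presentation script.

```python
# A minimal sketch of one oddball trial's timing in PsychoPy; stimulus
# paths and response keys ("f"/"j") are hypothetical placeholders.
from psychopy import visual, core, event

win = visual.Window(size=(1024, 768), color="grey", units="pix")
fixation = visual.TextStim(win, text="+", height=40)
face = visual.ImageStim(win, image="faces/smile_f01.png", size=(350, 405))

def run_trial(image_path):
    face.image = image_path
    fixation.draw(); win.flip(); core.wait(0.300)   # 300-ms fixation cross
    win.flip(); core.wait(0.500)                    # 500-ms blank screen
    face.draw(); win.flip()                         # face for 1,000 ms
    clock = core.Clock()
    keys = event.waitKeys(maxWait=1.000, keyList=["f", "j"],
                          timeStamped=clock)        # speeded response
    win.flip()                                      # clear the screen
    return keys  # None if no response within 1,000 ms

run_trial("faces/smile_f01.png")
win.close(); core.quit()
```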
Results
Prior to statistical analyses, we removed extreme values exceeding M ± 3 SD.
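A minimal sketch of this trimming rule follows, assuming the per-trial values sit in a pandas DataFrame; the column name “rt” is a hypothetical placeholder, and in practice the rule would typically be applied per participant and condition.

```python
# A minimal sketch of the M ± 3 SD trimming rule; "rt" is a hypothetical
# column name for per-trial values (e.g., reaction times).
import pandas as pd

def trim_outliers(df: pd.DataFrame, col: str = "rt") -> pd.DataFrame:
    m, sd = df[col].mean(), df[col].std()
    return df[df[col].between(m - 3 * sd, m + 3 * sd)]
```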
Behavioral Results
Accuracy
A 2 (facial gender: male and female) × 3 (facial expression: crying, smiling, and neutral) analysis of variance (ANOVA) was conducted to evaluate participants’ recognition accuracy. This analysis revealed a significant omnibus effect of facial expression, F(2, 72) = 20.71, p < .05.
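For reference, a repeated-measures ANOVA of this kind can be run with statsmodels; the sketch below uses randomly generated placeholder scores and hypothetical column names, not the study’s data, and the same design applies to the reaction-time and eye-movement analyses reported below.

```python
# A minimal sketch of the 2 (facial gender) × 3 (facial expression)
# repeated-measures ANOVA on accuracy; all values are random placeholders.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
acc = pd.DataFrame({
    "subject": [s for s in range(1, 39) for _ in range(6)],
    "gender": (["male"] * 3 + ["female"] * 3) * 38,
    "expression": ["crying", "smiling", "neutral"] * 2 * 38,
    "accuracy": rng.uniform(0.80, 1.00, size=38 * 6),
})
res = AnovaRM(acc, depvar="accuracy", subject="subject",
              within=["gender", "expression"]).fit()
print(res)  # F, df, and p for both main effects and their interaction
```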
Reaction time
Analysis of reaction times revealed a main effect of facial expression, F(2, 72) = 16.61, p < .05.
Eye-Tracking Results
In the eye-movement analyses, we excluded the data on neutral facial expressions and compared only crying and smiling facial expressions on each dependent variable.
Fixation time on whole face
A 2 (facial gender: male and female) × 2 (facial expression: crying and smiling) ANOVA was conducted on participants’ fixation time on the whole face. The ANOVA yielded a main effect of facial gender, F(1, 37) = 18.50, p < .05.
Fixation time ratio on AOIs
A 2 (facial expression: crying and smiling) × 2 (facial gender: male and female) × 3 (AOI: eyes, nose, and mouth) ANOVA was conducted on participants’ fixation time ratio on the AOIs. The ANOVA yielded a main effect of facial expression, F(1, 37) = 34.62, p < .001.
Eye-Movement Data of Recognizing Different Types of Facial Expressions (Means and SDs).
Note. FTR = fixation time ratio; FCR = fixation count ratio.
Fixation count ratio on AOIs
Analysis of the fixation count ratio revealed a significant omnibus effect of facial expression, F(1, 37) = 18.32, p < .001.

Fixation maps in different conditions (from low in green to high in red).
Additional Results
Some studies have shown that gender differences exist in emotional processing; we therefore suspected that our findings might be modulated by participant gender. However, the analyses indicated that participant gender did not affect the visual processing of smiling and crying faces.
Discussion
The current study used eye tracking to examine how people process crying and smiling faces. We used a variation of the visual oddball paradigm, in which participants had to detect deviant smiling or crying faces among a train of standard stimuli (neutral faces). Our interests focused on the fixation pattern and the interaction between facial expressions and gender information.
Behavioral and eye-movement data suggest that crying faces are detected more easily and more quickly than smiling faces. This can be explained by both a threat account and a feature account. As Öhman and Mineka (2001) described, threat stimuli are processed with higher priority by an automatic threat-detection system that enables rapid redirection of attention. Crying provides warning signals of possible psychological or physical threats, which offered survival value over the course of evolution (Vingerhoets & Bylsma, 2007). This mechanism facilitated the processing of crying signals and was consequently used to adjust behavior to cope with unexpected environmental challenges. Our results replicate prior research indicating a threat recognition advantage for angry faces (Feldmann-Wüstefeld et al., 2011) and for snakes and spiders (Soares et al., 2017). There is an alternative explanation for this finding: Some visual features of crying faces facilitate access to their affective meaning during recognition (Balsters et al., 2013). Tears may serve as a shortcut to the recognition of crying faces (i.e., the tearing effect). Compared with crying faces, smiling faces may require more salient physical features to reach a similar level of processing efficiency.
Analysis of the AOIs revealed that, across facial expressions, the eyes and nose received more attention than the mouth. Consistent with previous studies, this result suggests that the overall gaze distribution was qualitatively similar across crying and smiling facial expressions (Guo, 2012) and seems to support holistic processing in face recognition. However, variations occurred as a function of the emotional face: Compared with crying faces, participants fixated longer on the mouth area when recognizing smiling faces. Time fixated on an area could indicate either difficulty in processing that area or the importance of that area for the recognition process (Hernandez et al., 2009; Shields et al., 2012). Our results suggest that the time spent on an area reflects its importance: Because participants spent more time on the mouth area of smiling faces, we conclude that this feature is important for the quick recognition of a smiling face. Researchers have pointed out that facial features are major elements in constructing different emotional faces (Calvo & Marrero, 2009), and the mouth is regarded as a unique facial feature in recognizing happy faces (Calvo & Nummenmaa, 2008). Similarly, in Beaudry et al.’s (2014) recognition tasks, participants spent more time on the mouth area for happiness than for the other emotions. Our finding is therefore consistent with previous results showing that different facial features convey distinct emotional content. The complexity of the visual pattern associated with facial expression recognition indicates that the processing of emotional faces cannot be reduced to simple feature processing or holistic processing for all emotions. Future studies should explore visual processing as a function of the emotional face to offer a comprehensive picture of the importance of different regions in recognition.
Another contribution of this study is the finding that visual attention was somewhat modulated by the gender of faces: Participants looked longer at female faces with smiling expressions than at male faces. Of the two competing accounts that have been proposed (the independence account and the interaction account), the present findings better support an interaction between gender and expression recognition. We tentatively interpret this result as evidence that the happiness conveyed by a face contributes to a visual preference for female faces. People hold strong gender stereotypes about how likely men and women are to show certain emotions (Fischer et al., 2004): Women are described as affiliative and more likely to smile, whereas men are considered dominant and more likely to show anger (Hess et al., 2004; LaFrance et al., 2003). Moreover, smiling is generally understood as a gesture of kindness and has been found to enhance attractiveness ratings of female but not male faces (Otta et al., 1994, 1996). Given these considerations, we suspect that gender-based norms for smiling may largely account for why participants gazed longer at smiling female faces than at smiling male faces.
Note, however, that there are limitations to the findings reported herein. First, we used a variation of the visual oddball paradigm to manipulate the frequency of neutral and target faces. One limitation of this design is that it complicates the interpretation of any comparison between smiling and crying faces, especially because crying faces are rarer, and therefore more surprising, in natural environments than smiling faces. Furthermore, some studies suggest that the cognitive processes involved in different paradigms may differ (Beaudry et al., 2014; Calvo & Nummenmaa, 2008). Future studies should compare crying and smiling faces across diverse paradigms. A second limitation concerns the ecological validity of the crying face stimuli. Because research on adult crying is still in its infancy, no established set of crying faces was available. Consequently, following Lockwood et al.’s (2013) method, we modified a set of sad faces from the Chinese Facial Affective Picture System (CFAPS; Gong et al., 2011) by digitally adding tears to the sad expressions in Adobe Photoshop. The perceived veracity of these crying face stimuli is therefore a concern: In real life, we rarely observe a completely still crying face, and beyond tears, crying has specific motion signatures and audible cues (Lockwood et al., 2013). Future studies of crying could use stimuli with higher ecological validity, such as real crying faces or video stimuli of crying actors. Third, and finally, the present results showed that crying faces were processed and recognized faster than smiling faces. We have explained this result in terms of threat and features; however, the current data do not allow us to endorse either possibility completely. Future experiments could address this with similar methods, for example by adding tears to smiling and sad faces: If the salience of tears drives the rapid processing of crying faces, the advantage should be abolished when both happy and sad faces have added tears (as proposed by anonymous reviewers).
To conclude, we demonstrated that crying and smiling faces carrying gender information can influence emotion recognition performance. Although proportional gaze allocation to the AOIs was quantitatively modulated by expression, the overall gaze distribution was qualitatively similar across crying and smiling facial expressions. Interestingly, we also found visual attention to be somewhat modulated by the gender of faces, such that individuals seemed to prefer women’s smiling faces. Together, these findings converge on the notion that the face is a complex social signaling system in which signals for emotion and gender overlap (Hess et al., 2009).
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
