Abstract
In adults, cortical regions in the fusiform face area (FFA), superior temporal sulcus (STS), and medial prefrontal cortex (MPFC) respond selectively to faces but underlie distinct perceptual and social processes. When does each of these regions, and its distinctive function, develop? We reviewed recent studies of awake human infants’ cortical responses to faces using functional near-infrared spectroscopy (fNIRS) and functional MRI (fMRI). The results converged and do not support a slow, sequential posterior-to-anterior development of face-selective responses. Instead, cortical face-selective responses arise very early and simultaneously in infancy and may reflect distinctively social processes from the start.
A glimpse of a face is full of social significance: We see where the person is looking, what feelings they are expressing, and, perhaps most importantly, who they are and what they mean to us. Correspondingly, seeing a face evokes vigorous responses in many regions of a human observer’s brain. In adults, decades of research have characterized these responses, producing the best studied examples of specialized responses in the human brain.
Much more recently, developmental cognitive neuroscientists have begun to ask when, and how, these regions develop in human infants. One possibility is that responses develop slowly in a posterior-to-anterior sequence: Initially posterior visual regions respond to face shapes, and gradually more frontal regions respond to social meaning. Here we argue instead that face-selective responses across the cortex arise in parallel early in infancy, potentially including distinctively social processes from the start.
Three Different Ways to See a Face
We focused on three regions of the cortex that respond to faces in different ways in adults: the fusiform face area (FFA), superior temporal sulcus (STS), and medial prefrontal cortex (MPFC; Figs. 1a and 1b).

Fig. 1. Different representations of face responses. Face regions in the adult brain are schematically represented for (a) the FFA, STS, and MPFC, and average face responses are represented in (b) the adult cortex. The threshold for our group random-effects analysis of dynamic face > dynamic object responses in 220 adult participants was –log10(p), uncorrected, where p = .001 to .0000001, which corresponds to the scale from 3.0 to 7.0. The example (c, left) expression stimuli and (c, right) situation stimuli used to test representational content in the adult FFA, STS, and MPFC (positive and negative expressions included faces that varied in race and gender) are also displayed. In the positive situation on the left, the red circle is invited into the group of dancing purple shapes. In the negative situation on the right, the purple square closes the door and prevents the red circle from joining the group. The stimulus decoding in (d) the FFA (n = 19), STS (n = 19), and MPFC (n = 21) shows that each region has different representations. The FFA decodes facial expressions but not positive and negative situations experienced by an animated shape (d, left). The STS separately decodes facial expressions and animated situations, but decoding does not generalize across the two stimulus types (d, middle). The MPFC decodes facial expressions and animated situations and generalizes responses across the two types of stimuli (d, right). Decoding results adapted from Skerry and Saxe (2014). Schematic brains obtained from https://www.behance.net. FFA = fusiform face area; STS = superior temporal sulcus; MPFC = medial prefrontal cortex.
The FFA is the most famous specialized “face area” in the human brain. Studies using functional MRI (fMRI) have shown high FFA activity in individuals presented with a face (Kanwisher et al., 1997). The face can be familiar or unfamiliar, moving or static, full color, or just a black-and-white silhouette (Kanwisher, 2010). If both a face and something else are visible, FFA activity is high when the viewer pays attention to the face (Tong et al., 1998). Responses in the FFA start less than 200 ms after the image of the face is revealed (Ghuman et al., 2014). A few neurosurgery patients have had electrodes implanted in their FFA, confirming that individual neurons are strongly activated by faces (Schalk et al., 2017). Even more striking, artificially activating these neurons can make the patient see a face where there is none.
The STS is a big swath of the cortex stretching the length of the temporal lobe that contains many different functional regions. One STS region is highly active when viewing a moving face. Activity is lower if the face is not moving, and much lower if the moving object is not a face (Pitcher et al., 2009). The same region in the STS also has high activity when hearing human voices, including nonspeech sounds such as laughter (Deen et al., 2015). Both of these features suggest that the STS response captures a person’s momentary thoughts and feelings as expressed in their face, voice, or body movement; by contrast, the FFA response may reflect the invariant features of faces that establish a person’s stable identity across viewpoints, styles, and ages.
The MPFC is also an enormous swath of the cortex along the inner surface where the two frontal lobes are pressed together. At least one region in the MPFC is highly active when viewing a face compared with other visual images (Dinh et al., 2018), especially if the face is personally significant (Gobbini & Haxby, 2007). A personally significant face could be someone looking straight at the viewer, someone calling their name, or someone they know from their own life. This region of the MPFC responds more to faces than to other images but also responds to nonfaces that are personally significant or emotionally evocative. Oral or written stories, animated cartoons, or personally significant voices can all evoke strong responses in the MPFC.
The FFA, STS, and MPFC are often active at the same time but for different reasons because they compute different information about the same faces. To illustrate this claim, we use one of our own experiments (Figs. 1c and 1d; Skerry & Saxe, 2014). Adults in an MRI machine watched movie clips. Half of the clips depicted naturalistic (unposed) human faces showing a happy or sad expression (Fig. 1c, left). In our experiment the pattern of responses in all three regions could be used to decode whether a new face was happy or sad (Fig. 1d, green).
The difference between the regions emerged when we measured responses to the other half of the clips, animations of a faceless shape implying a happy (completed a goal or socially included) or sad (failed a goal or socially excluded) experience (Fig. 1c, right). The FFA patterns could decode only faces, not the animations. Patterns in the STS could decode happy versus sad expressions, and happy versus sad animations, but patterns for happy versus sad expressions could not be used to decode the patterns for happy versus sad animations. Only in the MPFC could a classifier trained to distinguish happy versus sad faces be used to decode happy versus sad experiences in the animations, and vice versa. That is, the pattern of responses in the MPFC but not the STS could generalize across the stimulus types. In sum, in adults the patterns of response in the MPFC contain information about abstract social meaning and generalize across stimulus types (i.e., faces, animations, and sentences).
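The logic of this cross-stimulus decoding test can be sketched with simulated data. The example below is only an illustration under stated assumptions, not the analysis pipeline of Skerry and Saxe (2014): the voxel patterns are synthetic, the nearest-centroid classifier is a stand-in for whatever classifier a given study uses, and all function names and parameter values are hypothetical. An "MPFC-like" region is simulated with a single happy-versus-sad axis shared by faces and animations; an "STS-like" region is simulated with a separate, orthogonal axis for each stimulus type.

```python
import numpy as np

def make_patterns(rng, axis, n_per_class=100, noise_sd=1.0):
    """Simulate trial-by-voxel patterns: +axis for happy, -axis for sad, plus noise."""
    labels = np.repeat([1, -1], n_per_class)
    noise = rng.normal(0.0, noise_sd, size=(labels.size, axis.size))
    return labels[:, None] * axis[None, :] + noise, labels

def fit_centroid_classifier(X, y):
    """Linear rule that assigns a pattern to the nearer of the two class means."""
    mu_pos = X[y == 1].mean(axis=0)
    mu_neg = X[y == -1].mean(axis=0)
    w = mu_pos - mu_neg
    b = -w @ (mu_pos + mu_neg) / 2.0
    return w, b

def accuracy(w, b, X, y):
    """Proportion of trials whose predicted label matches the true label."""
    pred = np.where(X @ w + b > 0, 1, -1)
    return float((pred == y).mean())

def cross_decode(rng, face_axis, anim_axis):
    """Train on faces; test on held-out faces (within) and on animations (across)."""
    X_train, y_train = make_patterns(rng, face_axis)
    X_face, y_face = make_patterns(rng, face_axis)
    X_anim, y_anim = make_patterns(rng, anim_axis)
    w, b = fit_centroid_classifier(X_train, y_train)
    return {"within": accuracy(w, b, X_face, y_face),
            "across": accuracy(w, b, X_anim, y_anim)}

rng = np.random.default_rng(0)
n_voxels = 50  # hypothetical region size

# Unit-length valence axis, plus a second axis made orthogonal to it.
shared = rng.normal(size=n_voxels)
shared /= np.linalg.norm(shared)
other = rng.normal(size=n_voxels)
other -= (other @ shared) * shared
other /= np.linalg.norm(other)

signal = 3.0  # hypothetical signal strength relative to unit noise
mpfc_like = cross_decode(rng, signal * shared, signal * shared)  # shared code
sts_like = cross_decode(rng, signal * shared, signal * other)    # separate codes

print("MPFC-like:", mpfc_like)
print("STS-like:", sts_like)
```

In this simulation both regions support decoding of held-out faces, but only the region with a shared axis supports transfer from faces to animations. Symmetric training on animations and testing on faces would follow the same pattern.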
This example illustrates how multiple cortical regions can respond to the same stimulus for different reasons, capturing different meanings in the same event. That is the background from studies of adults needed to set up our question about development: When does each of these face-selective regions first arise in the human cortex?
Cortical Development Is Slow and Sequential
One possibility is that cortical functions in general (and thus face selectivity in particular) arise in a slow sequence over childhood, beginning with the more posterior regions (with sensory or perceptual responses) and later in more anterior regions (with multimodal or abstract responses). Anatomically, cortical regions do develop slowly and in a predictable posterior-to-anterior sequence. Research in animals and postmortem human brains shows that the cortex anatomically matures first in basic visual (and other sensory) regions and last in the prefrontal cortex. For example, a particularly dramatic change in infancy and childhood is the myelination of tracts of axons connecting brain regions: This process happens first in basic sensory regions, such as the early visual cortex, and last in the prefrontal cortex (Bethlehem et al., 2022). The rate of metabolism, the number of new synapses being created, and the migration of new neurons into the cortex (Sanai et al., 2011) all follow this same sequence. The sequence and timing of anatomical development depend both on intrinsic maturational factors and on the accumulation of experience of the environment.
If functional responses develop in the same sequence and timeline as anatomical development, then it is possible that cortical face-selective responses would arise only after months or even years, and in a posterior-to-anterior sequence (Fig. 2a). Infants’ early orienting to faces might depend only on subcortical regions, which increase the salience of faces and face-like visual displays (Johnson, 2005). This orienting response increases infants’ visual experience with faces. Slow, gradual cortical learning from visual experience, especially from frequent shapes that are curvy, smooth, top-heavy, and at the center of one’s gaze, could then generate a selective response to faces in the FFA; some scientists have suggested that this occurs late in childhood or even adolescence (Arcaro & Livingstone, 2021; Cohen Kadosh et al., 2013; Golarai, 2009; Johnson, 2005; Scott & Arcaro, 2023).

Fig. 2. Cortical development of face responses. A posterior-to-anterior sequence of functional development predicts that cortical face responses emerge first in (a) regions that are closer to input from the eyes, which first learn the pattern of faces, before regions farther from the sensory cortex, which attribute meaning to those faces. Some theories propose that the FFA is closer to visual inputs and so will develop face responses first followed by the STS (shown here). Other theories propose that the FFA and STS receive input simultaneously from the early visual cortex and thus might predict STS development as early as, or even earlier than, the FFA (not shown). Both views predict that much later in development, the MPFC will acquire a face response as it learns to respond to faces that are personally significant (i.e., a close friend or family member) and ascribe meaning to the social and perceptual features of faces. Alternatively, cortical face responses may emerge (b) in parallel, early in infancy. As early as infants’ brains represent the perceptual features of faces in the FFA, they also represent the social information of faces in the STS and the self-relevant information of faces in the MPFC. Infant fMRI data were collected using (c) a custom infant coil (the coil pictured here is the one described in Ghotra et al., 2021) while infants watched videos of faces, bodies, toys, landscapes, and abstract color displays (Kosakowski et al., 2022, 2024). FFA = fusiform face area; STS = superior temporal sulcus; MPFC = medial prefrontal cortex; fMRI = functional MRI.
Under this view, there are multiple possibilities for the timing of face selectivity in the STS. Both the FFA and STS are recruited when people visually process faces. However, there is controversy regarding whether visual information is transmitted from the FFA to the STS or whether face information in the STS is transmitted through a different cortical pathway (Duchaine & Yovel, 2015; Pitcher & Ungerleider, 2021). Because STS responses are multimodal, this region might be expected to develop later than the FFA (as shown in Fig. 2). On the other hand, STS responses to dynamic faces may reflect input from a parallel pathway.
Uncontroversially, the slowest anatomical development in the cortex occurs in the prefrontal cortex, in which expansion, increased sulcal depth, and myelination continue to change for years (Toga et al., 2006). MPFC responses are also the most abstract, representing social meanings from stimuli as different as faces, animations, and verbal stories. Thus, the clearest prediction of posterior-to-anterior sequential functional development (Arcaro & Livingstone, 2021; Gerván et al., 2017; Scott & Arcaro, 2023; Sydnor et al., 2021) is that face selectivity in the MPFC would arise gradually and long after face selectivity in the FFA and/or STS.
Initial developmental fMRI studies suggested that cortical face-selective responses arise late and develop slowly throughout childhood and into adolescence (Cohen Kadosh et al., 2013; Golarai, 2009), consistent with the idea that cortical development is slow and sequential. Further, an initial fMRI study found no face-selective cortical regions in a small sample of human infants (Deen et al., 2017). Similarly, face-selective responses were observed in infant macaques only late in the first year of life (Livingstone et al., 2017) and appear to require experience seeing faces (Arcaro et al., 2017). Thus, it is worth testing whether face-selective cortical regions develop late, slowly, and in a posterior-to-anterior sequence.
Evidence for Early Functional Development From fNIRS
In contrast, there is mounting evidence that cortical responses to faces originate early in infancy in the FFA, STS, and MPFC (Fig. 2b). Much of this evidence comes from studies using functional near-infrared spectroscopy (fNIRS). Like fMRI, fNIRS measures blood oxygen changes. Unlike fMRI, fNIRS uses light emitted from optodes on the scalp, reflected off the cortex, and then measured by detectors on the scalp. As a result, fNIRS has a low resolution and can measure only activity near the surface of the brain. Fortunately, both the STS and MPFC are close enough to the surface to measure responses with fNIRS; unfortunately, the FFA is not. Still, to test the timing and sequence of face-selectivity development in the STS and MPFC, evidence from fNIRS is relevant.
In infants, just as in adults, socially relevant moving people and faces activate both the STS and MPFC (Lloyd-Fox et al., 2011). For example, in 4-month-old infants, activity in the STS and MPFC was higher when a face turned to look directly at the infant, a signal of personal significance, compared with when the face turned to look farther away from the infant (Grossmann et al., 2008). In another study, 6-month-old infants had more activity in both the STS and MPFC when watching two people interact with each other than when watching the same people each doing an action separately (Farris et al., 2022). Also, 7-month-old infants with stronger responses to moving faces in the STS are more sociable as toddlers (Grossmann, 2024), and infants with stronger MPFC responses to a smiling face had a stronger subsequent preference for that person (Krol & Grossmann, 2020).
When the infant STS and MPFC have distinct responses, the differences align with the regions’ distinct roles in adults. For example, in one study of 4- to 9-month-old infants, the STS was more active for infant-directed speech than adult-directed speech for both familiar and unfamiliar speakers, but the MPFC was more active when the speaker was the infant’s own mother, the more personally relevant sound (Imafuku et al., 2014).
These fNIRS studies challenge the prediction that the STS and MPFC acquire their distinct functions slowly and sequentially. However, fNIRS cannot definitively test the prediction that cortical responses to faces, including the FFA, develop simultaneously and early in infancy. There are three key limits of fNIRS: The location of neural activity is estimated imprecisely, only a few regions can be measured simultaneously, and deeper cortical regions, such as the FFA, are inaccessible. To get spatially accurate measurements of the FFA, STS, and MPFC in infants of different ages requires fMRI.
Evidence for Simultaneous Cortical Development From fMRI in Awake Infants
fMRI is an excellent tool for imaging the brain but a nonideal environment for infants. To create high-resolution images of brain activity, fMRI requires participants to lie completely still, moving less than a millimeter in a scanner that is dark, noisy, and unfamiliar. None of these features are easy for infants. Consequently, most researchers and clinicians who use MRI to image infant brains do so while infants are sedated or sleeping. However, to study brain activity while infants see faces, it is necessary to scan infants who are awake and participating in the experiment voluntarily. Over the past decade multiple labs have developed procedures for using fMRI with awake infants.
In one recent study, we scanned 2- to 9-month-old infants (Fig. 2c) while they watched dynamic movies of faces. Infants also watched movies of children’s hands and feet, moving toys, natural landscapes, and colorful abstract displays as control conditions. From the brain images, we identified the intervals when the infant happened to lie still. Eventually, we collected usable data from 65 infants, a sufficient sample size from which to test whether face-selective responses in the FFA, STS, and MPFC emerge (a) early or late and (b) in a sequence or simultaneously.
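The step of identifying still intervals can be illustrated with a minimal sketch. It assumes that a per-volume head displacement estimate (in millimeters) is already available from motion correction; the function name, the 0.5-mm cutoff, and the minimum run length of five volumes are hypothetical values chosen for illustration and are not the criteria used in the studies cited here.

```python
import numpy as np

def low_motion_intervals(displacement_mm, max_mm=0.5, min_volumes=5):
    """Return (start, stop) index pairs, stop exclusive, for runs of still volumes.

    A volume counts as still when its displacement is below max_mm (assumed cutoff).
    Runs shorter than min_volumes (assumed minimum) are discarded.
    """
    still = np.asarray(displacement_mm) < max_mm
    intervals = []
    start = None
    for i, ok in enumerate(still):
        if ok and start is None:
            start = i                      # a still run begins
        elif not ok and start is not None:
            if i - start >= min_volumes:   # keep the run only if it is long enough
                intervals.append((start, i))
            start = None
    # Close a run that extends to the final volume.
    if start is not None and len(still) - start >= min_volumes:
        intervals.append((start, len(still)))
    return intervals

# Made-up displacement trace: two long still runs and one short one.
fd = [0.1, 0.2, 0.1, 0.1, 0.3, 0.2, 2.0, 0.1, 0.1, 1.5, 0.2, 0.1, 0.1, 0.2, 0.1]
intervals = low_motion_intervals(fd)
usable_volumes = sum(stop - start for start, stop in intervals)
print(intervals, usable_volumes)
```

Summing the retained volumes across runs and sessions gives a usable-data total per infant, which is the kind of quantity reported as minutes of low-motion data.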
All three regions responded selectively to faces in infants (Fig. 3a; FFA: Kosakowski et al., 2022; STS and MPFC: Kosakowski et al., 2024). To test whether the regions develop in sequence, we then separately tested the older (5- to 9-month-old) infants and the younger (2- to 5-month-old) infants. Again, we found face-selective responses in all three regions in both the older and even in the younger (Fig. 3b) group. There was no evidence that any of the three regions were selective earlier or increased in selectivity later or slower than any other (Fig. 3c). These results are consistent with other fMRI experiments with awake infants that reported cortical responses to faces (Deen et al., 2017; Tzourio-Mazoyer et al., 2002; Yates et al., 2023). In all, evidence from both fNIRS and fMRI suggests that selective responses to faces emerge early, and in parallel, in these three distant and distinct cortical regions.

Fig. 3. Cortical face responses present early in infancy. Responses to faces > nonfaces in a 4.6-month-old infant show activations (circled in blue) in the (a, left) FFA, (a, middle) STS, and (a, right) MPFC. This infant had 50.5 min of usable (low-motion) fMRI data collected while watching movies of faces, bodies, objects, landscapes, and abstract colorful scenes. Infants who were (b) 2 to 5 months old had responses to faces (purple) that were greater than responses to bodies (pink), objects (yellow), and scenes (green). Additional statistics are reported in Kosakowski et al. (2024). Face-selectivity indices (purple circle; face response − average response to nonfaces) are shown in the approximate location of the (c, left) FFA, (c, middle) STS, and (c, right) MPFC in 2- to 9-month-old infants (n = 37). Linear mixed-effects models revealed that slopes (m) were not statistically different from zero, intercepts (b) were statistically greater than zero, and there was no Age × Region interaction, F(2,405) = 0.04 (p = .96). Error bars indicate the within-subjects standard error. Symbols indicate one-tailed statistics from linear mixed-effects models. †p < .1. *p < .05. **p < .01.
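The face-selectivity index defined in the caption (face response minus the average response to nonfaces) and the meaning of the slope (m) and intercept (b) can be made concrete with a small sketch. All response values and ages below are invented for illustration, and the ordinary least-squares line is a deliberate simplification: the reported statistics come from linear mixed-effects models, which this sketch does not implement.

```python
import numpy as np

def face_selectivity_index(responses, face_key="faces"):
    """Face response minus the mean response to all nonface conditions."""
    nonface = [value for key, value in responses.items() if key != face_key]
    return responses[face_key] - float(np.mean(nonface))

def slope_and_intercept(ages_months, indices):
    """Least-squares fit of index = m * age + b (a simplification, see lead-in)."""
    m, b = np.polyfit(ages_months, indices, deg=1)
    return float(m), float(b)

# Hypothetical region-of-interest responses for four infants (arbitrary units).
infants = [
    (2.5, {"faces": 1.0, "bodies": 0.4, "objects": 0.3, "scenes": 0.2}),
    (4.6, {"faces": 1.2, "bodies": 0.6, "objects": 0.5, "scenes": 0.4}),
    (6.0, {"faces": 0.9, "bodies": 0.3, "objects": 0.2, "scenes": 0.1}),
    (8.8, {"faces": 1.1, "bodies": 0.5, "objects": 0.4, "scenes": 0.3}),
]

ages = [age for age, _ in infants]
indices = [face_selectivity_index(resp) for _, resp in infants]
m, b = slope_and_intercept(ages, indices)
print(indices, m, b)
```

In these made-up data every infant has the same positive index, so the fitted slope is flat and the intercept is positive. That is the qualitative pattern the caption describes: selectivity is present at the youngest ages and does not detectably change with age.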
Open Questions
Although existing data show that the FFA, STS, and MPFC are face-selective in infancy, many important questions remain unanswered. The roles of maturation and learning from the social environment in the development of these regions still need to be tested. Face selectivity in these regions could arise mostly independently, for example, from differential thalamic input; interactively, shaped by mutual connections between the cortical regions; or from the top down, with the MPFC and/or STS development preceding and influencing FFA development. For example, the development of face responses in the FFA might be directly or indirectly shaped by top-down input from the MPFC when a personally significant moment is detected by voice or by touch.
A related question is how much these regions’ functions change during development. In adults, the FFA, STS, and MPFC all respond to faces but have distinct responses to socially meaningful, self-relevant stimuli, as described above. Are these regions similarly functionally differentiated in infants? The fNIRS studies suggest that in infants, the STS and MPFC have distinct functions similar to those seen in adults, but these data are not conclusive because of the poor spatial resolution of fNIRS. The fMRI pattern analyses used to characterize and distinguish the regions’ functions in adults require more data and better resolution than have yet been achieved with infants. In sum, discovering when the FFA, STS, and MPFC are face-selective—early and in parallel—leaves open many fundamental questions about the development of these regions that remain to be tested in future experiments.
Conclusion
In the infant brain, responses to faces arise across many cortical regions in the first months of life. So far, there is no hint of a series or developmental sequence between these regions. Although these results are specific to faces, they challenge the general assumption that cortical functions initially arise, following the pattern of anatomical maturation, in a slow sequence from posterior to anterior regions. More broadly, these results may inspire a shift in the questions we ask about infant brain development, away from how structured external input drives organization in a reactive infant cortex and toward how an initial architecture allows infants to actively seek meaning and fulfill their needs in their inherently social world.
Recommended Reading
Arcaro, M. J., & Livingstone, M. S. (2021). (See References). Reviews in detail the hypothesis that face selectivity emerges in posterior regions earlier in development than anterior regions.
Johnson, M. H. (2005). (See References). Reviews the literature supporting a subcortical face-detection mechanism as an explanation for functional development of the cerebral cortex.
Kosakowski, H. L., Cohen, M. A., Herrera, L., Nichoson, I., Kanwisher, N., & Saxe, R. (2024). (See References). Shows that face responses in the STS and MPFC are selective in even very young (2- to 5-month-old) infants.
Kosakowski, H. L., Cohen, M. A., Takahashi, A., Keil, B., Kanwisher, N., & Saxe, R. (2022). (See References). Finds that the FFA is face-selective in infants and that movies of landscapes and bodies evoke category-selective responses in other regions.
Skerry, A. E., & Saxe, R. (2014). (See References). Demonstrates that the FFA, STS, and MPFC are active in response to faces but form distinct representations as measured by their generalization to other animated situations.
Acknowledgements
We would like to thank Nancy Kanwisher and members of the Saxe Lab and Kanwisher Lab for helpful conversations over several years.
Transparency
Action Editor: Robert L. Goldstone
Editor: Robert L. Goldstone
Author Contributions
R. Saxe and H. L. Kosakowski contributed equally to this work. Both authors approved the final manuscript for submission.
