Abstract
Objective:
In preterm and very low birth weight (VLBW) infants, attention-related problems have been found to be more pronounced and emerge later as academic difficulties that may persist into school age. In response, based on three attention networks: alerting, orienting, and executive attention, we examined the development of attention functions at 42 months (not corrected for prematurity) as a follow-up study of VLBW (n = 23) and normal birth weight (NBW: n = 48) infants.
Method:
The alerting and orienting attention networks were examined through an overlap task with or without warning signal. The orienting network was also examined through the distribution of gaze points when exposed to videos of human faces talking and silently looking straight ahead. Executive attention was examined using a parental report measure for temperamental self-regulation, effortful control.
Results:
In the overlap task, the difference between VLBWs and NBWs was not the latency of attentional disengagement but the fact that VLBWs were less focused on the fixation stimulus (F(1,60) = 10.80, p < .01, ηp² = .071) and seemed to profit more from auditory warning signals than NBWs (F(1,60) = 7.13, p = .01, ηp² = .106). Moreover, there was no intergroup difference regarding lateral (right or left) or feature (eye or mouth) attention bias toward the face videos. Further, longer latencies in overlap condition were significantly positively associated with high effortful control scores only in the NBW group (r = .36, p = .018).
Conclusion:
Results indicate that poor underlying alertness and orienting relating to atypical lateralization may affect cognitive and behavioral abnormalities in VLBWs.
Introduction
Improved management of high-risk pregnancies and advances in neonatal intensive care have led to increased survival rates for children born very preterm (VP = <32 weeks) or with very low birth weight (VLBW = <1,500 g; Ely & Driscoll, 2021; Helenius et al., 2019; Kono, 2021). However, VP and VLBW infants are known to be at increased risk of neurodevelopmental morbidities. First, the organs (e.g., lungs and brain) of VP and VLBW infants are more vulnerable to events (e.g., infection and hypoxia/hyperoxia) that tend to occur at a time when these organs must still complete most of their structural development outside the womb. Second, the risk of brain damage and adverse brain development is associated with weaknesses in neurocognitive development such as attention, which critically impairs the acquisition of new skills and impedes academic achievement (Anderson, 2014; Burstein et al., 2021; van de Weijer-Bergsma et al., 2008; Walczak-Kozłowska et al., 2020).
Recent neuroscience studies reveal that three independent brain networks are responsible for attention: achieving and maintaining a state of alertness (alerting), selecting information from sensory input (orienting), and voluntarily regulating responses when dominant or well-learned behaviors are not appropriate (executive attention; Posner et al., 2014). The alerting network is active as early as infancy, in which sustained attention effects rest in part on changes in the tonic alerting system. Moreover, warning signals can also increase alertness as a phasic alerting system and are an important prerequisite for other attentional operations. The further development of the alerting network during childhood involves endogenous control of the preparation and maintenance of the alert state. The orienting network in the brain, which includes the parietal lobe and frontal eye fields, primarily plays a control role during infancy, and its efficiency develops dramatically during the first year of life. In contrast, voluntary orientation and disengagement of attention are aspects that take longer to develop (Rueda & Posner, 2013). By 3 to 4 years of age and later, the frontal executive attention network involving the anterior cingulate and basal ganglia takes over this control role. Moreover, the emerging executive control system may increase the duration of looking, reflecting increased sustained or focused attention (van de Weijer-Bergsma et al., 2008). The efficiency of the executive attention network is thought to be reflected in the self-regulatory aspect of temperament, or effortful control (Rueda et al., 2005).
Reviews of preterm infants based on the above theory of attention conclude that preterm infants are at risk of difficulties in all three networks (Ginnell et al., 2021; van de Weijer-Bergsma et al., 2008). However, reviews note inconsistent results with respect to sustained attention in infancy. That is, some results show that preterm infants have lower attention spans than term infants (Downes et al., 2018: 12 months; Ruff, 1986: 7 months; Sun & Buys, 2012: 10–11 months), while others show no difference (Pridham et al., 2000: 8 months) or even that preterm infants have longer attention spans than term infants (Ruff et al., 1990: 2 years). In contrast, the few studies that have examined the preschool period have found more consistent results, suggesting that problems with sustained attention in preterm infants become more pronounced with increasing age (van de Weijer-Bergsma et al., 2008). Arousal difficulties in preterm infants have also been pointed out as impaired brainstem function in early development (Geva & Feldman, 2008). Maintaining alertness, which is one component of the general concept of arousal, requires the integrity of the interaction among brainstem, subcortical, and cortical networks (Posner, 2012).
Jaeger et al. (2021) examined phasic and tonic alertness in healthy preterm children aged 5 to 6 years. The authors referred to Périn et al. (2010), who reported that tonic alertness activates a predominantly top-down controlled system, whereas phasic alertness is based on a more bottom-up, stimulus-driven attentional system. In their study, Jaeger et al. (2021) used standardized computerized tests that allowed for the measurement of two different types of alertness, with participants responding in conditions with and without a preceding cue. Results indicated that even in the absence of clinically relevant impairments, preterm children showed prolonged reaction and decision times in the tonic alertness condition but not in the phasic alertness condition, reflecting impaired top-down but intact bottom-up control processes. Phasic alerting is a major influential factor in human attention, which is particularly effective in populations with reduced tonic alertness (Kleberg et al., 2017). If so, the effect of warning signals might also be observed in younger preterm children, which we investigated at 3.5 years using eye tracking.
Turning now to group differences in relation to the orienting attention network, in a study by Davis et al. (2022) for infants aged 8 to 10 months, though there were no differences between preterm and term-born infants in gazing time or proportional gazing scores toward faces and social stimuli on the right, gazing time toward social scenes and faces on the left was significantly reduced in preterm infants. Regarding lateralization differences in preterm birth, the authors indicated that preterm infants might miss out on the period in the womb (i.e., the third trimester of pregnancy) in which term infants develop the structural leftward asymmetries in the infant language and sensori-motor networks, such as differences in interhemispheric connectivity and callosal thinning of the left hemisphere. In contrast, a left visual field bias (i.e., right hemisphere dominance) has been reported for processing faces or socioemotional stimuli in children and adults (Yovel et al., 2008). The cortical route specialized in face processing, often bilaterally but more consistently present in the right hemisphere, is already functional at birth (Buiatti et al., 2019). As the orienting network influences the brain areas that will normally be used to process targets (Posner, 2012), orienting attention to the face could modify activity in the right-lateralized face-sensitive brain area. Alternatively, this right hemisphere activation may cause left visual field bias. Thus, the reduction in looking toward the left visual field cannot simply be attributed to attentional processes (Davis et al., 2022).
On the other hand, the right visual field bias observed in normal birth weight children was absent in VLBWs watching a talking face movie (Nakagawa et al., 2023). The audiovisual stimuli when the target is talking could activate the left hemisphere and produce a right visual field advantage during speech processing. This is consistent with a previous study of infants aged 5 to 6 months (MacKain et al., 1983). Thus, visual field bias related to higher cognitive function could be a marker of early atypical lateralization (Davis et al., 2022). Although preterm infants show a pattern of reduced gazing to social stimuli in infancy compared to term controls and catch up by 5 years of age (Dean et al., 2021), early lateral attention bias still needs to be considered.
In the present study, we followed up VLBW and normal birth weight (NBW) participants who took part in the previous project at 12, 18, and 24 (corrected) months (Nakagawa et al., 2023) to examine attentional function at 42 months, when the frontal executive attention network begins to emerge. Related to the orienting attention, we presented movies of Talking and Silent Faces, the same stimuli as those given at 12, 18, and 24 months. We examined whether lateral (right or left) or feature (eye or mouth) attention bias was observed during audiovisual face perception at 42 months. In addition, we examined the arousal function and executive attention of participants by administering the overlap task with and without warning signal (Nakagawa & Sukigara, 2022). In the overlap task, participants were observed as they disengaged attention from the central fixation to a target distractor presented in the peripheral visual field. In that earlier study, the warning signal reduced latencies in both overlap and no-overlap conditions in infants aged 6 to 24 months. In infants, the ease of disengagement is an indicator of attentional development, whereas in toddlers, the ability to maintain gaze for long periods of time regardless of the presence of a distractor becomes an indicator of attentional development. The latter is thought to be related to the development of executive attention, individual differences in which have been captured in the framework of temperament as effortful control (Rothbart & Derryberry, 1981).
As we describe above, based on our previous results (Nakagawa et al., 2023), we predicted that differences in lateral or eye-mouth attention bias during audiovisual face perception between VLBWs and NBWs may not be observed at 3.5 years, whereas VLBW children may still show weakness in terms of alertness. Thus VLBWs may be more strongly affected by the warning signals. We also estimated the executive attention function by applying temperamental effortful control scores. We predicted a positive correlation between latency in overlap condition (without warning) and effortful control scores in NBWs. This is because children become more flexible about inhibiting responses to distractors as the executive attention system emerges (Ruff & Rothbart, 1996).
Method
The Ethics Committee of Nagoya City University (ID16004) and the board of the Second Red Cross Hospital (ID1132) approved the study protocols. The experimental protocol was conducted in accordance with the ethical standards specified in the 2013 Helsinki Declaration.
Participants
Figure 1 depicts the participant flowchart for this follow-up study of Nakagawa et al. (2023). A VLBW group was formed consisting of 83 VLBW infants admitted to the Neonatal Intensive Care Unit (NICU) of the Japanese Red Cross Nagoya Daini Hospital between June 2015 and January 2017. However, 22 infants who matched exclusion criteria such as Grade III-IV intraventricular hemorrhages in the Papile classification, congenital diseases relating to neural or physical development, death before discharge from the NICU, or uncorrectable visual or auditory impairment were removed from the sample, leaving 61 eligible infants. Informed consent to participate was obtained from caregivers for 39 infants (64%). Of these, we made contact with 32 who participated at 12 months, and 23 (984 ± 253 g, 27.3 ± 2.6 weeks, 13 girls) attended gaze evaluation at precisely 42 months in the laboratory of Nagoya City University.

Figure 1. Participant flow chart: very low birthweight (VLBW) children.
NBW children were recruited through a laboratory-based database of caregivers interested in infant research and through an advertisement placed with local mothers’ groups in Nagoya City. Criteria for inclusion were no known birth or other complications, term birth (> 37 weeks of gestation), and normal birth weight (≥ 2,500 g). The NBW children were assessed by a specialist in developmental disorders and showed no signs of neurodevelopmental concerns at age 5. Of these, 48 (3,111 ± 413 g, 39.1 ± 1.2 weeks, 29 girls) visited the laboratory at precisely 42 months.
After each participant had adjusted to the experimental setting at Nagoya City University, parent and child were escorted to a semi-dark room for the overlap task. However, four VLBW participants did not take part, having failed to calibrate at the start (N = 1) or to complete the task as a result of fussiness or crying (N = 3). All children were then assessed using Gazefinder.
Procedure
Task and Stimuli
Gazefinder (NP-H005GV, JVC Kenwood) is an easy-to-use eye-tracking system developed for measuring early gaze patterns in response to social information presented on a 19″ monitor (1,280 × 1,024 pixels). It uses corneal reflection to record the x and y coordinates of each child’s eye position at a frequency of 50 Hz (i.e., 3,000 data collections/min). The procedure, which takes approximately 2 min to administer after a 15-s calibration of the eye position using a 5-point method at the start of testing, consists of 23 short movies in four categories: human faces (5 movies), biological motions (2 movies), people and geometric patterns (12 movies), and objects with or without finger pointing (4 movies). The proportion of time allocated to a particular area (looking time divided by the duration of stimulus presentation) is calculated automatically. If calibration quality is poor, it is repeated at least three times. Once calibration is complete, short movies are shown continuously without instructions. In our study, all available movie clips were presented once to each child in a self-contained, fixed order.
Of the 23 available movies, two movie clips of a human face (visual angle: 24° × 20°) were selected and analyzed: Talking (7 s), and silently looking straight ahead (4 s). The movies showed the same person. In the Talking movie, the actor says (in Japanese): “Hello!” “What’s your name?” and “Let’s play together.” Two Areas of Interest (AOI) are set by default: AOI-1 is the eyes region, and AOI-2 is the mouth region (Figure 2).

Figure 2. Talking face (Gazefinder movie sample).
Default AOI-1 and AOI-2 include the eye and mouth regions, respectively. The central line was added by the author. Permission to use the samples of Gazefinder was obtained from JVC KENWOOD Corporation.
The overlap task was presented using a Tobii TX300 eye tracker. Participants sat by themselves with the caregiver behind them in a semi-dark area 60 cm from a color monitor. The participant’s eye movements were monitored from outside this area through a high-angle CCD near-infrared video camera (ELMO CN43H) set in front of the participant. Stimulus presentation was controlled by E-Prime 2.0 Professional (Psychology Software Tools, Inc., Sharpsburg, PA).
The overlap task was the same as in Nakagawa and Sukigara (2022). The central fixation and peripheral targets were simple, brightly colored movies of dynamic (e.g., shrinking, expanding, rotating, or flashing) geometric shapes, each subtending a visual angle of 5°. Each of the 10 different fixation or peripheral stimuli was presented against a black background. These were randomized into 20 combinations, each presented twice per participant in the right and left visual fields in a given condition (e.g., overlap condition with beep). In half of these trials, a beep sound of 60 dB (pure tone of 2,000 Hz lasting 50 ms) was presented as a spatially non-predictive brief warning signal together with the fixation stimuli just before the onset of the visual peripheral target.
The procedure for the overlap task is presented in Figure 3. All children first completed a 5-point calibration before the start of the experiment. Each trial began with a given central fixation at the first key press by the experimenter. When the participant was judged to be looking at the fixation, the experimenter pressed a second key for the next step.

Figure 3. Procedure for the overlap task in warning signal condition.
At the second key press, the central fixation was maintained continuously for 200 ms before the peripheral target appeared for 2,600 ms at approximately 18° either to the left or the right of the central fixation point. At the second key press, a warning signal (2,000 Hz pure tone for 50 ms) was presented in half of all trials, while no warning signal was presented in the other half. In overlap condition, the central fixation stimulus remained on the monitor throughout the trial, even during the presentation of the peripheral target stimulus, whereas in no-overlap condition, the fixation stimulus disappeared when the peripheral target appeared. The experiment consisted of 20 trials in both overlap and no-overlap conditions, 10 with and 10 without warning signals. Whether the peripheral target appeared to the right or to the left of the fixation was determined by a pseudorandom schedule counterbalanced within each block of eight trials constructed as four experimental conditions: 2 (with warning and without warning) × 2 (overlap, no-overlap). One constraint was that the same experimental condition and peripheral target stimulus on the same side could not be presented more than three times in a row.
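The trial schedule described above can be sketched as follows. This is only an illustrative reconstruction, not the E-Prime script used in the study. It assumes that each block of eight trials contains every combination of warning (2) × overlap (2) × target side (2) once, that five such blocks give the 40 trials, and that the "three times in a row" constraint applies separately to the experimental condition and to the target side. The names `make_schedule` and `longest_run` are hypothetical.

```python
import random
from itertools import product

WARNING = ("with_warning", "without_warning")
OVERLAP = ("overlap", "no_overlap")
SIDES = ("left", "right")
MAX_RUN = 3  # assumed reading of "not more than three times in a row"


def longest_run(values):
    """Return the length of the longest run of identical consecutive values."""
    best = 0
    run = 0
    previous = object()  # sentinel that equals nothing in the list
    for value in values:
        run = run + 1 if value == previous else 1
        best = max(best, run)
        previous = value
    return best


def make_schedule(n_blocks=5, seed=None):
    """Build n_blocks blocks of eight trials (4 conditions x 2 sides each)."""
    rng = random.Random(seed)
    template = list(product(WARNING, OVERLAP, SIDES))
    while True:  # rejection sampling: reshuffle until the run constraint holds
        schedule = []
        for _ in range(n_blocks):
            block = template[:]
            rng.shuffle(block)
            schedule.extend(block)
        conditions = [(w, o) for w, o, _ in schedule]
        sides = [s for _, _, s in schedule]
        if longest_run(conditions) <= MAX_RUN and longest_run(sides) <= MAX_RUN:
            return schedule


schedule = make_schedule(seed=1)
```

Reshuffling whole schedules until the constraint holds keeps the sketch short; a constructive procedure would serve equally well.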
Temperament Questionnaire
Effortful control was assessed using the Japanese short version of the Children’s Behavior Questionnaire (CBQ), a 92-item parent-reported instrument containing 15 subscales (Kusanagi & Hoshi, 2017). The CBQ contains statements such as “My child always seems in a hurry to get from one place to another,” and caregivers are asked to rate their child on a 7-point scale (1 = totally untrue of your child to 7 = totally true of your child), with ratings indicating the degree to which the statement accurately describes the child’s behavior over the past six months. Caregivers are also offered a “Not applicable” response option when the child has not been observed in the situation described. Scores for each subscale consist of the total score for numerical responses to each subscale item divided by the number of items receiving a numerical response (i.e., not including items marked “Not applicable” or items receiving no responses).
In the CBQ tool, effortful control is a validated temperamental factor, and effortful control score is the mean of scores for Attention Focusing, Attention Shifting, Inhibitory Control, Low Intensity Pleasure, and Perceptual Sensitivity subscales (max. = 7). In our study, caregivers were given the temperament questionnaire at the end of each session to fill out and return.
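The scoring rule described above can be illustrated with a minimal sketch. It assumes that ratings arrive as lists of integers from 1 to 7, with `None` standing for "Not applicable" or a missing response, and that any reverse-keyed items have already been recoded. The function names, and the choice to return `None` when a subscale has no numerical responses, are illustrative assumptions rather than part of the published scoring procedure.

```python
from statistics import mean

# The five subscales that make up effortful control, as listed in the text.
EC_SUBSCALES = (
    "Attention Focusing",
    "Attention Shifting",
    "Inhibitory Control",
    "Low Intensity Pleasure",
    "Perceptual Sensitivity",
)


def subscale_score(responses):
    """Mean of the numerical ratings; None marks 'Not applicable' or no response."""
    numeric = [r for r in responses if r is not None]
    if not numeric:
        return None  # assumption: no score when nothing was rated
    return sum(numeric) / len(numeric)


def effortful_control(ratings_by_subscale):
    """Mean of the five subscale scores (maximum 7)."""
    scores = [subscale_score(ratings_by_subscale[name]) for name in EC_SUBSCALES]
    if any(score is None for score in scores):
        return None  # assumption: require all five subscales
    return mean(scores)


# Hypothetical ratings for one child, with one "Not applicable" per subscale.
example = {name: [4, 5, None, 6] for name in EC_SUBSCALES}
ec_score = effortful_control(example)
```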
Analytical Procedure
Gazefinder generates a gaze point every 20 ms and records its x-axis position, its y-axis position, and whether or not gaze is directed toward the AOIs. As the horizontal center of the face image, we chose 651 pixels from the left for the Silent stimulus and 640 pixels from the left for the Talking stimulus. If the x-coordinate of a gaze point was smaller than this value, the gaze was assigned to the left half of its AOI. Conversely, if the x-coordinate was greater than this value, the gaze was assigned to the right half of its AOI (i.e., the right half of the face from the viewer’s perspective). To avoid assigning gaze points on the x-axis near the midline to the wrong half of the face, we defined an area around the vertical midline (±0.5° of visual angle) where gaze points would not be assigned to either side of the face (Nakagawa et al., 2023). The proportion of looking time to the right side of the face was calculated as the amount of time the child looked at the right halves of AOI-1 and AOI-2 divided by the total amount of time the child looked at both the left and right halves of the two AOIs. In addition, the proportion of time when attention was directed to the mouth region was calculated as the amount of time spent looking at AOI-2 (the mouth region) divided by the total amount of time spent looking at both the eye and mouth regions.
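The two proportions defined above can be sketched as follows. Because each gaze point represents 20 ms, counting gaze points is equivalent to summing looking time. The width of the excluded midline band in pixels (`dead_zone_px`) depends on the viewing geometry and is not given here, so the value of 13 pixels used in the example is purely hypothetical, as are the function names and the sample format of `(x, aoi)` pairs.

```python
def side_of_face(x, midline_px, dead_zone_px):
    """Return 'left', 'right', or None for points inside the excluded midline band."""
    if abs(x - midline_px) <= dead_zone_px:
        return None
    return "left" if x < midline_px else "right"


def right_bias(samples, midline_px, dead_zone_px):
    """Proportion of AOI gaze points on the right half of the face."""
    left = right = 0
    for x, aoi in samples:
        if aoi is None:  # gaze point outside both AOIs
            continue
        side = side_of_face(x, midline_px, dead_zone_px)
        if side == "left":
            left += 1
        elif side == "right":
            right += 1
    total = left + right
    return right / total if total else None


def mouth_bias(samples):
    """Proportion of AOI gaze points that fall in the mouth region (AOI-2)."""
    eyes = sum(1 for _, aoi in samples if aoi == "eyes")
    mouth = sum(1 for _, aoi in samples if aoi == "mouth")
    total = eyes + mouth
    return mouth / total if total else None


# Hypothetical gaze points for the Talking stimulus (midline at 640 pixels).
samples = [(600, "eyes"), (700, "eyes"), (700, "mouth"), (641, "mouth"), (300, None)]
right = right_bias(samples, midline_px=640, dead_zone_px=13)
mouth = mouth_bias(samples)
```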
For the overlap task, response latency was defined as the elapsed time between peripheral target onset and the time when the child’s gaze crossed a threshold line, indicating eye movement toward the target directly from the central fixation. Thresholds (i.e., the x-coordinate value for detecting eye movements toward the target stimulus) were set to 768 pixels from the left edge of the display for saccades to the left and 1,152 pixels from the left edge of the display for saccades to the right. Following the Tobii TX300 user manual, data with validity codes 0 and 1 for either eye were accepted. Raw gaze data were obtained directly from the eye-tracker server using E-Prime Extensions for Tobii. Gaze data at 300 Hz were smoothed with a 37-sample median filter to remove high-frequency noise, thus approaching standardization with respect to the lower frequency data. Fixation was confirmed when the child’s gaze was recorded within the area of the first central fixation stimulus (±132 pixels from the center of the monitor) for more than 50% of the 200 ms immediately prior to target presentation. Saccades that started less than 100 ms after the onset of the peripheral target or trials in which the child failed to fixate on the peripheral target within 2,600 ms (the duration of the target presentation) were not considered valid responses to the target location. Whether each trial was adequate for our analysis was judged by a trained coder not directly involved in the experiment, who reviewed the videotape with frame-by-frame playback. A second coder, also blind to the experimental conditions, coded child gazing for a random 20% of participants. Cohen’s kappa for interrater reliability was .93.
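As a rough illustration of the latency definition above, the sketch below smooths a horizontal gaze trace with a 37-sample running median and returns the time of the first threshold crossing toward the target. Several points are assumptions rather than details from the text: the trace is taken to start at target onset and to contain only valid samples, the filter window simply shrinks at the edges of the trace, and the time of threshold crossing is used as a stand-in for saccade onset when applying the 100 ms criterion. The function names are hypothetical.

```python
from statistics import median

SAMPLE_RATE_HZ = 300
FILTER_WINDOW = 37          # samples, as stated in the text
LEFT_THRESHOLD_PX = 768     # crossing point for saccades to the left
RIGHT_THRESHOLD_PX = 1152   # crossing point for saccades to the right
MIN_LATENCY_MS = 100
MAX_LATENCY_MS = 2600


def median_filter(xs, window=FILTER_WINDOW):
    """Running median; the window shrinks at the edges (an assumed edge rule)."""
    half = window // 2
    return [median(xs[max(0, i - half): i + half + 1]) for i in range(len(xs))]


def saccade_latency_ms(xs_from_onset, target_side):
    """Latency of the first threshold crossing toward the target, or None if invalid."""
    for i, x in enumerate(median_filter(xs_from_onset)):
        if target_side == "left":
            crossed = x <= LEFT_THRESHOLD_PX
        else:
            crossed = x >= RIGHT_THRESHOLD_PX
        if crossed:
            latency = i * 1000 / SAMPLE_RATE_HZ
            return latency if MIN_LATENCY_MS <= latency <= MAX_LATENCY_MS else None
    return None


# Hypothetical trace: gaze stays at screen center, then jumps to a right target.
trace = [960.0] * 90 + [1300.0] * 60
latency = saccade_latency_ms(trace, "right")
```

A median filter preserves the step-like shape of a saccade while suppressing isolated noisy samples, which is why the crossing time in this toy trace is not shifted by the smoothing.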
Additionally, to study the sustaining of attention, we examined the duration of gazing at the central fixation for 250 ms just before being given the peripheral targets. That is, regardless of the child’s subsequent responses to the peripheral stimuli, the time during which gaze was recorded within the area of the central fixation stimulus (±132 pixels from the center of the monitor) was counted (Table 4).
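The fixation measure can be sketched in the same spirit. The example counts the 300 Hz samples that fall within ±132 pixels of the screen center during the window preceding target onset and converts the count to milliseconds. The horizontal center of 960 pixels is an assumption inferred from the saccade thresholds (768 and 1,152 pixels) being symmetric around it. Only the horizontal coordinate is checked here, and `None` marks an invalid sample. The function names are hypothetical.

```python
SAMPLE_RATE_HZ = 300
CENTER_X_PX = 960             # assumption: midpoint of the 768 and 1,152 px thresholds
FIXATION_HALF_WIDTH_PX = 132  # fixation area, as stated in the text


def fixation_duration_ms(xs_before_onset, window_ms=250):
    """Time (ms) spent on the fixation area within the last window_ms before onset."""
    n_samples = round(window_ms * SAMPLE_RATE_HZ / 1000)
    window = xs_before_onset[-n_samples:]
    on_fixation = sum(
        1 for x in window
        if x is not None and abs(x - CENTER_X_PX) <= FIXATION_HALF_WIDTH_PX
    )
    return on_fixation * 1000 / SAMPLE_RATE_HZ


def was_fixating(xs_before_onset, window_ms=200):
    """Trial inclusion rule: on fixation for more than 50% of the pre-target window."""
    return fixation_duration_ms(xs_before_onset, window_ms) > 0.5 * window_ms


# Hypothetical pre-target trace: a brief look away in the middle of the window.
trace = [960.0] * 30 + [500.0] * 15 + [960.0] * 30
duration = fixation_duration_ms(trace)
```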
Latencies and fixation durations in the overlap task were based on participants responding during more than two trials per condition. To treat the uneven distribution of the dependent variables, log transformation was applied to latencies and fixation durations. An arcsine transformation was applied to proportion for right or mouth bias in Gazefinder.
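The transformations named above can be written compactly. The text does not specify the exact variants, so this sketch assumes the natural logarithm and the common arcsine square-root form for proportions. The inclusion rule reads "more than two trials" as at least three valid trials per condition. All function names are illustrative.

```python
import math

MIN_VALID_TRIALS = 3  # "more than two trials per condition"


def has_enough_trials(valid_trials):
    """Inclusion rule for a participant's data in one condition."""
    return len(valid_trials) >= MIN_VALID_TRIALS


def log_transform(value_ms):
    """Natural log of a latency or fixation duration in ms (assumed base e)."""
    return math.log(value_ms)


def arcsine_transform(proportion):
    """Arcsine square-root transform of a proportion between 0 and 1 (assumed variant)."""
    return math.asin(math.sqrt(proportion))
```

Under this variant, a proportion of .5 maps to π/4, so a test against chance on the transformed scale would compare against π/4 rather than .5.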
We performed ANOVAs, which are robust against violations of the normality assumption (Box, 1953), on the Gazefinder measures (proportion of looking time to the right side of the face, proportion of looking time to the mouth region) and the overlap task measures (latency, response number, fixation duration).
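For readers who wish to relate the effect sizes reported with each F test to the test statistic itself, partial eta squared can be recovered from F and its degrees of freedom with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to two of the latency effects reported in the Results. The function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effects on latency reported in the Results: F(1,56) = 4.71 and F(1,56) = 127.56.
warning_effect = round(partial_eta_squared(4.71, 1, 56), 3)
overlap_effect = round(partial_eta_squared(127.56, 1, 56), 3)
```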
Results
Talking Face and Silent Face
Mean numbers of gaze points and standard deviations (SD) for each of the four areas in the Talking and Silent stimuli are presented in Table 1 for VLBWs and NBWs. Right bias and mouth bias are shown in Table 2.
Table 1. Mean Number of Gaze Points and Standard Deviations (SD) for Each of the Four Areas in Stimulus Talking and Silence for 7,000 and 4,000 ms, Respectively.
Note. NBW = normal birth weight children; VLBW = very preterm and very low-birth-weight children.
Table 2. Mean and Standard Deviations (SD) for Right Bias and Mouth Bias in Stimulus Talking and Silence for NBW and VLBW Groups.
Note. NBW = normal birth weight children; VLBW = very preterm and very low-birth-weight children.
Of primary interest was whether the children showed a rightward looking bias for the talking face but not for the silent face, and whether the amount of attention to the right differed between VLBWs and NBWs. An ANOVA (2 (Talking, Silent) × 2 (VLBW, NBW)) was conducted for the proportion of looking time to the right side (Table 2). Results show that the main effect of the stimulus was significant (F(1,69) = 20.39, p < .001, ηp² = .228). However, the VLBW and NBW groups did not differ from each other (F(1,69) = 0.03, ns). For the talking face, the proportion of looking time to the right was significantly greater than .5 in both the NBW and VLBW groups (t(47) = 3.69, p < .001, d = 0.53; t(22) = 2.58, p < .01, d = 0.53, respectively).
The proportion of looking time to the mouth was also analyzed with the same ANOVA. The main effect of the stimulus was significant (F(1,69) = 86.25, p < .001, ηp² = .556). The VLBW and NBW groups did not differ from each other (F(1,69) = 0.15, ns). Regarding the talking face, both NBW and VLBW groups showed a proportion of looking to the mouth significantly different from .5 (t(47) = 4.07, p < .001, d = 0.58; t(22) = 2.36, p = .014, d = 0.49), while in the silent face stimulus, the proportion of looking to the eyes was significantly different from .5 in both groups (t(47) = −10.42, p < .001, d = 1.50; t(22) = −3.28, p = .002, d = 0.68).
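The comparisons against .5 reported above are one-sample t tests, and for such tests Cohen's d equals t divided by the square root of the sample size. The reported pairs are consistent with this relation (e.g., t(47) = 3.69 with n = 48 gives d ≈ 0.53). The sketch below shows the computation on hypothetical proportions. Whether the original tests used raw or transformed proportions is not stated, so the input scale and the function name are assumptions.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, mu=0.5):
    """Return (t, degrees of freedom, Cohen's d) for a test of the mean against mu."""
    n = len(values)
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    t = (m - mu) / (s / math.sqrt(n))
    d = (m - mu) / s
    return t, n - 1, d


# Hypothetical right-bias proportions for five children.
t_value, df, d_value = one_sample_t([0.6, 0.7, 0.5, 0.8, 0.4])
```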
Overlap Task
Mean latencies (ms) and response numbers in each experimental condition and group are presented in Table 3. Results of an ANOVA (2 (with warning, without warning) × 2 (overlap, no-overlap) × 2 (NBW, VLBW)) for latencies showed that the main effects of the warning and overlap conditions were significant (F(1,56) = 4.71, p = .034, ηp² = .078; F(1,56) = 127.56, p < .001, ηp² = .695). That is, latencies in with-warning condition were shorter than those in without-warning condition, and latencies in overlap condition were longer than those in no-overlap condition. Neither a main effect of group nor any interaction was significant.
Table 3. Mean Latencies (ms) and Response Numbers in Each Experimental Condition and Group in the Overlap Task.
Note. Standard deviations are given in parentheses.
NBW = normal birth weight children; VLBW = very preterm and very low-birth-weight children.
The same ANOVA applied to response numbers revealed a significant interaction between warning signal and overlap condition (F(1,56) = 4.05, p = .049, ηp² = .068). That is, in overlap condition, the number of responses decreased in without-warning condition compared to in with-warning condition, while there was no difference between with- and without-warning conditions in no-overlap condition. Regarding main effects, there was a tendency for the number of responses to decrease in overlap condition (F(1,56) = 3.15, p = .081, ηp² = .053) and in the VLBW group (F(1,56) = 3.70, p = .059, ηp² = .062). No other interactions were significant.
As sustained attention is part of the alerting network, which begins to come under the control of the executive attention network at the end of the second year of life, we compared the duration of gazing fixation for 250 ms just before being given the peripheral targets between NBWs and VLBWs (Table 4). The results of an ANOVA (2 (with warning, without warning) × 2 (NBW, VLBW)) for fixation duration revealed a significant interaction (F(1,60) = 7.13, p = .01, ηp² = .106). That is, the VLBW group showed longer fixation duration in with-warning condition compared to in without-warning condition, while the NBW group showed no such difference. Regarding main effects, the fixation duration was longer in the with-warning condition than in without-warning condition (F(1,60) = 4.58, p = .036, ηp² = .071) and in NBWs than in VLBWs (F(1,60) = 10.80, p < .01, ηp² = .071).
Table 4. Mean Fixation Duration (ms) and Trial Numbers in Each Experimental Condition and Group.
Note. Standard deviations are given in parentheses.
NBW = normal birth weight children; VLBW = very preterm and very low-birth-weight children.
As regards executive attention, the NBWs had higher scores on effortful control than the VLBWs (NBW: Mean = 5.13, Range = 3.75–6.20, SD = 0.59; VLBW: Mean = 4.66, Range = 3.08–5.79, SD = 0.69; t(63) = 3.35, p < .001). When we examined its relationship with latencies in overlap condition (Table 3, without warning), longer latencies in overlap condition were associated with high effortful control scores in the NBW group (r = .36, p = .018), whereas there was no significant relationship in the VLBW group (r = .26, p = .33).
Discussion
In this study, we examined VLBW and NBW children in terms of orienting and alerting attention at 42 months as a follow-up to Nakagawa et al. (2023). Results showed that both groups displayed right-side and mouth attention biases toward the talking face stimulus that differed from their looking patterns toward the silent face stimulus. In the overlap task, although main effects in warning-signal and overlap conditions were significant in terms of latencies, there was no difference in the time required to disengage attention between NBWs and VLBWs. On the other hand, response numbers of disengagement decreased slightly in VLBWs. Related to this, one interpretation is that in VLBWs, gaze was less likely to be judged as fixating before the presentation of the peripheral target. In other words, the difference between VLBWs and NBWs observed in the overlap task was not the response latency of attentional disengagement but difficulty in sustaining attention at the first fixation stimulus, which is the premise for the response of disengagement.
Regarding face perception, feature (eye or mouth) bias to the talking and silent faces was replicated. That is, 4-year-old children have shown mouth attention bias toward the talking face and eye attention bias toward the silent face, with both stimuli provided through Gazefinder (Mori et al., 2021). Though mouth bias in speaking images has also been reported in Japanese 5- and 6-year-olds (Mori et al., 2023; Saito et al., 2017), adults show a slightly higher percentage of gaze to the eyes for the same stimuli (Fujioka et al., 2016). On the other hand, it has been reported that although infants’ attention to the mouth increases during native language learning, 12-month-olds show less mouth bias for native than for non-native language (Lewkowicz & Hansen-Tift, 2012; Pons et al., 2015). However, it has also been shown that when children enter the word acquisition phase in the second year of life, they begin to pay more attention to the speaker’s mouth again (Hillairet de Boisferon et al., 2018: 14 and 18 months). Although these studies used longer audiovisual stimuli than the present study, namely 50 s (Lewkowicz & Hansen-Tift, 2012) and 45 s (Pons et al., 2015), and direct comparisons with the present results may be difficult, the present results suggest that NBW and VLBW 3.5-year-old Japanese speakers show mouth bias.
In addition, both NBWs and VLBWs showed rightward attentional bias during the talking video, a finding also reported previously in toddlers (Nakagawa et al., 2023). Using higher-order brain function measurements (fMRI, optical topography), speech perception has been shown to be left-hemispheric-dominant from infancy (Peña et al., 2003: 2–5 days; Perani et al., 2011: 2 days). Thus, rightward attentional bias may be due to activation of the left hemisphere (MacKain et al., 1983: 5–6 months; Mugitani et al., 2011: 8 months). On the other hand, attentional bias toward the left visual field has been reported in audiovisual face processing in adults (Buchan et al., 2008; Everdell et al., 2007), and the McGurk effect was attenuated when the right side of the talker’s mouth was covered, suggesting leftward attentional bias in observers for audiovisual videos (Nicholls et al., 2004). Besides the contribution of the face processing system in the right hemisphere of the observer’s brain, asymmetrical gaze behavior during audiovisual speech in adults could be explained from the viewpoint of left-hemisphere lateralization for language production in the talker’s brain. The intrinsic asymmetric articulation and movement observed in the faces of talkers have led observers to demonstrate leftward attention bias toward all faces, which may be a learned behavior (Everdell et al., 2007). Thus, in early childhood, activation of the left hemisphere of the observer’s brain may result in rightward attentional bias toward the audiovisual video. As they gain social experience, young observers learn that the right side of the talker’s face contains more information, resulting in leftward attentional bias. This may indicate the development of attention from stimulus-driven to top-down orienting. Our results suggest that stimulus-driven orienting to the talking face is still dominant at 3.5 years of age.
In addition, we discuss the results of the overlap task without warning. The overall facilitation of latency in the overlap or non-overlap condition was consistent with our previous longitudinal study using the same task with 6- to 24-month-olds (Nakagawa & Sukigara, 2022). It has been reported that preterm infants are faster than term infants at disengaging attention or shifting gaze from central fixation to peripheral stimuli during the first months of life (Atkinson, 2000: 4–6 weeks; Butcher et al., 2002; Hunnius et al., 2008: 6–26 weeks). Latencies were also found to be shorter at 6 months than at 12 months in both the overlap and non-overlap conditions in the term-infant longitudinal study by Nakagawa and Sukigara (2019), suggesting that the shorter fixation time may reflect the immature brain’s difficulty in inhibiting its response to peripheral stimuli. Butcher et al. (2002), who examined the development of shifts in gaze to a peripheral stimulus between 6 and 26 weeks, further reported that errors seen in term infants were less common in preterm infants after 14 weeks, while preterm infants tended to continue to stare at the fixation point. Comparisons of attentional disengagement between children with autism spectrum disorder (ASD) and typically developing children have been extended into adulthood (Sacrey et al., 2014), but comparisons involving preterm infants extend only to the toddler period (Ginnell et al., 2021; van de Weijer-Bergsma et al., 2008). Our study suggests that when examining disengagement in VLBWs, we should consider the weakness in their ability to continue to focus on the fixation point prior to disengagement.
Next, we consider the arousal network investigated in this study using auditory warning signals. In a similar eye-tracking study, it was reported that the difference in latencies between the with- and without-warning conditions was smaller in preterm than in term toddlers, suggesting poor arousal network function in preterm toddlers (de Jong et al., 2015: 18 months; 2016: 18 months). On the other hand, Jaeger et al. (2021) measured reaction times using a computerized test of attentional performance for children with a response key to examine the effects of the presence or absence of warning signals. Although reaction times in the without-warning condition were significantly longer in preterm than in term children, the benefit of presenting warning signals was greater for the former than for the latter. The authors suggest that preterm children may profit from auditory cues in overcoming deviations in alertness. In Snyder et al. (2007), the effect of auditory warning signals in preschool children aged 4 to 5 years was examined using a computerized Vigilance task. The results showed that the reaction time was significantly longer in VLBWs than in NBWs regardless of the presence or absence of a signal. Part of the present results, namely the amount of time spent looking at the fixation stimulus, was consistent with Jaeger et al. (2021), which found that preterm children may profit from auditory cues to compensate for their poor alertness.
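The warning-signal benefit referred to above can be expressed as a simple difference score: mean latency in the without-warning condition minus mean latency in the with-warning condition. The following is a minimal sketch of that arithmetic only. The function name `warning_benefit`, the group labels, and all latency values are hypothetical and invented for illustration; they are not data or analysis code from the present study or from the cited studies.

```python
from statistics import mean


def warning_benefit(latencies_without_ms, latencies_with_ms):
    """Mean latency without warning minus mean latency with warning (ms).

    A larger positive value means the warning signal shortened latencies more.
    """
    return mean(latencies_without_ms) - mean(latencies_with_ms)


# Hypothetical latencies in milliseconds, invented for illustration only.
group_a = {"without": [400, 420, 440], "with": [350, 360, 370]}
group_b = {"without": [380, 400, 420], "with": [370, 380, 390]}

benefit_a = warning_benefit(group_a["without"], group_a["with"])
benefit_b = warning_benefit(group_b["without"], group_b["with"])

print(f"Group A benefit: {benefit_a} ms")
print(f"Group B benefit: {benefit_b} ms")
```

In this toy example, group A shows the larger benefit, which is the pattern that would be read as greater profit from the warning signal. A real analysis would test the group-by-condition interaction rather than compare raw difference scores.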
Finally, as regards executive attention, the lower effortful control scores in the VLBW group are consistent with previous studies of children born preterm using a questionnaire (Klein et al., 2013: 18–35 months; Shinya et al., 2022: 18 months). In Voigt et al. (2012), in addition to a parent-reported questionnaire, a battery consisting of three delay of reward tasks and one motor inhibition task was applied at 24 months. The very preterm group, but not the moderately to late preterm group, had lower Effortful Control Composite Scores than the full-term group. On the other hand, no differences were found with respect to the temperamental scales. In our study, the relationship between effortful control scores and latency in the overlap condition was positive in NBWs but not significant in VLBWs. The results regarding NBWs are consistent with previous studies (Nakagawa & Sukigara, 2013; Posner & Rothbart, 2018), suggesting a genuine positive association between effortful control and sustained and focused attention. Previous research using neuropsychological tasks measuring effortful control, including a Tongue task (i.e., the child putting a sweet on their tongue while keeping the mouth open without chewing or swallowing the sweet), a Statue task (i.e., maintaining a body position with eyes closed during a 75-s period and inhibiting the impulse to respond to sound distractors), and a Selective visual attention task, found that 42-month-old VLBWs had difficulties expressing specific emotions and exerting effortful control compared to NBWs (Witt et al., 2014). A positive correlation was also reported between effortful control and emotion regulation. Thus, it would be desirable to increase the number of VLBW participants and examine whether a positive relationship between effortful control scores and latency in the overlap condition can be found.
Alternatively, the present result, a positive correlation observed only in NBWs, may suggest differences in brain development between the two groups (França et al., 2024).
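The group-wise association discussed above is a Pearson correlation computed separately within each group. The sketch below shows that calculation under stated assumptions. The helper `pearson_r`, the group labels, and all scores and latencies are hypothetical and invented for illustration; they do not reproduce the data or the statistical software used in the present study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical effortful control scores and overlap latencies (ms),
# invented for illustration only.
effortful_control = [4.0, 4.5, 5.0, 5.5]
latency_group_a = [300, 350, 400, 450]  # rises with effortful control
latency_group_b = [400, 300, 450, 350]  # no linear relation

r_a = pearson_r(effortful_control, latency_group_a)
r_b = pearson_r(effortful_control, latency_group_b)

print(f"Group A: r = {r_a:.2f}")
print(f"Group B: r = {r_b:.2f}")
```

With samples as small as those in this toy example, or as small as the VLBW group in the present study, a non-significant correlation cannot be distinguished from a lack of statistical power, which is why a larger VLBW sample is desirable.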
In any case, even though this is a follow-up study, the small number of participants constitutes a limitation. Moreover, because the face stimuli were chosen from those available in Gazefinder, a supplementary tool used for its ease of use in measuring each child’s gaze pattern toward social information, the presentation times may have been too short to discern left-right differences. Another limitation is that although 3.5 years of age is a period when control by executive attention becomes dominant (Posner et al., 2014), it is highly likely that there is much individual variability (cf. range of scores for effortful control) in NBWs and especially in VLBWs, who are known to develop at many different rates depending on comorbid conditions. Effortful control may need to be assessed not only through parent-reported questionnaires but also through behavioral tasks such as the Stroop-like conflict task, in which a dominant response is suppressed in order to activate a subdominant response. Because only correlational analysis was conducted on this point, we cannot make inferences about causation.
In conclusion, although it appears that VLBW infants catch up by 3.5 years of age, this is not necessarily the case. Thus, future studies will need to investigate whether this is compensated by the development of executive attention. There is evidence of an increased risk of the inattentive subtype of ADHD in Very Preterm (VPT)/VLBW children along with clinical symptoms such as hyperactivity, impulsivity, and conduct problems, which often do not manifest at the behavioral level (Johnson & Marlow, 2011). In other words, a fairly consistent preterm phenotype may manifest as a high rate of subclinical symptoms. It is therefore worth taking a closer look at the development of attention in preterm infants from a neuropsychological perspective.
Footnotes
Acknowledgements
The authors wish to thank all the children and families who took part in this project. We also thank Eri Kuno and Kiho Futamura for their help in running the experiments.
Author Contributions
Nakagawa: Conceptualization, Funding acquisition, Methodology, and Writing: original draft, review & editing. Sukigara: Formal analysis, Software, and Writing: review & editing. Nomura: Data curation and Writing: review & editing. Nagai: Resources, Supervision, and Writing: review & editing. Miyachi: Resources, Supervision, Funding acquisition, and Writing: review & editing.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by JSPS Kakenhi Grant no. 19H01655.
Availability of Data and Materials
Datasets for TD or VLBW toddlers are available from AN or KN, respectively, upon reasonable request.
