Abstract
Attention can be shifted with or without an accompanying saccade (i.e., overtly or covertly, respectively). Thus far, it is unknown how cognitively costly these shifts are, yet such quantification is necessary to understand how and when attention is deployed overtly or covertly. In our first experiment (N = 24 adults), we used pupillometry to show that shifting attention overtly is more costly than shifting attention covertly, likely because planning saccades is more complex. We propose that these differential costs will, in part, determine whether attention is shifted overtly or covertly in a given context. A subsequent experiment (N = 24 adults) showed that relatively complex oblique saccades are more costly than relatively simple saccades in horizontal or vertical directions. This provides a possible explanation for the cardinal-direction bias of saccades. The cost perspective presented here is vital to furthering our understanding of the multitude of decisions involved in processing and interacting with the external world efficiently.
The natural world provides our visual system with rich, sometimes even overwhelming input. Visual attention prioritizes only the most relevant of this information (Posner, 1980). Attention can be shifted in space in two differing ways: either overtly or covertly (i.e., with or without an eye movement, respectively; Helmholtz, 1866/1948; Posner, 1980; Rizzolatti et al., 1987, 1994). Although these two types of attention are well established, it remains open which costs are associated with them. The brain minimizes the expenditure of its limited resources whenever possible (Friston, 2010). A complete understanding of why visual attention is moved in a certain way can thus not be achieved without understanding what the possibly subtle costs of attentional shifts are.
Although pupil size primarily adapts to low-level visual processing, strict control of the stimulus material allows insights into higher-level cognition (Mathôt, 2018, 2020; Strauch et al., 2022). As such, pupil size reflects working memory load or the intensity of processing, and can thus index mental effort (Beatty, 1982; Kahneman, 1973). Because the brain’s resources are intrinsically limited, exertion of mental effort can be considered cognitively costly (Just et al., 2003; Kahneman, 1973). Just et al. (2003) even suggest that pupil dilation is an index of the sum of expenditure of neural resources. This suggestion is plausible because pupil size is closely linked to activity in the noradrenergic locus coeruleus, which has widespread excitatory projections throughout the brain (Aston-Jones & Cohen, 2005; Joshi et al., 2016; Schwarz et al., 2015; Strauch et al., 2022). Through these projections, pupil size captures the effort associated with many facets of cognition and behavior. In line with this, the effort associated with the planning and execution of movements is reflected in pupil size (Naber & Murphy, 2020; Richer & Beatty, 1985; Strauch et al., 2018). Such effects of motor preparation on pupil size scale with the complexity, speed, accuracy and force of the ultimately executed movements (Naber & Murphy, 2020; Richer & Beatty, 1985). Similar results have been reported for overt shifts of attention: The pupil already dilates during saccade programming (Jainta et al., 2011; Wang & Munoz, 2021b). Pupil size thus tracks fluctuations in effort as subtle as the planning of an upcoming eye movement.
What kind of costs could be associated with shifting attention covertly or overtly? Covert shifts of attention are realized by oculomotor programming according to the premotor theory of attention (Rizzolatti et al., 1987, 1994). For covert shifts, such programs are ultimately not executed and likely even actively inhibited, which has been proposed to be effortful (Findlay & Gilchrist, 2003; Helmholtz, 1866/1948). These processes (i.e., oculomotor programming and inhibition) contribute to the costs associated with shifting attention covertly. For overt shifts of attention, different processes may contribute to their underlying costs. Saccades are thought to be made effortlessly (Findlay & Gilchrist, 2003) or to be “cheap” (Theeuwes, 2012, p. 25), yet it is clear that they cannot be cost free. Multiple underlying processes are necessary to realize efficient overt attentional shifting across the environment while ensuring a stable subjective experience of the visual world (Rolfs, 2015; Van der Stigchel, 2020; Van der Stigchel & Hollingworth, 2018). For example, when shifting overtly, attention shifts prior to saccade execution (Deubel & Schneider, 1996). Such presaccadic shifts allow for spatial and predictive remapping to facilitate trans-saccadic stability and continuity (Bays & Husain, 2007; Rolfs, 2015), which is unnecessary for covert shifts because there is no perceptual disruption or change in retinal input (Van der Stigchel & Hollingworth, 2018). Given the multitude of processes underlying saccades (i.e., oculomotor programming, presaccadic attentional shifting, spatial/predictive remapping, and the ultimate execution of the movement) in comparison with covert shifts, it is conceivable that the costs of these shift types differ.
Thus, although much is known about overt and covert shifts of attention, their respective underlying costs are unknown. Here, we assessed the costs of attentional shifts in two experiments using pupillometry.
Open Practices Statement
Deidentified data and data-analysis scripts for both of the experiments have been made publicly available via OSF and can be accessed at https://osf.io/6p3ry/. Neither experiment was preregistered. The materials used in these studies are widely available.
Statement of Relevance
The environment provides humans with much more visual information than the brain can process. Therefore, only the most relevant parts are selected. This selection process is called visual attention. Attention can move in two different ways: overtly (i.e., with eye movements, referred to as saccades) or covertly (without eye movements). What determines whether attention is shifted overtly or covertly? We argue that the cognitive costs associated with a movement underlie that decision, at least in part. Using pupil size as an index of cost, we found that saccades consume more cognitive resources than covert shifts of attention. To determine whether all saccades are equally costly, we assessed the effect of saccade direction. Saccades in diagonal directions were found to be more costly than saccades in horizontal or vertical directions. These findings demonstrate unequal costs for different types of attentional shifts. The cost perspective introduced here may thus inform us about how and when attention is shifted in a certain manner.
Experiment 1
Method
Participants
Twenty-eight participants with normal or corrected-to-normal vision took part in Experiment 1. Four participants did not follow task instructions, so we excluded their data. This left a final sample of 24 for analyses, including the first author (age: M = 22.96 years, range = 18–29; nine males; one left-handed). Participants reported no history of epilepsy, attention-deficit-related disorders, or autism spectrum disorder. Participants were recruited through Utrecht University’s online recruitment platform (SONA Systems) and were compensated with either course credits or €8. The experimental procedure was approved by the ethical review board of Utrecht University’s Faculty of Social Sciences. Sample size was determined on the basis of behavioral work that compared overt and covert attention (Hunt & Kingstone, 2003). This sample size exceeded those of previous saccade/motor preparation pupillometry studies (i.e., Jainta et al., 2011; Naber & Murphy, 2020; Richer & Beatty, 1985; Wang & Munoz, 2021b).
Stimuli
Stimuli consisted of placeholders (“8”), distractors (“2” and “5”), and targets (“3” and “E”; based on Deubel & Schneider, 1996; see Fig. 1). Stimuli were presented at five positions: outer left, inner left, center, inner right, and outer right. Outer stimuli were presented at 27.5° (height = 34.18°, width = 21.91°), inner stimuli were presented at 10° (height = 12.95°, width = 8.3°), and the middle stimulus was presented in the center of the screen (height = 2.96°, width = 1.9°). Sizes of stimuli were scaled on the basis of their eccentricities using the cortical magnification factor approximation described by Rovamo and Virsu (1979) to compensate for relative underrepresentation of the periphery in visual cortex (Rosenholtz, 2016).
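The eccentricity scaling described above can be illustrated with a minimal sketch. It assumes the nasal-visual-field approximation from Rovamo and Virsu (1979), in which relative magnification falls off as 1 / (1 + 0.33E + 0.00007E³) with eccentricity E in degrees; the text does not state which meridian equation was used, and the function names are hypothetical. Under this assumption, the reported stimulus widths are reproduced, and the heights land within roughly 0.05° of the reported values.

```python
def magnification_scale(ecc_deg):
    """Size multiplier relative to fixation (equals 1 at 0 deg eccentricity).

    Assumption: nasal-field approximation of Rovamo and Virsu (1979).
    """
    return 1 + 0.33 * ecc_deg + 0.00007 * ecc_deg ** 3


def scaled_size(size_at_fixation_deg, ecc_deg):
    """Scale a stimulus dimension (in degrees) for a given eccentricity."""
    return size_at_fixation_deg * magnification_scale(ecc_deg)


# Central stimulus: width 1.9 deg, height 2.96 deg (values from the text).
for ecc in (0, 10, 27.5):
    width = round(scaled_size(1.9, ecc), 2)
    height = round(scaled_size(2.96, ecc), 2)
    print(f"eccentricity {ecc} deg: width {width}, height {height}")
```

The widths printed for 10° and 27.5° match the 8.3° and 21.91° given in the text, which suggests (but does not confirm) that this equation was the one applied.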

Fig. 1. Trial structure of Experiment 1. During the baseline period, participants fixated the central stimulus (“8”) in all blocks. During the overt-attention block, participants made saccades toward the target location after cue offset. Crucially, participants never started moving their eyes while the target (“3” or “E”) was presented. In the covert-attention block, participants maintained fixation on the central stimulus, but they did shift their attention toward the target location. In the overt- and covert-attention blocks, single and double arrow cues were presented to indicate near and far target positions, respectively. In the control block, targets were presented at the center stimulus. Participants did not move their eyes or shift their attention in this block. Dimensions do not match those in the experiment but have been changed here for clarity.
To minimize effects of low-level stimulus properties on pupil size, we made stimuli equiluminant with the background throughout trials. To achieve this, we asked participants to complete a flicker fusion calibration. Here, a blue background (hue, saturation, value [HSV] = 240.1.1; 85 cd/m²) was presented continuously while a red color flickered on top of the background at 25 Hz. Participants adjusted the red flicker by moving their mouse across the horizontal plane and then pressed the left mouse button when they determined that the flickering was the least noticeable. This procedure was performed three times, after which we took the average luminance of the red color for the final stimuli throughout the remainder of the experiment (value: M = .48, SD = .10).
Apparatus and eye tracker
Stimuli were presented on an OLED LG TV (1,920 × 1,080 pixels; 50 Hz) using PsychoPy (Version 2020.2.5; Peirce et al., 2019). Gaze position and pupil size were recorded monocularly (right eye, 1000 Hz) using an EyeLink 1000 Plus Tower Mount (SR Research, Mississauga, Ontario, Canada). Participants were positioned in a chin and headrest so that their eyes were 76.5 cm away from the monitor. A 5-point calibration and validation procedure was conducted at the start of the experiment.
Procedure
The main task consisted of three blocks: control, covert attention, and overt attention. Participants had to report the identity of a target (“3” or “E”). Irrespective of the block, every trial started with five red 8s as placeholders, which were presented for 2.5 s to 3.5 s on a blue background. Then, a cue (100% valid) in the central placeholder indicated the target location (Fig. 1). Cues were presented for 500, 700, or 900 ms (random) to make the timing of target onset less predictable and to minimize premature saccades. Afterward, the target was presented (“3” or “E”) at the cued location for 60 ms, and distractors (randomly a “2” or “5”) were presented at the other placeholder locations.
Participants were instructed throughout the experiment on how to attend to the target location depending on the block (Fig. 1). In the control block, neither attentional shifts nor eye movements were necessary because the target was presented in the center (40 trials: 20 near and 20 far central-cue trials; half the number of trials compared with the other blocks). In the covert-attention block (80 trials: 40 inner and 40 outer target locations), participants were instructed to covertly shift their attention to the target location (without eye movements) directly after cue onset. In the overt-attention block (80 trials: 40 inner and 40 outer target locations), participants were required to additionally make an eye movement to the target location after cue offset. After target offset, stimuli changed back into placeholders for 1 s and functioned as a mask. Participants were instructed to keep their gaze (control, overt) or covert attention (covert) at the target location during this 1 s period. At the end of the trial, all placeholders turned into fixation crosses, and participants reported target identity with the left (“3”) and right (“E”) mouse buttons.
The order of blocks was counterbalanced across participants, and 12 practice trials for each block were given prior to the experiment. Participants could take a break after every block.
Data processing and analysis
Behavioral data were analyzed in JASP (2022; Version 0.16.1). Pupil and gaze data were processed and analyzed in Python (2021; Version 3.9.7) using custom scripts. For pupil data, we followed analysis procedures recommended by Strauch et al. (2022). Blinks were interpolated for both pupil size and gaze position. Pupil sizes were baseline corrected by subtracting the median of the first 10 ms after cue onset. To ensure that participants shifted their attention correctly, we discarded all trials with incorrect responses (11.1% of trials). Practice trials were not considered for analyses. Next, for the control and covert-attention blocks, we excluded trials in which participants did not keep their gaze at the center of the screen (> 2.5° from the center; correct trials discarded: control = 11.6%, covert = 12.73%). Furthermore, for the overt-attention block, premature saccades and very slow saccades (< 80 ms or > 550 ms after target onset; 21.4% of correct trials) were discarded. Trials with saccade durations shorter than 10 ms or longer than 110 ms were also excluded (4.9% of correct trials; Nyström & Holmqvist, 2010). We also excluded trials during which participants did not make a saccade (2.7% of correct trials). Two participants were excluded because they almost exclusively made premature saccades. Two other participants were excluded because they did not maintain fixation at the center in the covert-attention block throughout the experiment. This left a total of 1,240, 1,446, and 838 trials in the overt attention, covert attention, and control conditions for the pupil-size analyses, respectively.
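As a rough illustration of two of the preprocessing steps above, the following sketch applies the subtractive baseline correction and the overt-block trial exclusion criteria. It is not the authors' script (their code is on OSF). It assumes that each trace is a list of pupil samples at 1000 Hz beginning at cue onset, so the first 10 samples span the 10 ms baseline, and the function and argument names are hypothetical.

```python
from statistics import median


def baseline_correct(trace, n_baseline=10):
    """Subtract the median of the first n_baseline samples from every sample.

    Assumption: the trace starts at cue onset and is sampled at 1000 Hz,
    so 10 samples correspond to the 10 ms baseline window.
    """
    baseline = median(trace[:n_baseline])
    return [sample - baseline for sample in trace]


def keep_overt_trial(correct, latency_ms, duration_ms):
    """Apply the overt-block exclusion criteria described in the text.

    latency_ms: saccade onset relative to target onset (None if no saccade).
    duration_ms: saccade duration (None if no saccade).
    """
    if not correct:
        return False  # incorrect response
    if latency_ms is None or duration_ms is None:
        return False  # no saccade was made
    if latency_ms < 80 or latency_ms > 550:
        return False  # premature or very slow saccade
    if duration_ms < 10 or duration_ms > 110:
        return False  # implausibly short or long saccade
    return True


print(baseline_correct([5] * 10 + [8, 9]))
print(keep_overt_trial(True, 300, 40), keep_overt_trial(True, 60, 40))
```

Using the median rather than the mean for the baseline keeps a single noisy sample in the window from shifting the whole trace.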
As expected, pupil-size data in the overt-attention block were affected by the pupil foreshortening error, caused by eye rotations relative to the camera. The pupil foreshortening error reflects the decrease in apparent pupil area by up to 10% as a function of the difference in angle between eye and camera orientation (Hayes & Petrov, 2016). Here, we implemented a simple correction for the pupil foreshortening error to allow valid comparisons between the overt-attention and other blocks after saccade onset (see Section S1 in the Supplemental Material available online).
Linear mixed-effects models (LMEMs) were used to analyze pupil responses over time (every 1 ms) from 500 ms before target onset until 550 ms after target onset so that the preparation of all saccades was considered in the analyses (Fig. 2). As a conservative bound for statistical significance, we set α to .01 (corresponding roughly to t > 2.57) for all LMEMs over time (Luke, 2017). Model structure was determined with Akaike-information-criterion-based backward selection on the median pupil size between −100 ms and 100 ms around target onset, always retaining the main effects of interest.
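The correspondence between α = .01 and the t threshold can be checked with a short sketch. It assumes a two-tailed test and a normal approximation to the t distribution, which is reasonable with thousands of trials but is an assumption here; fitting the LMEMs themselves is not shown, and `significant_times` is a hypothetical helper that simply flags time points whose |t| exceeds the threshold.

```python
from statistics import NormalDist

ALPHA = 0.01
# Two-tailed critical value under a normal approximation (about 2.576).
T_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)


def significant_times(t_values, times_ms, t_crit=T_CRIT):
    """Return the time points at which |t| exceeds the critical value."""
    return [time for t, time in zip(t_values, times_ms) if abs(t) > t_crit]


# Illustrative t values only; these are not data from the experiment.
example_t = [1.2, 2.1, 2.7, 3.4, 3.9]
example_times = [-102, -101, -100, -99, -98]
print(round(T_CRIT, 3))
print(significant_times(example_t, example_times))
```

In a time-resolved analysis, a function like this would be applied to the per-millisecond t values of one fixed effect to find where its trace first crosses the threshold.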

Fig. 2. Experiment 1: mean pupillary response over time (a) between attention blocks collapsed over distance conditions and separately for each distance condition in the (b) overt-attention and (c) covert-attention blocks. Horizontal lines at the bottom of (a) indicate significance for each comparison (p < .01). Vertical dashed lines indicate target onset (black) and median saccade onset latency (gray). Error bands indicate ±1 SE. a.u. = arbitrary units.
Results
Pupil responses
We investigated the costs of attentional shifts by comparing pupil responses between shift types (Fig. 2a; fixed effects: shift type + cue duration + accuracy + reaction time + distance; random effect: participants). If overt shifts of attention are more costly than covert shifts, pupil dilation should be enhanced in the overt-attention block compared with the covert-attention block. Indeed, pupil dilation was significantly stronger prior to a saccade compared with a covert attentional shift from approximately 100 ms before target onset (Mdn β = 24.561, range = 13.599–29.752; Mdn t = 3.811, range = 2.582–4.138, p < .01; Fig. 2a). For the first time, these data show a difference in costs between overt and covert shifts of attention. The timing of this effect indicates that the additional cost of a saccade compared with a covert shift is, at least partially, due to processes unfolding before saccade onset. Such processes likely include oculomotor programming, presaccadic shifting, and predictive remapping.
Differences between the overt-attention and control blocks showed a similar pattern, with stronger pupil dilation in the overt-attention block, but the difference emerged later, at approximately 30 ms before target onset (Mdn β = 42.777, range = 18.547–64.472; Mdn t = 5.041, range = 2.585–6.759, p < .01). These effects show that preparing a saccade is costly compared with not shifting attention.
Next, the costs of covert shifts were compared with the control block. The difference between the covert-attention and control blocks reached significance from approximately 315 ms after target onset until the end of the trace (Mdn β = 32.689, range = 22.132–37.273; Mdn t = 3.637, range = 2.623–3.960, p < .01). The larger pupil size in the covert-attention than in the control condition can be cautiously interpreted as costs associated with shifting one’s covert attention along the horizontal meridian, but this effect emerged only after the shift had occurred.
Notably, distance did not significantly predict pupil size at any point in the pupil trace (t < 1.89, p > .059; Figs. 2b and 2c). This indicates that planning an attentional shift to a far location does not seem to require more effort than planning a shift to a near location. Cue duration showed significant effects (Mdn β = 18.030, range = 5.241–24.787; Mdn t = 6.17, range = 2.951–7.385, p < .01) across all analyses for the entire time window, wherein longer cue durations predicted larger pupil sizes. This likely reflects increased recruitment of arousal to complete the upcoming trial (for more detailed analyses and discussion of these effects, see Section S1).
To test whether differences in pupil size between shift types might be driven by task difficulty, we took a number of steps. First, accuracies were included in the model and (marginally) predicted pupil size only from approximately −100 ms to 330 ms relative to target onset (Mdn β = 74.758, range = 53.576–80.456; Mdn t = 2.212, range = 1.882–2.449, p < .05 but never reached p < .01). However, this effect was statistically weak over time compared with the effects of shift type. Second, if accuracy and pupil effects are correlated, this would imply a possible confound of task difficulty on the pupil. We therefore calculated the extent of the pupillary effect (i.e., mean pupil_overt − mean pupil_covert) and the accuracy effect (i.e., accuracy_overt − accuracy_covert) between shift types per participant. Bayesian Pearson correlations indicated that accuracy does not account for the differences in pupil size between the shift conditions (see Section S1; overt–covert: r = –.052, p = .808, Bayes factor [BF]01 = 3.841; overt–control: r = –.338, p = .106, BF01 = 1.150; covert–control: r = .005, p = .981, BF01 = 3.949). Third, reaction times did not significantly predict pupil size at any of the time points (t < .96, p > .33). Together, these analyses rule out the possibility that differences in pupil size between blocks are driven by task difficulty.
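The second step above (per-participant difference scores, then a correlation across participants) can be sketched as follows. This computes only the ordinary Pearson r; the Bayes factors reported in the text are not computed here. All numbers are made-up illustrative values, and the function names are hypothetical.

```python
from math import sqrt


def difference_scores(cond_a, cond_b):
    """Per-participant effect: condition A minus condition B."""
    return [a - b for a, b in zip(cond_a, cond_b)]


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Illustrative values for four hypothetical participants (not real data).
pupil_overt = [120.0, 95.0, 140.0, 80.0]
pupil_covert = [90.0, 85.0, 100.0, 75.0]
acc_overt = [0.85, 0.90, 0.80, 0.95]
acc_covert = [0.88, 0.86, 0.84, 0.93]

pupil_effect = difference_scores(pupil_overt, pupil_covert)
accuracy_effect = difference_scores(acc_overt, acc_covert)
print(pearson_r(pupil_effect, accuracy_effect))
```

A correlation near zero between the two effects would be in line with the argument in the text that accuracy differences do not account for the pupil differences.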
Previous work has shown that a relatively small baseline pupil size is generally associated with a stronger subsequent evoked pupil dilation (Knapen et al., 2016). To assess whether baseline pupil size influenced the differences between shift types, we ran an LMEM (fixed effect: shift type; random effect: participants). Baseline pupil size was significantly larger in the overt-attention than in the covert-attention block (β = 130.17, SE = 11.68, t = 11.15, p < .001) and control block (β = 31.35, SE = 13.48, t = 2.33, p = .020). Furthermore, baseline pupil size was smaller in the covert compared with the control block (β = 98.81, SE = 13.03, t = 7.58, p < .001). If the observed differences between shift types were the consequence of these baseline effects, one would expect the smallest dilatory responses in the overt-attention block and larger ones in the two other conditions. Instead, baseline pupil size was largest in the overt-attention block; still, the subsequent dilation was largest in this block. If anything, the differences in pupil dilation between the overt-attention and the other blocks were therefore underestimated because of baseline effects.
Target identification accuracy
To investigate whether shift types differed in task difficulty, we compared accuracies on the identification task. Distance and shift type were entered into a 2 × 2 repeated measures analysis of variance (ANOVA) to compare accuracies. Accuracy (proportion of correct responses) did not differ between the overt-attention (M = .85, SD = .13) and covert-attention (M = .86, SD = .11) blocks, F(1, 23) = 0.23, p = .636, ηp² = .01. It is therefore unlikely that task difficulty confounded the effect of shift type on pupil size between the overt- and covert-attention blocks. Participants showed lower accuracy in the far compared with the near condition (M = .83, SD = .14 vs. M = .88, SD = .10), F(1, 23) = 5.49, p = .028, ηp² = .19, which is in line with previous findings (Deubel & Schneider, 1996). There was no significant interaction effect between distance and shift type, F(1, 23) = 2.93, p = .100, ηp² = .11. To assess differences between the control block and the other shift types, we ran paired-samples t tests. These showed significantly higher accuracies in the control compared with all other conditions (M = .99, SD = .02), all ts(23) > 4.40, ps < .001, Cohen’s ds > .90 (for details, see Section S1). This shows that the control block was significantly less difficult than the other blocks. Note that we instructed participants to be as accurate as possible and gave them no time limit to respond; for analyses of reaction times, see Section S1. As stated, differences in pupil responses between shift types were not driven by effects of task difficulty (see Pupil Responses).
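The partial eta squared values reported with the ANOVA can be recovered from the F statistics and their degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the three F tests above; the function name is hypothetical, and the effect sizes in the text were presumably taken from JASP output rather than computed this way.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for Experiment 1.
reported = {
    "shift type": (0.23, 1, 23),
    "distance": (5.49, 1, 23),
    "interaction": (2.93, 1, 23),
}
for label, (f_value, df1, df2) in reported.items():
    print(label, round(partial_eta_squared(f_value, df1, df2), 2))
```

Rounded to two decimals, these reproduce the reported effect sizes of .01, .19, and .11.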
Interim Discussion
Together, these data indicate that pupil dilation was stronger for overt than for covert attentional shifts even before saccade onset (vertical gray dashed line in Fig. 2a; Mdn = 345 ms). This suggests that overt attentional shifts are more costly than covert attentional shifts. Given the timing of these effects, we reason that the additional costs of saccades are likely due to programming an eye movement, presaccadic shifting of attention, and predictive remapping. Notably, eccentricity of the target did not affect the effort involved in programming a saccade. This absence of an eccentricity effect may also hold for covert attentional shifts, but this is difficult to assess because of differences in behavioral performance between the distance conditions. Although less costly than saccades, covert attentional shifts were associated with costs comparable with the control (no-shift) block during planning, even when we accounted for possible effects of task difficulty.
Experiment 2
In Experiment 1, we showed that saccades carry a cost beyond that of attentional shifting alone. Given the timing of the pupil effects, such costs possibly followed from effort involved in oculomotor programming, presaccadic shifting, and/or predictive remapping. Next, we investigated whether saccades could differ in cost on the basis of the complexity of the motor program of the saccade. To this end, we compared saccades in different directions in Experiment 2.
Distinct neuronal populations in frontal eye fields and superior colliculus are involved in programming and executing vertical or horizontal saccades. However, there are no “oblique saccade-sensitive” neurons in either of these brain areas (Bruce et al., 1985; Segraves & Park, 1993). Thus, programming and executing oblique saccades require both vertical and horizontal saccade-sensitive neurons in frontal eye fields and superior colliculus—as well as communication between these two direction commands. Such saccade-related activity in frontal eye fields and superior colliculus is in turn reflected in pupil size (Lehmann & Corneil, 2016; Strauch et al., 2022; Wang & Munoz, 2021a). On the basis of this, we expected that oculomotor programming of oblique saccades is more complex and should thus be more costly than such programming of vertical or horizontal saccades.
Method
Methodology and procedures were identical to those in Experiment 1 unless otherwise specified.
Participants
In total, 31 participants took part in Experiment 2. The recruitment procedure and the compensation for participation were the same as in Experiment 1. Seven participants did not follow task instructions (i.e., almost all saccades they made were premature), so we excluded their data. Data from 24 participants (age: M = 21.75 years, range = 18–26; six males; three left-handed), including the first author, were included in the analyses (as in Experiment 1).
Apparatus and stimuli
Both eyes were tracked, resulting in a maximum recording rate of 500 Hz instead of 1000 Hz. Also, stimulus sizes were now kept constant because their distance from fixation was kept equal across conditions (height = 2.96°, width = 1.9°). In addition to the central stimulus, eight other stimuli were circularly presented around the center (radius = 10°; luminance value: M = .47, SD = .15). The cues were adjusted so they indicated saccade directions (Fig. 3).

Fig. 3. Trial structure of Experiment 2. During the baseline period, participants fixated the central stimulus (“8”). Cues indicated the saccade direction (oblique, horizontal, or vertical) and target location. Saccades were executed upon cue offset as fast as possible toward the target location. Participants then fixated the target location for 1 s. Lastly, participants indicated the target identity (“3” or “E”). Dimensions do not match those in the experiment but have been changed here for clarity.
Procedure
Participants now needed to shift their attention overtly in different directions for all trials: obliquely, vertically, or horizontally toward the target location. Duration of cue presentation was now randomized between 500 ms and 1,000 ms to make the timing of target onset less predictable and to prevent participants from making premature saccades (as in Deubel & Schneider, 1996). The task consisted of 200 trials (25 per possible target location) in a random order (i.e., mixed instead of blocked conditions) that were preceded by 16 practice trials. Participants took a break after every 50 trials to avoid fatigue.
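The trial composition described above can be sketched as a randomized trial list. Several details are assumptions rather than facts from the text: the eight locations are placed at 45° steps around fixation, cue durations are drawn uniformly and continuously between 500 and 1,000 ms, and target identity is drawn at random on every trial. The names `ANGLES_DEG`, `direction_of`, and `build_trials` are hypothetical.

```python
import random
from collections import Counter

# Assumption: eight target locations at 45-degree steps on a 10-degree circle.
ANGLES_DEG = [0, 45, 90, 135, 180, 225, 270, 315]


def direction_of(angle_deg):
    """Classify a target angle as horizontal, vertical, or oblique."""
    if angle_deg % 180 == 0:
        return "horizontal"
    if angle_deg % 90 == 0:
        return "vertical"
    return "oblique"


def build_trials(trials_per_location=25, seed=0):
    """Build a shuffled (mixed, not blocked) list of trial dictionaries."""
    rng = random.Random(seed)
    trials = []
    for angle in ANGLES_DEG:
        for _ in range(trials_per_location):
            trials.append({
                "angle_deg": angle,
                "direction": direction_of(angle),
                "cue_ms": rng.uniform(500, 1000),  # assumed uniform draw
                "target": rng.choice(["3", "E"]),  # assumed random identity
            })
    rng.shuffle(trials)
    return trials


trials = build_trials()
counts = Counter(trial["direction"] for trial in trials)
print(len(trials), dict(counts))
```

With four oblique locations and two each for horizontal and vertical, a design like this yields twice as many oblique as horizontal or vertical trials per participant.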
Data processing and analysis
Although both eyes were measured, only pupil and gaze data for the right eye were analyzed. Data were processed in the same manner as in the overt-attention block from Experiment 1. Trials with incorrect responses (8.8%), with slow or fast saccade onsets (14.1%), with very short or long saccade durations (2.6%), and without a saccade (4.2%) were discarded. This left a total of 1,666, 862, and 848 trials in the oblique, horizontal, and vertical conditions, respectively.
Results
Pupil responses
To investigate whether oblique saccades are more costly than saccades in cardinal directions, we analyzed pupil responses with LMEMs over time (Fig. 4; fixed effects: direction + cue duration + accuracy + reaction time; random effect: participants). If oblique saccades are more costly because of increased complexity of oculomotor programming, pupil dilation should be enhanced for oblique compared with cardinal saccades. In line with our hypothesis, pupil dilation was greater preceding oblique compared with horizontal and vertical saccades (Fig. 4). Specifically, the pupil dilated more preceding oblique than vertical saccades from approximately 170 ms prior to target onset (Mdn β = 32.689, range = 22.132–37.273; Mdn t = 3.637, range = 2.623–3.960, p < .01). The difference between oblique and horizontal saccades reached significance approximately 15 ms prior to target onset (Mdn β = 9.051, range = 6.163–12.269; Mdn t = 3.317, range = 2.581–4.162, p < .01). No significant differences in pupil size were found between the horizontal and vertical conditions (t < 1.057, p > .290). This demonstrates that oblique saccades are more costly than saccades in cardinal directions, likely because of a more complex underlying oculomotor program.

Fig. 4. Experiment 2: mean pupillary response over time for the different saccade directions. Horizontal lines at the bottom of the graph indicate significance for each comparison (p < .01). Vertical dashed lines indicate target onset (black) and median saccade onset latency (gray). Error bands indicate ±1 SE. a.u. = arbitrary units.
As in Experiment 1, cue duration significantly predicted pupil size, wherein longer cue durations were associated with a larger pupil size from approximately 160 ms prior to target onset until the end of the trace (Mdn β = 46.222, range = 14.773–60.195; Mdn t = 6.599, range = 2.586–8.070, p < .01). Note that we included accuracies in the model to account for the small differences in task difficulty between the direction conditions (see below), and this never significantly predicted pupil size (t < 1.113, p > .265). Reaction time also did not predict pupil size at any of the time points (t < 1.627, p > .103). Inclusion of saccade amplitude and onset latency in the model yielded highly similar results (see Section S2 in the Supplemental Material), illustrating the robustness of the direction effect.
Target identification accuracy
Following the same logic as in Experiment 1, we compared accuracies to investigate possible influences of task difficulty on pupil size. A repeated measures ANOVA showed a difference in accuracies between saccade directions, F(2, 46) = 4.82, p = .013, ηp² = .17. Follow-up paired-samples t tests showed that accuracies were lower in the oblique condition (M = .89, SD = .09) compared with the vertical condition, t(23) = 3.77, p = .001, 95% confidence interval (CI) for ΔM = [.016, .056], Cohen’s d = 0.77, and horizontal condition, t(23) = 2.69, p = .013, 95% CI for ΔM = [.009, .071], Cohen’s d = 0.55. Accuracy in the vertical and horizontal conditions (M = .93, SD = .09 vs. M = .92, SD = .08, respectively) did not differ significantly, t(23) = 0.24, p = .809, 95% CI for ΔM = [−.031, .039], Cohen’s d = 0.05. Together, these analyses indicate that oblique trials were more difficult than trials in cardinal directions. However, these differences in performance were numerically small, and accuracies in the oblique condition were still high (for more details, see Section S2). No significant differences were found in reaction times between the direction conditions, F(2, 46) = 0.286, p = .753, ηp² = .012. Accuracies and reaction times also never significantly predicted pupil size in the LMEMs reported in the previous section, ruling out a confound of task difficulty.
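The Cohen's d values for the paired comparisons above are consistent with the common paired-samples convention d = t / √n, with n = 24 participants. The sketch below checks this; that the effect sizes were actually computed this way is an assumption (it matches how paired-samples effect sizes are typically reported by statistics packages), and the function name is hypothetical.

```python
from math import sqrt


def cohens_d_paired(t_value, n_participants):
    """Cohen's d for a paired-samples t test, computed as t / sqrt(n)."""
    return t_value / sqrt(n_participants)


# t values as reported for Experiment 2 (n = 24, so df = 23).
comparisons = {
    "oblique vs. vertical": 3.77,
    "oblique vs. horizontal": 2.69,
    "vertical vs. horizontal": 0.24,
}
for label, t_value in comparisons.items():
    print(label, round(cohens_d_paired(t_value, 24), 2))
```

Rounded to two decimals, these reproduce the reported values of 0.77, 0.55, and 0.05.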
Interim Discussion
These results show that it is more costly to program oblique compared with horizontal and vertical saccades. This effect cannot be explained by task difficulty, saccade amplitude, or saccade onset latency. The difference in costs between cardinal and oblique saccades is likely due to additional expenditure of neural resources to realize more complex oculomotor programs for oblique eye movements.
General Discussion
In two experiments, we investigated the costs of shifting attention, indexed with pupillometry. In Experiment 1, we showed for the first time that shifting attention overtly is more costly than shifting attention covertly. Crucially, this effect emerged before the execution of a saccade, potentially reflecting costs associated with oculomotor programming, presaccadic attentional shifting, and predictive remapping. In Experiment 2, we took this a step further and asked whether saccades in different directions are equally costly. We showed that oblique saccades are more costly than cardinal saccades, reflecting the higher complexity of the underlying oculomotor program of oblique saccades.
Deciding how, whether, and where to move visual attention is arguably the most frequent decision that humans make. To understand the factors determining the outcome of this decision, we must know the value of one principal variable in the equation: cognitive cost. The findings presented here show that voluntary saccades are indeed cheap (Findlay & Gilchrist, 2003; Theeuwes, 2012), but not free. Saccades are performed very frequently: On average, humans execute a saccade roughly every 350 ms (Henderson, 2003). Although each individual saccade is associated with subtle costs, these will add up greatly over time. Therefore, covert shifts in attention might represent a cost-efficient alternative to eye movements. Indeed, covert shifts of attention increase acuity and contrast sensitivity and have been shown to aid selection of upcoming saccade targets (Carrasco, 2006; Findlay & Blythe, 2009; Li et al., 2021). On the basis of our results, we propose that the eyes are moved when a task requires foveal vision; otherwise, a cheaper covert shift might be preferred, and the task will be completed using peripheral vision. It remains to be investigated how the results generalize to clinical populations as well as to much younger or older participants. Moreover, the current data are limited to voluntary shifts of attention, and future work should investigate the potentially differing costs associated with reflexive shifting (see Li et al., 2021). Additionally, pupillometry requires strict control over low-level stimulus features such as brightness. Follow-up studies could, however, investigate the presently reported planning costs without some of these constraints by using different physiological and neural measures of effort. This could also reveal how such costs vary as a function of target contrast or other visual features. Mental effort may then even be measured as a biological or neural cost: Wiehler et al. (2022) showed that exertion of mental effort throughout the day causes fatigue, which is linked to weaker task-evoked pupil dilations and increased accumulation of glutamate in the lateral prefrontal cortex.
One could argue that the observed differences in cost between the overt- and covert-attention conditions contradict the premotor theory of attention (see Li et al., 2021; Smith & Schenk, 2012), which assumes that a covert shift of attention is mandatorily accompanied by an oculomotor program (Rizzolatti et al., 1987, 1994). In such an explanation, the increased costs in the overt condition are due to a lack of an oculomotor program in the covert-attention condition. Although the current data do indeed show a difference in costs between overt and covert shifts, the increased costs in the overt-attention condition could also be explained by costs associated with predictive remapping and presaccadic shifting involved in preparing an eye movement. Because the current data cannot quantify the cost of each of these potential contributing subprocesses, they cannot directly support or contradict the premotor theory of attention. Follow-up experiments may provide such insights using similar approaches.
Although the above illustrates how attention is deployed (overtly vs. covertly), an additional decision in the oculomotor process is the direction in which overt attention is moved. Multiple studies have reported a cardinal bias for the direction of saccades (Foulsham & Kingstone, 2010; Gilchrist & Harvey, 2006), and similar patterns have been reported for microsaccades (Engbert & Kliegl, 2003). The observed additional costs of oblique compared with cardinal saccades in our Experiment 2 offer a potential explanation for this. Oblique saccades might be performed only rarely in comparison with cardinal eye movements because they are inherently more costly. Whether the planning costs assessed here indeed predict saccadic direction behaviors remains to be tested in future work.
Pupil size can reveal differences in the complexity of oculomotor programming of oblique, vertical, and horizontal saccades. This further underlines the idea that pupil size may be used as a sensitive index of cost for mental processes, including the planning of motor movements (Just et al., 2003; Kahneman, 1973; Naber & Murphy, 2020; Richer & Beatty, 1985; Strauch et al., 2022). Our data show that pupil size may index motor programming more on a neurocognitive than on a muscular level. To illustrate, performing oblique or vertical saccades requires (at least) two muscle groups, whereas horizontal saccades generally require only one muscle group (Goffart, 2009; Viviani et al., 1977). Thus, if saccade complexity scales with the number of muscles necessary to execute the movement, costs of oblique and vertical saccades should both outweigh those of horizontal saccades. On a neural level, however, oblique saccades are more complex than both vertical and horizontal saccades, requiring the activation of both vertical and horizontal saccade-sensitive neuronal populations in frontal eye fields and superior colliculus—and coordination between them (Bruce et al., 1985; Segraves & Park, 1993). The fact that we observed an additional cost of oblique compared with cardinal saccades indicates that pupil size may thus more closely reflect the costs of the neurocognitive aspect of motor planning and coordination than solely the number of muscles required to perform an upcoming movement.
Taken together, the cost account of visual attention that we have presented here not only quantifies the costs of attentional shifts but also provides information about how and where attention is deployed (see also Attneave, 1959; Garner, 1962; Just et al., 2003). Assessing the costs of attentional shifts also provides insight into more complex decisions in visual processing. To complete everyday tasks, humans can choose between different tools from the cognitive repertoire. For example, in some situations, a task can be completed either by shifting attention to sample (or resample) the external world or by storing information internally in visual working memory (Van der Stigchel, 2020). The quantification of the costs associated with shifting attention to the external world allows us to model how and when humans shift their attention in space or prefer to rely on internal representations. Humans constantly need to weigh differentially costly options to optimally process and interact with the rich visual environment surrounding them. We argue that the cost perspective offered here is crucial to further understanding the decisions that shape visual processing from the lowest to the highest level.
Supplemental Material
Supplemental material, sj-pdf-1-pss-10.1177_09567976231179378 for The Costs of Paying Overt and Covert Attention Assessed With Pupillometry by Damian Koevoet, Christoph Strauch, Marnix Naber and Stefan Van der Stigchel in Psychological Science
Footnotes
Transparency
Action Editor: Daniela Schiller
Editor: Patricia J. Bauer
Correction (August 2023):
This article has been updated with revised Figure 2A (the word “Cotrol” was corrected to “Control”) and 2B (“Covert far” was corrected to “Overt Far”). The legend in Figure 4 has been moved, but the content is not changed.