Abstract
Rapid and seemingly effortless word recognition is a virtually unquestioned characteristic of skilled reading, yet the definition and operationalization of the concept of cognitive effort have proven elusive. We investigated the cognitive effort involved in oral and silent word reading using pupillometry among adults (Experiment 1,
One of the most distinctive characteristics of skilled reading is the sheer speed and apparent effortlessness of word recognition. Among reading researchers, there is a broad consensus that fast, near-effortless recognition of printed words (often termed
The measurement of word-reading accuracy and speed is relatively straightforward and has consequently become universal practice in the assessment of reading skill. On the other hand, researchers have yet to reach a consensus on the definition and operationalization of the concept of effort, underscoring the ambiguity of broader constructs such as fluency and automaticity (Logan, 1997; Megherbi, Elbro, Oakhill, Segui, & New, 2018; Moors & De Houwer, 2006; Reynolds & Besner, 2006; Share, 2008; Stanovich, 1990), of which effortlessness is a core property (Kuhn, Schwanenflugel, & Meisinger, 2010; Logan, 1997).
Logan (1997) described effortlessness as the ability to perform a task with a sense of ease while performing other tasks concurrently. Although Logan’s definition has been adopted by many reading researchers (e.g., Kuhn et al., 2010), the operationalization of effortlessness has generated a variety of controversial techniques such as the dual-task paradigm (Pashler, 1994) and the Stroop task (Labuschagne & Besner, 2015; Megherbi et al., 2018). These techniques have proven useful in capturing the outcomes of learning. However, they do not provide online, moment-to-moment insight into the dynamics of word recognition that is at the heart of current item-based models of printed-word learning (Ehri, 2014; Logan, 1988; Perry, Zorzi, & Ziegler, 2019; Share, 1995), which emphasize the micro changes in the reading process that occur as individual words are encountered during reading.
This lacuna casts a shadow over all research concerned with the development of word-reading skill, efficiency, fluency, and automaticity. In the present investigation, we took a first step toward redressing this situation by exploring the use of pupil dilation to study the critical yet neglected issue of cognitive effort in word recognition.
Pupil Dilation as a Measure of Cognitive Effort
The connection between pupil dilation and cognitive activity was noted more than 100 years ago (e.g., Mentz, 1895, cited in Kahneman, Tursky, Shapiro, & Crider, 1969). However, the revival of pupillometry as a measure of cognitive effort occurred only in the 1960s and 1970s (Beatty, 1982; Kahneman, 1973; Kahneman et al., 1969). Since then, pupil dilation has proven to be a sensitive and reliable measure of cognitive effort in a variety of domains including language, memory, decision-making, emotion, and cognitive development (Sirois & Brisson, 2014; van der Wel & van Steenbergen, 2018). While a cognitive task is performed, mental effort arouses the sympathetic system and, correspondingly, pupil diameter increases (Eckstein, Guerra-Carrillo, Singley, & Bunge, 2017). Kahneman (1973) argued that pupillometry is “the best single index” (p. 18) of effort because it captures within-task, between-task, and between-individual variation (for a review, see Beatty, 1982). Surprisingly, pupillometry has been conspicuously absent in reading research. We were able to locate only a handful of pupillometric studies of reading over the past half century since cognitive scientists rediscovered pupillometry.
Carver (1971) was probably the first to study the connection between reading and pupil size. In a study of text-reading difficulty among undergraduates, Carver found no evidence of variation in pupil dilation across difficulty levels. However, he recorded pupil size at a small number of randomly varying text locations, and the identity of specific words fixated was left uncontrolled. Moreover, by using different locations, Carver did not control for changes in gaze angle, thereby overlooking the foreshortening effect that causes the recorded pupil size to diminish as a result of rotation of the eye (Hayes & Petrov, 2016). Since Carver’s work, and perhaps owing to his disappointing results, only a handful of studies have used pupillometry in reading research, typically focusing on sentence reading among highly skilled readers (e.g., Fernández, Biondi, Castro, & Agamenonni, 2016; Just & Carpenter, 1993).
Statement of Relevance
The hallmark of expertise in many skill domains is the speed and apparent effortlessness of task execution. Yet mastering a skill typically starts out with slow, effortful, unskilled performance, gradually shifting with practice toward expert levels. In the case of reading, novices start off reading individual words, often letter by letter, whereas for skilled readers, reading is fast and even automatic. These characterizations come from behavioral measures of accuracy and speed, often labeled fluency. By contrast, direct measurement of the effort involved in reading has been largely neglected. In this investigation, we examined the effort involved in word recognition by analyzing changes in pupil dilation among skilled adult readers and elementary school children. We found that readers in each age group invested more cognitive effort in reading unfamiliar compared with familiar words, in both oral and silent reading. This approach to quantifying effort opens up new possibilities for studying allocation of effort in a range of domains of skill learning.
Among the few studies of word recognition that used pupillometry, Kuchinke, Võ, Hofmann, and Jacobs (2007) found that peak pupil dilation in a lexical decision task was higher for low-frequency words compared with high-frequency words. Another study, conducted by Fernández et al. (2016), examined sentence processing but also looked at word length, predictability, and frequency. The results showed that mean pupil dilation was larger for longer words and smaller for more frequent and predictable words. In another study looking at the processing of single words, Mathôt, Grainger, and Strijkers (2017) found that spoken and written words conveying a sense of darkness (e.g.,
We report four experiments examining the question of effort in word reading through the lens of pupillometry among both skilled adult readers and elementary school children. Because we were interested in reading in general, for each age group, we examined the question of cognitive effort in both oral and silent word reading.
The theoretical framework for the present study is the unfamiliar-to-familiar/novice-to-expert developmental framework outlined by Share (2008). This theory posits a fundamental and universal within-item developmental transition from unfamiliar to familiar (Share, 2008). Because every printed word is, at one point, unfamiliar, the reader must possess some means of independently deciphering novel words and morphemes. The need to identify unfamiliar printed words is crucial for the novice and expert reader alike, because a majority of words have very low frequencies and are rarely encountered in print. On the other hand, the reader must eventually be able to achieve a high degree of unitization, or “chunking,” of letter strings to enable the rapid, near-effortless recognition of familiar words and morphemes via direct memory retrieval (LaBerge & Samuels, 1974; Logan, 1988, 1997; Perfetti, 1985; Perry et al., 2019).
Because our investigation ventured into largely uncharted waters, we took a measured step-by-step approach by first asking whether differences between reading familiar words (real words) and unfamiliar letter strings (pseudowords) are reflected in cognitive effort as measured by changes in pupil size. In each experiment, we predicted that pseudowords would require significantly more effort to read than real words, as indicated by greater overall pupil dilation, higher maximum (peak) dilation, and longer latencies to peak dilation. We also included the standard behavioral measures of pronunciation accuracy and response latencies, anticipating slower responses and lower accuracy for pseudowords. In addition, we examined the length effect, which is widely regarded as reflecting the serial letter-by-letter processing typical of pseudowords. We predicted a familiarity-by-length interaction, in which length effects on both behavioral and pupillometric measures would be greater for pseudowords than real words, as reported previously for response times by Weekes (1997).
Experiment 1: Oral Reading (University Students)
Method
Participants
Because it was not possible to rely on prior research to determine the required sample size, we conducted a power analysis (using G*Power Version 3; Faul, Erdfelder, Lang, & Buchner, 2007) with power set at .80, a relatively conservative alpha of .01, and an intermediate effect size (
Design
The experiment had a fully within-subjects 2 × 2 design with two levels of familiarity (unfamiliar letter strings [pseudowords] vs. familiar real words) and two lengths (three letters vs. five letters). Each of the four conditions contained 40 random items (i.e., 160 target stimuli). The inclusion of an additional 80 fillers yielded a total of 240 trials. These were divided into four blocks, each block containing 20 pseudowords (10 of each length), 20 real words (10 of each length), and 20 fillers. Each stimulus appeared only once during the experiment. Yoked pairs of target stimuli (a real word and its matched pseudoword) were separated by an intervening block (Blocks 1 and 3 or 2 and 4).
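The trial counts in this design can be verified with a short sketch. The dictionary keys below are illustrative labels, not names taken from the study materials.

```python
# Per-block composition as described in the Design section.
# Keys are (stimulus type, length in letters); the labels are illustrative.
BLOCK = {
    ("pseudoword", 3): 10,
    ("pseudoword", 5): 10,
    ("real_word", 3): 10,
    ("real_word", 5): 10,
    ("filler", None): 20,
}
N_BLOCKS = 4

trials_per_block = sum(BLOCK.values())
total_trials = trials_per_block * N_BLOCKS

# Items per experimental condition across the whole session (fillers excluded).
targets_per_condition = {
    cond: n * N_BLOCKS for cond, n in BLOCK.items() if cond[0] != "filler"
}
total_targets = sum(targets_per_condition.values())
total_fillers = BLOCK[("filler", None)] * N_BLOCKS

print(trials_per_block, total_trials, total_targets, total_fillers)
```

Each of the four conditions ends up with 40 items, which matches the 160 targets plus 80 fillers reported above.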
Stimuli
Target stimuli
The examination of reading by pupillometry poses a number of challenges regarding potential confounds with luminance because the pupil’s response to light is larger than the pupil’s cognitive response (Granholm & Steinhauer, 2004). We maintained identical luminance levels across conditions by creating yoked pairs of target stimuli (i.e., pairs of real words and pseudowords). We first compiled a list of common real words. For each length (three and five letters), 75 pointed (fully vocalized) words were selected from two academic frequency-based word lists in Hebrew (Balgur, 1968; Mahelman, Rozen, & Shaked, 1960). Our aim was to include words that would be familiar not only to adults but also to children. Only high-frequency words were included, covering various parts of speech. The available corpuses (Balgur, 1968; Mahelman et al., 1960), however, have two shortcomings: They are old and possibly outdated; furthermore, they may not reflect printed word frequencies. To validate the frequency of candidate items, we asked 17 teachers currently teaching fourth- to sixth-grade classes to respond to an online questionnaire containing two separate lists of words: three-letter words and five-letter words. Each list contained 100 words: the 75 high-frequency candidate words and another 25 rare words (i.e., low-frequency words, selected from the lists of Mahelman et al., 1960, and Balgur, 1968). Using a five-point Likert-type scale, teachers were asked to evaluate, “How many times a student in 4th-6th grade would have seen the printed word?” Response options were
Next, for each candidate (high-frequency) real word, we created a matched pseudoword by scrambling the letters while preserving the vowel diacritics (e.g., שֶׁלֶג [ʃɛlɛɡ], the word for
Filler words
Twenty filler words, representing a variety of parts of speech and length (i.e., two to eight letters), were added to each block to provide a more ecologically valid range of word frequencies and minimize possible strategic artifacts that can arise when the set of experimental stimuli includes a large proportion of pseudoword stimuli. From the viewing distance of 57 cm, these stimuli subtended a visual angle of 1.00° to 1.61° for height and 1.31° to 5.82° for width. All stimuli were centered and presented in white text (RGB values = 255, 255, 255) on a gray background (RGB values = 128, 128, 128).
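For readers who wish to relate the reported visual angles to physical stimulus size, the standard relation between size, viewing distance, and visual angle can be sketched as follows. The function names are our own, and the physical sizes it prints are derived from the reported angles rather than reported in the study.

```python
from math import atan, degrees, radians, tan

VIEWING_DISTANCE_CM = 57.0  # viewing distance stated in the text


def visual_angle_deg(size_cm, distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle (degrees) subtended by an object of a given size."""
    return degrees(2 * atan(size_cm / (2 * distance_cm)))


def size_from_angle_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical size (cm) that subtends a given visual angle."""
    return 2 * distance_cm * tan(radians(angle_deg) / 2)


# Reported ranges: height 1.00 to 1.61 degrees, width 1.31 to 5.82 degrees.
for label, angle in [
    ("min height", 1.00),
    ("max height", 1.61),
    ("min width", 1.31),
    ("max width", 5.82),
]:
    print(f"{label}: {angle:.2f} deg is about {size_from_angle_cm(angle):.2f} cm")
```

At a viewing distance of 57 cm, 1° of visual angle corresponds to roughly 1 cm on the screen, which is why this distance is a common choice.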
Procedure
The data were collected in a dimly illuminated sound-reduced room at the Edmond J. Safra Brain Research Center for the Study of Learning Disabilities at the University of Haifa. Participants were asked to read aloud all letter strings (words, pseudowords, and fillers), which were presented one at a time on a computer screen. Each block began with an instruction screen, and the participant was asked to read the displayed word aloud. Participants were informed that the printed word would disappear automatically. Two practice trials were then presented. After calibration and validation, a drift correction was displayed and the block began.
Figure 1 illustrates the procedure. Each trial commenced with a central fixation cross presented for 500 ms, followed by a gray fixation screen. The fixation screen presented a string of Xs—the same as the number of characters in the upcoming string—to avoid luminance confounding. This fixation screen appeared for 1,000 ms and was followed immediately by the stimulus word. Because pupil-size changes are characterized by much slower responses than typical behavioral measures such as reaction times (Partala & Surakka, 2003), stimuli remained on the screen for 3,300 ms. The trial ended with a blank screen displayed for 1,500 ms. Pronunciation onset latencies were recorded by a voice key. Each pronunciation was also audio-recorded. During the task, however, errors were manually documented by the tester, who sat behind the participant in front of the host computer.
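The trial timeline can be laid out explicitly as a minimal sketch. The event names are illustrative, and the analysis window is the one used for the pupil data in the Results (stimulus onset to 1,000 ms after stimulus offset).

```python
# Durations (ms) of the successive screens in one Experiment 1 trial,
# as described in the Procedure. Event names are illustrative.
EVENTS = [
    ("fixation_cross", 500),
    ("x_string_screen", 1000),
    ("stimulus", 3300),
    ("blank", 1500),
]


def build_timeline(events):
    """Return {name: (onset_ms, offset_ms)} relative to trial start."""
    timeline, t = {}, 0
    for name, duration in events:
        timeline[name] = (t, t + duration)
        t += duration
    return timeline


timeline = build_timeline(EVENTS)
trial_length_ms = timeline["blank"][1]

stim_on, stim_off = timeline["stimulus"]
# Pupil analysis window: stimulus onset to 1,000 ms after stimulus offset.
analysis_window_ms = (stim_off - stim_on) + 1000

print(trial_length_ms, analysis_window_ms)
```

The analysis window therefore extends 1,000 ms into the 1,500-ms blank screen that ends each trial.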

Fig. 1. Example trial sequence in Experiment 1. After viewing a string of Xs (presented to avoid luminance confounding), participants saw a stimulus consisting of a real word or a pseudoword, which they were asked to read aloud.
Apparatus
Pupillometry data were recorded with an EyeLink 1000 Plus (SR Research, Kanata, Ontario, Canada), a video-based eye tracker with a sampling rate of 1,000 Hz. The experimental materials were presented using the EyeLink’s Experiment Builder software. Participants wore headphones (HS-11V stereo headphones with microphone, SilverLine, China), placed their chin on a chin rest, and adjusted the microphone to their mouth. Next, participants were asked to pronounce a sample word (the word שָׁלוֹם [ʃɑlom], which means
Statistical analysis
Pupil-data analysis
We used the divisive baseline-correction method (percentage of relative change = 100 × pupil size/baseline) for analyzing changes in pupil size (e.g., Binda, Pereverzeva, & Murray, 2014). Raw pupil data were analyzed using CHAP software (Hershman, Henik, & Cohen, 2019). For each trial,
Rather than rely on a single dependent measure, we included several common parameters of pupillary responses with the aim of obtaining converging findings across multiple measures. Consequently, we used average pupil dilation as an overall index of the amount of cognitive effort invested in reading each item (Kahneman et al., 1969), peak (maximum) pupil dilation as an indicator of the maximum invested effort, and peak latency (the time elapsed from stimulus onset to peak dilation) as a reflection of processing speed (Zekveld, Kramer, & Festen, 2011).
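To make the three pupillary measures concrete, the following is a minimal sketch of how they could be computed for a single trial. It is not the CHAP implementation; the function names are hypothetical, the baseline value is passed in because this sketch makes no assumption about how it is defined, and missing samples (e.g., blinks) are assumed to have been handled beforehand.

```python
def relative_change(samples, baseline):
    """Divisive baseline correction: 100 * pupil size / baseline (percent)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return [100.0 * s / baseline for s in samples]


def pupil_measures(samples, baseline, sampling_rate_hz=1000):
    """Mean dilation, peak dilation, and peak latency (ms) for one trial.

    `samples` are raw pupil sizes beginning at stimulus onset.
    """
    rel = relative_change(samples, baseline)
    peak = max(rel)
    peak_index = rel.index(peak)  # first sample at which the maximum occurs
    ms_per_sample = 1000.0 / sampling_rate_hz
    return {
        "mean": sum(rel) / len(rel),
        "peak": peak,
        "peak_latency_ms": peak_index * ms_per_sample,
    }


# Toy trial: baseline of 1000 (arbitrary units), five samples at 1,000 Hz.
example = pupil_measures([1000, 1020, 1050, 1040, 1010], baseline=1000)
print(example)
```

With this formula, a value of 100 indicates no change from baseline, and values above 100 indicate dilation.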
Response time analyses
Naming latencies were included in the analysis only when both members of a yoked pair were pronounced correctly. Response times more than 2 standard deviations above or below the participant's mean were excluded. Finally, for each of the four experimental conditions, response times were averaged within participants.
Accuracy analyses
For each participant, we calculated the percentage of target stimuli pronounced correctly in each of the four experimental conditions.
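The behavioral preprocessing steps can be sketched for a single participant as follows. The trial keys (`pair_id`, `condition`, `rt`, `correct`) are hypothetical, and two details are assumptions on our part: the mean and standard deviation are computed once over all retained trials, and the sample (n - 1) standard deviation is used.

```python
from collections import defaultdict
from statistics import mean, stdev


def clean_condition_means(trials, sd_cutoff=2.0):
    """Mean response time per condition for one participant.

    `trials` is a list of dicts with keys pair_id, condition, rt (ms),
    and correct (bool). At least two retained trials are required.
    """
    # Step 1: keep a trial only if every member of its yoked pair was correct.
    pair_ok = defaultdict(lambda: True)
    for t in trials:
        pair_ok[t["pair_id"]] = pair_ok[t["pair_id"]] and t["correct"]
    kept = [t for t in trials if pair_ok[t["pair_id"]]]

    # Step 2: drop response times beyond the cutoff around the participant's
    # mean (assumption: one mean and SD over all retained trials).
    rts = [t["rt"] for t in kept]
    centre, spread = mean(rts), stdev(rts)
    trimmed = [t for t in kept if abs(t["rt"] - centre) <= sd_cutoff * spread]

    # Step 3: average within each condition.
    by_condition = defaultdict(list)
    for t in trimmed:
        by_condition[t["condition"]].append(t["rt"])
    return {cond: mean(values) for cond, values in by_condition.items()}


def percent_correct(trials):
    """Percentage of correctly pronounced items per condition."""
    by_condition = defaultdict(list)
    for t in trials:
        by_condition[t["condition"]].append(t["correct"])
    return {c: 100.0 * sum(v) / len(v) for c, v in by_condition.items()}
```

Accuracy is computed on all target trials, whereas the latency means use only trials that survive both exclusion steps.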
Results
Pupil dilation
Pupillary data were submitted to a 2 (familiar vs. unfamiliar) × 2 (three letters vs. five letters) repeated measures analysis of variance (ANOVA) using a time window from stimulus onset to 4,300 ms (1,000 ms after stimulus offset). Figure 2 displays the average proportional changes of pupillary responses in the four conditions in Experiment 1.
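In a fully within-subjects 2 × 2 design, each effect has one numerator degree of freedom, so each F ratio equals the squared one-sample t statistic of the corresponding contrast computed per participant. The sketch below uses this equivalence; it is an illustration with hypothetical names, not the analysis software used in the study, and it returns F and degrees of freedom only (no p values).

```python
from math import sqrt
from statistics import mean, stdev

# Contrast weights over the cell order (word_3, word_5, pseudo_3, pseudo_5).
CONTRASTS = {
    "familiarity": (-1, -1, 1, 1),
    "length": (-1, 1, -1, 1),
    "interaction": (1, -1, -1, 1),
}


def rm_anova_2x2(cells):
    """2 x 2 repeated measures ANOVA from per-participant cell means.

    `cells` holds one tuple per participant in the order
    (word_3, word_5, pseudo_3, pseudo_5).
    Returns {effect: (F, df_effect, df_error)}.
    Contrast scores must vary across participants (nonzero SD).
    """
    n = len(cells)
    results = {}
    for effect, weights in CONTRASTS.items():
        scores = [sum(w * x for w, x in zip(weights, row)) for row in cells]
        t = mean(scores) / (stdev(scores) / sqrt(n))
        results[effect] = (t * t, 1, n - 1)
    return results


# Toy data for four hypothetical participants (relative pupil size, percent).
toy = [
    (100, 101, 103, 108),
    (99, 100, 104, 107),
    (101, 101, 102, 109),
    (100, 102, 105, 110),
]
print(rm_anova_2x2(toy))
```

A positive mean on the interaction contrast indicates a larger length effect for pseudowords than for real words, which is the predicted pattern.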

Fig. 2. Relative changes in pupil size for the four conditions in Experiment 1, from stimulus onset (Time 0, the dashed vertical line) to 1,000 ms after stimulus offset. The shaded areas depict standard errors of the mean.
Mean relative changes in pupil size
The mean relative changes in pupil size over this time window revealed a significant main effect for word familiarity,
Peak dilation
A two-way repeated measures ANOVA of peak pupil dilation also revealed significant main effects for both word familiarity,
Latency to peak dilation
The third pupillary measure, latency to peak dilation, also revealed a significant main effect for word familiarity,
Behavioral data
Both response times and accuracy for yoked pairs pronounced correctly were analyzed with a two-way repeated measures ANOVA with familiarity (familiar vs. unfamiliar) and length (three letters vs. five letters) as within-subjects factors.
Pronunciation onset latencies
For pronunciation onset latencies, the main effect for word familiarity was significant,
Mean Pronunciation Onset Latencies and Accuracy in Oral Word Reading Among University Students (
Note: Standard deviations are given in parentheses.
Pronunciation accuracy
For pronunciation accuracy, a main effect for familiarity was observed,
Summary
In summary, our results clearly showed that pupillary responses are indeed sensitive to the familiarity and length of individual letter strings. As anticipated, unfamiliar letter strings appear to require greater cognitive effort, as indicated by multiple measures of pupil dilation. Both overall pupil-size changes and peak-dilation analyses (but not peak latency) confirmed greater length effects for pseudowords compared with real words, indicating that reading longer letter strings, especially longer pseudowords, demands additional mental effort. Furthermore, the pupillary data were largely in accordance with the behavioral predictions regarding lower accuracy and slower pronunciation latencies for pseudowords, longer strings, and their interaction. Interestingly, the attenuated length effect for real words was statistically significant on both mean dilation and peak dilation but not significant on pronunciation accuracy and speed, hinting that pupillometric measures may be more sensitive to word-level variables than traditional speed and accuracy measures.
To ensure that our findings were not simply task specific, resulting perhaps from pupillary responses for speech output (i.e., articulatory demands), we conducted a follow-up study (Experiment 2) in which we traced pupillary responses during silent word reading. This study used a novel variant of the delayed naming task that we call the “silent-then-oral-reading” procedure.
Experiment 2: Silent-Then-Oral Reading (University Students)
Method
Participants
Twenty-three students from the University of Haifa participated in this experiment. On the basis of an a priori power analysis (G*Power) using the reported effect size from Experiment 1, we estimated that a sample size of 18 participants would be necessary to achieve an effect size (
Design and procedure
The design, stimuli, and apparatus were the same as in Experiment 1. The procedure was similar to that in Experiment 1, with some exceptions. Each trial commenced, as in Experiment 1, with a central cross presented for 500 ms, followed by a gray fixation screen with a string of Xs for 1,000 ms, followed immediately by the stimulus item. Here, instead of reading aloud, participants were asked to read the displayed stimulus silently and press a response button after completing a single reading. The stimulus disappeared when the key was pressed or 4,000 ms after stimulus onset in the case of a missing response. Next, a blank screen was presented for 1,500 ms (following Hershman & Henik, 2019). This was followed by the simultaneous presentation of a 300-ms auditory tone (beep) and the reappearance of the letter string. At this point, participants were asked to read the stimulus aloud to allow the tester to document reading accuracy, on the assumption that the accuracy of oral reading at the second appearance of the stimulus would, in the vast majority of cases, reflect the accuracy of the immediately preceding silent reading. The trial ended with a blank screen displayed for 1,500 ms. Figure 3 illustrates the procedure.

Fig. 3. Example trial sequence in Experiment 2. After viewing a string of Xs, participants saw a stimulus consisting of a real word or a pseudoword, and they had to press a response key to indicate that they had silently read the stimulus a single time. Following a 1,500-ms interval, the stimulus was presented again, together with an auditory tone (beep). Participants then read the stimulus aloud. RT = response time.
Statistical analysis
Pupil-data analyses included only yoked pairs of target stimuli that were pronounced correctly. We omitted responses to filler words as well as incorrect responses. Because we were interested in silent word reading, we excluded items that were pronounced aloud inaccurately as well as items with missing response times for key presses indicating completion of the silent reading. Because response times varied from trial to trial for each participant, we created a mean score for each condition based on each individual per-trial time window from stimulus onset to tone. Response time and accuracy analyses were the same as in Experiment 1.
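Because the analysis window differs from trial to trial, the per-condition scores can be thought of as follows. This is one reading of the procedure, sketched with hypothetical names: each trial's window runs from stimulus onset to the tone (key-press response time plus the 1,500-ms blank interval), the mean relative pupil size is taken within that window, and trial means are then averaged within condition.

```python
from collections import defaultdict

BLANK_MS = 1500  # interval between the key press and the auditory tone


def window_mean(rel_pupil, rt_ms, sampling_rate_hz=1000):
    """Mean relative pupil size from stimulus onset to tone for one trial.

    `rel_pupil` holds baseline-corrected samples starting at stimulus onset.
    """
    n_samples = int((rt_ms + BLANK_MS) * sampling_rate_hz / 1000)
    window = rel_pupil[:n_samples]
    if not window:
        raise ValueError("empty analysis window")
    return sum(window) / len(window)


def condition_scores(trials):
    """Average the per-trial window means within each condition.

    `trials` is an iterable of (condition, rel_pupil, rt_ms) tuples.
    """
    per_condition = defaultdict(list)
    for condition, rel_pupil, rt_ms in trials:
        per_condition[condition].append(window_mean(rel_pupil, rt_ms))
    return {c: sum(v) / len(v) for c, v in per_condition.items()}
```

A trial with a slower key press thus contributes a longer window, but each trial still carries equal weight in its condition mean.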
Results
Pupil dilation
A fully within-subjects 2 × 2 ANOVA with two levels of familiarity (familiar vs. unfamiliar) and two lengths (three letters vs. five letters) was conducted using a time window from stimulus onset to auditory tone (1,500 ms after silent reading). Figure 4 presents the average pupillary responses in the four conditions in Experiment 2.

Fig. 4. Relative changes in pupil size for the four conditions in Experiment 2 from stimulus onset (Time 0; the black, dashed vertical line) to trial offset. Each colored, dashed vertical line marks 1,500 ms after the mean response time for the given condition. For each condition, the time course from stimulus onset to the colored vertical line represents the silent-reading mode. The time course from the colored vertical line to trial offset represents the oral-reading mode. The shaded areas depict standard errors of the mean.
Mean relative changes in pupil size
For relative changes in pupil size, we obtained a main effect of word familiarity,
Peak dilation
For peak dilation, the main effect of familiarity was again significant,
Latency to peak dilation
For latency to peak dilation, all effects were, once again, significant: a main effect for word familiarity,
Behavioral data
Both response times and accuracy for yoked pairs pronounced correctly were again analyzed using a two-way repeated measures ANOVA with familiarity (familiar vs. unfamiliar) and length (three letters vs. five letters) as within-subjects factors.
Response times
For response times, we observed main effects for word familiarity,
Mean Response Times and (Estimated) Accuracy in Silent Word Reading Among University Students (
Note: Standard deviations are given in parentheses.
Pronunciation accuracy
For pronunciation accuracy, the main effect for familiarity was significant,
Summary
In summary, Experiment 2 confirmed that silent word reading of unfamiliar letter strings (pseudowords) indeed demands more effort than silent reading of familiar (real) words, as indicated by each of the pupillometric measures (mean, peak dilation, and peak latency). Furthermore, the multiple pupillometric analyses clearly pointed to length effects only for pseudowords, emphasizing the greater mental effort invested in reading longer pseudowords silently. Thus, Experiments 1 and 2 together supply clear evidence that pupillary responses are sensitive to word familiarity and its interaction with length in both oral and silent reading among skilled adult readers.
Experiments 3 and 4 replicated these findings with elementary school children.
Experiment 3: Oral Reading (Fourth to Sixth Graders)
Method
Participants
A pool of 38 children in the upper elementary grades (fourth to sixth grades) was recruited for this experiment. Four children reported attentional difficulties and were excluded. The other 34 participants, all native Hebrew speakers, reported no past or present reading difficulties or attentional deficits and had normal or corrected-to-normal vision. The data from four participants were excluded because they did not reach a minimum of 20 valid trials in each of the four conditions (i.e., 50% correct responses with no more than 20% of missing pupil values). The final sample numbered 30 participants (17 female; age:
Design and procedure
As in Experiment 1, this experiment had a fully within-subjects 2 × 2 design with two levels of familiarity (familiar vs. unfamiliar) and two lengths (three letters vs. five letters). However, to accommodate the younger age group, the current experiment included slightly fewer trials: 200 (instead of 240), likewise divided into four blocks. Each block contained 50 items: 10 fillers (instead of 20 for the adults) and the same 20 pseudowords (10 of each length) and 20 real words (10 of each length).
The procedure and apparatus were the same as in Experiment 1, with the exception that the duration of stimulus presentation was longer (4,700 ms) to accommodate this younger sample (Fig. 5). As in Experiment 1, target stimuli were yoked pairs (a real word and its matched pseudoword). Filler words were randomly selected from the fillers in Experiment 1. Data were analyzed as in Experiment 1.

Fig. 5. Example trial sequence in Experiment 3. After viewing a string of Xs, participants saw a stimulus consisting of a real word or a pseudoword, which they were asked to read aloud. The experiment differed from Experiment 1 in that the time allocated for participants’ response was longer.
Results
Pupil dilation
Pupillary data were analyzed over a time window from stimulus onset to 5,700 ms (1,000 ms after stimulus offset). Changes in pupil dilation for the four conditions are depicted in Figure 6.

Fig. 6. Relative changes in pupil size for the four conditions in Experiment 3 from stimulus onset (Time 0, the dashed vertical line) to 1,000 ms after stimulus offset. The shaded areas depict standard errors of the mean.
Mean relative changes in pupil size
Analyses of relative changes in pupil size replicated the pattern of adult data in Experiment 1. We found a significant main effect for word familiarity,
Peak dilation
For peak dilation, too, ANOVAs indicated significant main effects for both word familiarity,
Latency to peak dilation
In ANOVAs examining latency to peak dilation, the main effect for word familiarity was again significant,
Behavioral data
Pronunciation onset latencies
Analyses of pronunciation onset latencies revealed a main effect for word familiarity,
Mean Pronunciation Onset Latencies and Accuracy in Oral Word Reading Among Fourth to Sixth Graders (
Note: Standard deviations are given in parentheses.
Pronunciation accuracy
Accuracy analyses also produced a main effect for familiarity,
Summary
In summary, the results of this experiment essentially replicated Experiment 1 and extended the results obtained with skilled adult readers to developing readers. More cognitive resources were required to read unfamiliar letter strings compared with familiar (real word) strings, as reflected in multiple pupillary measures (mean relative changes, peak dilation, and peak latency), consistent with lower accuracy and slower pronunciation times. We also observed greater length costs for pseudowords compared with real words not only on behavioral measures but also on pupillometric measures of mean overall changes and peak dilation.
To our knowledge, this is the first study to successfully apply pupillometric methods to single-word reading in children. Here, too, we conducted a follow-up silent-then-oral-reading study among another sample of fourth- to sixth-grade children (Experiment 4) to confirm that our findings are generalizable to word reading as such, as opposed to being a task-specific effect possibly reflecting the articulatory-motor demands of vocalization.
Experiment 4: Silent-Then-Oral Reading (Fourth to Sixth Graders)
Method
Based on the reported effect size from Experiment 3, an a priori power analysis (G*Power) estimated that a sample size of 18 participants would be necessary to achieve an effect size (
The design, stimuli, and apparatus were the same as in Experiment 3. The procedure and statistical analyses were the same as in Experiment 2.
Results
Pupil dilation
To examine pupil dilation, we conducted a within-subjects 2 × 2 ANOVA with two levels of familiarity (familiar vs. unfamiliar) and two lengths (three letters vs. five letters), using a time window from stimulus onset to the auditory tone (1,500 ms after the participant’s key press indicating completion of silent reading). Figure 7 presents the average pupillary responses in the four conditions in Experiment 4.

Fig. 7. Relative changes in pupil size for the four conditions in Experiment 4, from stimulus onset (Time 0; the black, dashed vertical line) to trial offset. Each colored, dashed vertical line marks 1,500 ms after the mean response time (completion of silent reading) for the given condition. For each condition, the time course from stimulus onset to the colored vertical line represents the silent-reading mode. The time course from the colored vertical line to the trial offset represents the oral-reading mode. The shaded areas depict standard errors of the mean.
Mean relative changes in pupil size
Analyses of relative changes in pupil size replicated the silent-reading results for adults in Experiment 2. First, we found a significant main effect for word familiarity,
Peak dilation
For the measure of peak dilation, we also obtained significant main effects for both word familiarity,
Latency to peak dilation
Results for latency to peak dilation were consistent with the outcomes for mean dilation and peak dilation, showing a significant main effect for word familiarity,
Behavioral data
Response times
The ANOVA on response times yielded a main effect for word familiarity,
Mean Response Times and Accuracy in Silent Word Reading Among Fourth to Sixth Graders (
Note: Standard deviations are given in parentheses.
Pronunciation accuracy
For pronunciation accuracy, there was again a main effect for familiarity,
Summary
In summary, replicating the silent-reading results of Experiment 2 with university students, Experiment 4 confirmed that readers in the fourth to sixth grades also invested more cognitive effort in silently reading pseudowords compared with real words, as indicated by multiple pupillary measures (mean relative changes, peak dilation, and peak latency). Furthermore, we consistently obtained length effects for pseudowords but not for real words on multiple pupillometric measures. Together, the results of Experiments 3 and 4 support the contention that changes in pupil size are a reliable and sensitive index of the cognitive effort invested in both oral and silent reading among developing readers.
Discussion
For more than a century, the notion of mental effort, and effortlessness in particular, has been a common denominator in the psychological literature on skill learning in general and visual word recognition in particular. Like the broader, multifaceted constructs of automaticity and fluency of which it is a defining property (Kuhn et al., 2010; Logan, 1997), word-reading effortlessness or near effortlessness has long been regarded as a distinctive feature of skilled reading. The obverse case of unskilled or impaired reading is typically defined as inaccurate or slow and effortful (American Psychiatric Association, 2013). Yet despite the continued popularity and intuitive appeal of these bread-and-butter concepts, their definition and operationalization have proven surprisingly elusive. In the present investigation, we set out to redress this gap in our knowledge by exploring the applicability of pupillometry as a direct measure of the cognitive effort involved in word reading.
Our findings provided clear evidence that pupillary responses are sensitive to the cognitive effort involved in single-word reading not only among skilled readers (Fernández et al., 2016; Kuchinke et al., 2007; Mathôt et al., 2017) but also among school-age readers in both oral- and silent-reading modes. The data from four experiments were near unanimous in showing that readers, both young and old, are not only slower and less accurate but also allocate more cognitive resources when reading unfamiliar letter strings (i.e., pseudowords) compared with familiar (real) words. Furthermore, our study also examined the length effect—widely regarded as reflecting reliance on the serial, letter-by-letter processing typical of unfamiliar letter strings. We predicted and repeatedly confirmed a significant familiarity-by-length interaction; length effects on behavioral and pupillometric measures were consistently stronger for pseudowords than for real words. These findings corroborate the widespread assumption that reading via a sequential process of letter-to-sound translation and synthesis indeed demands more cognitive resources than reading via direct memory-retrieval mechanisms (e.g., Ehri, 2014; LaBerge & Samuels, 1974; Logan, 1988, 1997; Share, 1995, 2008). This observation, moreover, merges the study of reading with the study of human skill learning in general (e.g., Anderson, 1981; Logan, 1988). Common to almost all skill learning is a transition from slow, effortful, step-by-step, unskilled performance to rapid, near-effortless, one-step, or “unitized” skilled performance.
If replicated, our findings have the potential to open up new avenues of research capable of providing a deeper understanding of the ubiquitous but troublesome concepts of fluency and automaticity. Pupillometry may offer reading researchers a more sensitive moment-by-moment glimpse into the dynamics of word recognition (including developmental, interindividual, and intraindividual variation) that goes beyond the standard measures of skill growth such as reading accuracy and rate or some combination of these two (such as words correctly read per minute). And because learning to read is a paradigmatic case of skill learning, pupillometry has potentially far-reaching applications to a wide variety of domains of skill learning.
We acknowledge that our study is only a first sortie into uncharted waters. This essentially pretheoretical investigation, nonetheless, raises a host of questions for future work. What is the nature of the association between pupil dilation and standard measures of reading proficiency, and how does this vary across and within levels of reading ability? When does a novel printed word become a familiar unitized orthographic pattern in the course of repeated exposures, and how does this relate to the shape of the effortful-to-(near)-effortless trajectory? Is the learning function monotonic, is it discontinuous with a critical threshold, or does it follow the well-known reaction time power law (Logan, 1988)? Does the disabled reader’s word reading remain forever effortful? What exactly is “effort” in the brain? These are just some of the many questions that lie ahead.
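To make the contrast between candidate learning functions concrete, the sketch below sets a power-law speedup of the general form a + b·N^(−c) beside a discontinuous, threshold-style function. It is purely illustrative: the function names, the parameter values, and the choice of a single abrupt drop at a fixed exposure count are hypothetical and are not estimated from the present data.

```python
def power_law_effort(exposure, asymptote=0.1, gain=0.9, rate=0.5):
    """Effort after `exposure` encounters under a power law: a + b * N**(-c).

    All parameter values are hypothetical placeholders.
    """
    if exposure < 1:
        raise ValueError("exposure counts start at 1")
    return asymptote + gain * exposure ** (-rate)


def threshold_effort(exposure, asymptote=0.1, gain=0.9, threshold=4):
    """Effort stays high until a critical exposure count, then drops abruptly.

    The threshold of 4 exposures is an arbitrary illustrative value.
    """
    if exposure < 1:
        raise ValueError("exposure counts start at 1")
    return asymptote + (gain if exposure < threshold else 0.0)


# Compare the two trajectories over the first eight exposures to a novel word.
for n in range(1, 9):
    print(n, round(power_law_effort(n), 3), round(threshold_effort(n), 3))
```

Both functions decline monotonically toward the same asymptote, so they would have to be distinguished by the shape of the decline across exposures (gradual and decelerating versus a single step) rather than by its end point.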
Acknowledgements
We thank Sam Hutton, Stav Magalnik, and Amir Yair for their assistance in designing the experiments. We thank Ronen Hershman, Stuart Steinhauer, Noga Cohen, and Amit Yashar for their valuable comments. We also thank Tami Katzir for her support in this work. Finally, we are grateful to the children, the parents, and the students who participated in this study.
Transparency
D. L. Share conceived the idea for this study. A. Shechter developed the experimental design and materials, implemented the experiments, analyzed the results, and wrote the first draft of the manuscript. Both authors revised and approved the final manuscript for submission.
