Abstract
Current neurocognitive research suggests that the efficiency of visual word recognition rests on abstract memory representations of written letters and words stored in the visual word form area (VWFA) in the left ventral occipitotemporal cortex. These representations are assumed to be invariant to visual characteristics such as font and case. In the present functional MRI study, we tested this assumption by presenting written words and varying the case format of the initial letter of German nouns (which are always capitalized) as well as German adjectives and adverbs (both usually in lowercase). As evident from a Word Type × Case Format interaction, activation in the VWFA was greater for words presented in unfamiliar case formats than for words presented in familiar case formats. Our results suggest that neural representations of written words in the VWFA are not fully abstract and still contain information about the visual format in which words are most frequently perceived.
Keywords
Competent readers are affected only to a minor extent by the specific appearance of written words (e.g.,
The assumption that letters and words are represented abstractly has also been adopted by recent neuroscientific research. Electrophysiological studies, for example, have suggested the existence of an abstract letter-identification processing stage distinct from a preceding letter-form-identification stage (e.g., Carreiras, Perea, Gil-López, Mallouh, & Salillas, 2013). Neuroimaging studies suggest that visual word recognition relies on a hierarchy of increasingly larger and more abstract neural representations along the left ventral visual pathway (e.g., Dehaene, Cohen, Sigman, & Vinckier, 2005). Central to this account is the so-called
Evidence for the assumption that representations in the VWFA are abstract was initially provided by a study showing that activation in this region was invariant to the retinal location of presented words (Cohen et al., 2000). Of main importance, however, were functional MRI (fMRI) priming studies, which found repetition-suppression effects for words in the VWFA to be independent of case (Dehaene et al., 2004; Dehaene et al., 2001; see also Devlin, Jamison, Gonnerman, & Matthews, 2006). Subliminal primes presented in a case different from that of the target word (e.g.,
However, several previous findings are difficult to reconcile with the assumption that letter and word representations in the VWFA are abstractions that do not include specific visual attributes. Burgund, Guo, and Aurbach (2009), for example, failed to find case-independent repetition suppression for letters in the VWFA (see also Gauthier et al., 2000, for similar findings using letters in different fonts). Doubts about abstract representations were also raised by studies that compared words presented in an unfamiliar mixed-case format (e.g.,
The aim of the present fMRI study was to provide a more stringent test of whether representations in the VWFA are fully abstract or still contain information about the visual format in which words are most frequently perceived. To this end, we based our study on behavioral research that has shown that even minor deviations from the familiar visual format—such as presenting the initial letter of a word in an unfamiliar case—affect word-recognition speed (Jacobs, Nuerk, Graf, Braun, & Nazir, 2008; Peressotti, Cubelli, & Job, 2003). Following Jacobs et al. (2008), we presented German words with the initial letter in either uppercase or lowercase. The presented words were either nouns or nonnouns (i.e., adjectives and adverbs). Critically, while German nouns are always seen with initial capitalization (e.g.,
We expected that if representations in the VWFA are fully abstract (Dehaene et al., 2004; Dehaene et al., 2001), the present case-format manipulation should have no significant effect on activation in this region because recognition of both familiar and unfamiliar case formats should be supported by the same abstract representations. If, however, representations in the VWFA do contain information about the visual format in which words are most frequently perceived, the present case-deviant forms should violate these representations. This should result in an interactive effect of word type (nouns vs. nonnouns) and case format (uppercase vs. lowercase) on VWFA activation, with increased activation for words presented in unfamiliar case formats.
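The predicted interaction can be written as a contrast over the four word cells: unfamiliar formats (lowercase nouns, uppercase nonnouns) minus familiar formats (uppercase nouns, lowercase nonnouns). The following is a minimal sketch of that contrast, assuming one activation estimate per cell and participant. The function names and the numbers are illustrative and are not data from the study. In a 2 × 2 within-subject design, the squared one-sample t statistic on these scores equals the one-degree-of-freedom interaction F.

```python
from math import sqrt
from statistics import mean, stdev

# Cell order: (noun, upper), (noun, lower), (nonnoun, upper), (nonnoun, lower).
# Familiar formats are uppercase nouns and lowercase nonnouns.
INTERACTION_WEIGHTS = (-1, 1, 1, -1)  # unfamiliar minus familiar


def interaction_scores(cells_per_subject):
    """Weighted sum of the four cell estimates for each subject."""
    return [
        sum(w * c for w, c in zip(INTERACTION_WEIGHTS, cells))
        for cells in cells_per_subject
    ]


def one_sample_t(scores):
    """t statistic testing whether the mean interaction score differs from 0."""
    n = len(scores)
    return mean(scores) / (stdev(scores) / sqrt(n))


# Synthetic activation estimates (arbitrary units), NOT data from the study.
synthetic = [
    (1.0, 1.4, 1.3, 0.9),
    (0.8, 1.1, 1.2, 0.9),
    (1.2, 1.3, 1.5, 1.1),
    (0.9, 1.3, 1.0, 0.8),
]
scores = interaction_scores(synthetic)
t_value = one_sample_t(scores)
f_value = t_value ** 2  # 1-df interaction F in a 2 x 2 within-subject design
```

Under fully abstract representations the four cells would not differ and every score would be near zero. A pure main effect of case or of word type also yields a score of zero, so the contrast isolates case-format familiarity.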
Method
Participants
Twenty-six German-speaking university students (13 female, 13 male) between the ages of 20 and 41 years (
Task and stimuli
For the in-scanner task, participants were instructed to indicate with a two-choice key press whether the presented stimulus was or was not an existing German word (i.e., a lexical decision task). All participants saw the same 384 items (half words, half pseudowords), but each participant saw the items in one of two pseudorandomized lists. Half of the word items were nouns, and the other half were adjectives and adverbs (i.e., nonnouns). Half of the items in each category (i.e., nouns, nonnouns, and pseudowords) were presented with the initial letter in uppercase, and the remainder were presented with the initial letter in lowercase. The case format of the first letter of each item varied between the two pseudorandomized lists. Counterbalancing the lists ensured that both forms were presented equally often. As shown in Table 1, pseudowords were roughly matched to words with respect to number of letters, bigram frequency (based on the CELEX database; Baayen, Piepenbrock, & van Rijn, 1993), and number of orthographic neighbors (i.e., Coltheart’s
Table 1. Comparison of the Mean Characteristics of the Word and Pseudoword Items
Note: Standard deviations are given in parentheses.
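The counterbalancing of initial-letter case across the two lists can be sketched as follows. This is a minimal illustration under stated assumptions: the item labels are placeholders, a plain seeded shuffle stands in for the pseudorandomization (whose constraints are not specified in the text), and the function names are hypothetical.

```python
import random

# Item counts from the text: 384 items, half words (half nouns, half nonnouns).
CATEGORY_SIZES = {"noun": 96, "nonnoun": 96, "pseudoword": 192}


def build_lists(seed=0):
    """Return two presentation lists with initial-letter case flipped between them."""
    rng = random.Random(seed)
    list_a, list_b = [], []
    for category, size in CATEGORY_SIZES.items():
        # Placeholder item labels; the real stimuli are not reproduced here.
        items = [f"{category}_{i:03d}" for i in range(size)]
        rng.shuffle(items)
        half = size // 2
        for index, item in enumerate(items):
            case_a = "upper" if index < half else "lower"
            case_b = "lower" if case_a == "upper" else "upper"
            list_a.append((item, category, case_a))
            list_b.append((item, category, case_b))
    # Assumption: a plain shuffle stands in for the unspecified pseudorandomization.
    rng.shuffle(list_a)
    rng.shuffle(list_b)
    return list_a, list_b


def count_cases(trials):
    """Count trials per (category, case) cell."""
    counts = {}
    for _, category, case in trials:
        counts[(category, case)] = counts.get((category, case), 0) + 1
    return counts


list_a, list_b = build_lists()
```

Each list then contains 48 uppercase and 48 lowercase items per word category and 96 of each for pseudowords, and every item appears in the opposite case in the other list.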
The 384 items were presented in two experimental runs of 192 items each, with an equal number of words and pseudowords. Each item was displayed for 800 ms, with an interstimulus interval (ISI) of 2,100 ms, during which a fixation cross was shown. This stimulus onset asynchrony of 2,900 ms was not an integer multiple of the repetition time of 2,000 ms (see fMRI Data Acquisition and Analysis), which enhanced the efficiency of the design by sampling the hemodynamic response at different time points. In addition to the items, 40 null events of 2,900-ms duration, during which only a fixation cross was presented, were included in each run. The null events were included to improve evaluation of stimulus-related activation relative to baseline.
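The sampling argument can be made concrete with a short calculation. The sketch below assumes that all events sit on a continuous 2,900-ms grid and that the first event coincides with a volume onset; neither detail is stated in the text, and the function name is illustrative.

```python
from math import gcd

SOA_MS = 2900  # stimulus onset asynchrony from the text
TR_MS = 2000   # repetition time from the text


def onset_phases(n_trials, soa_ms=SOA_MS, tr_ms=TR_MS):
    """Offset of each trial onset from the start of the volume it falls in (ms)."""
    return [(trial * soa_ms) % tr_ms for trial in range(n_trials)]


phases = onset_phases(20)
# Number of distinct onset-to-volume offsets the design cycles through.
distinct_offsets = TR_MS // gcd(SOA_MS, TR_MS)
```

Under these assumptions the onsets cycle through 20 different offsets relative to volume acquisition (0, 900, 1,800, 700 ms, and so on), so the hemodynamic response is sampled every 100 ms across trials. With a stimulus onset asynchrony that is an integer multiple of the repetition time, such as 4,000 ms, every onset would share the same offset.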
Participants were familiarized with the lexical decision task outside the scanner. During scanning, visual stimuli were projected on a semitransparent screen by a video projector outside the scanner room. Participants used a magnetic-resonance-compatible response box, responding with the index finger (“yes”) and the middle finger (“no”) of their right hands. Stimulus delivery and response registration were controlled by Presentation software (Neurobehavioral Systems, Albany, CA).
fMRI data acquisition and analysis
During each of the two functional-imaging runs, 340 images sensitive to blood-oxygen-level-dependent (BOLD) contrast were acquired with a T2*-weighted echo-planar imaging sequence (flip angle = 70°, repetition time = 2,000 ms, echo time = 30 ms, field of view = 210 mm, 64 × 64 matrix). Thirty-six descending axial slices (thickness = 3.0 mm, interslice gap = 0.3 mm) were acquired. Additionally, a high-resolution (1- × 1- × 1.2-mm) structural scan was acquired using a T1-weighted magnetization-prepared rapid-acquisition gradient-echo sequence. Participants 1 to 16 were scanned with an Achieva 3 Tesla scanner (Philips Medical Systems, Best, The Netherlands) using an eight-channel head coil. The remaining participants were scanned with a Magnetom Trio 3 Tesla scanner (Siemens, Erlangen, Germany) using a 12-channel head coil.
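As a consistency check on the reported parameters, the duration of the event sequence in one run can be compared with the acquisition time, and the slice geometry can be derived from the sequence settings. This is only arithmetic on the numbers given in the text; the function names are illustrative.

```python
ITEMS_PER_RUN = 192        # from the task description
NULL_EVENTS_PER_RUN = 40   # from the task description
EVENT_MS = 2900            # duration of each item or null event
VOLUMES_PER_RUN = 340      # images acquired per run
TR_MS = 2000               # repetition time


def run_timing_ms():
    """Return (event sequence duration, acquisition duration) in milliseconds."""
    events_ms = (ITEMS_PER_RUN + NULL_EVENTS_PER_RUN) * EVENT_MS
    acquisition_ms = VOLUMES_PER_RUN * TR_MS
    return events_ms, acquisition_ms


def slab_geometry(fov_mm=210.0, matrix=64, n_slices=36, thickness_mm=3.0, gap_mm=0.3):
    """Return (in-plane voxel size, slice-direction coverage) in millimetres."""
    in_plane_mm = fov_mm / matrix
    coverage_mm = n_slices * thickness_mm + (n_slices - 1) * gap_mm
    return in_plane_mm, coverage_mm


events_ms, acquisition_ms = run_timing_ms()
in_plane_mm, coverage_mm = slab_geometry()
```

The 232 events occupy 672.8 s, which fits within the 680 s covered by 340 volumes. The in-plane voxel size is about 3.28 mm, and the 36 slices span 118.5 mm including gaps.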
For preprocessing and statistical analysis, we used Statistical Parametric Mapping software (SPM8; Wellcome Trust Centre for Neuroimaging, London, United Kingdom; www.fil.ion.ucl.ac.uk/spm/) running in a MATLAB environment (Version 7.6; The MathWorks, Natick, MA). Preprocessing of the functional images included realignment and unwarping (to correct for head motion during the scan) and slice-time correction. Images were normalized into a common space with the help of the high-resolution structural image. Using the VBM8 toolbox (http://dbm.neuro.uni-jena.de/vbm8), we (a) segmented the structural image into gray matter, white matter, and cerebrospinal fluid; (b) denoised the image; and (c) warped the image into the Montreal Neurological Institute (MNI) standard space using the high-dimensional DARTEL registration algorithm (Ashburner, 2007). Additionally, a skull-stripped version of the structural image was created in native space. The functional images were (a) coregistered to the skull-stripped structural image and (b) normalized to the MNI standard space using the parameters from the DARTEL registration of the structural image. Finally, the functional images were resampled to 2- × 2- × 2-mm voxels and smoothed with a 6-mm full-width half-maximum Gaussian kernel.
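The smoothing kernel is specified by its full width at half maximum (FWHM), which relates to the standard deviation of a Gaussian by sigma = FWHM / (2 × sqrt(2 × ln 2)). The sketch below applies this relation to the 6-mm kernel and 2-mm voxels reported here. It is an illustration of the conversion, not SPM8's implementation, and the function names and the three-sigma truncation are assumptions.

```python
from math import exp, log, sqrt


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full width at half maximum to a standard deviation."""
    return fwhm / (2.0 * sqrt(2.0 * log(2.0)))


def gaussian_kernel_1d(fwhm_mm, voxel_mm, radius_sigmas=3.0):
    """Normalized 1-D Gaussian weights on the voxel grid (truncation is an assumption)."""
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    radius = int(radius_sigmas * sigma_vox + 0.5)
    weights = [exp(-0.5 * (k / sigma_vox) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


sigma_mm = fwhm_to_sigma(6.0)          # roughly 2.55 mm
kernel = gaussian_kernel_1d(6.0, 2.0)  # weights along one axis, in voxel steps
```

A 6-mm FWHM therefore corresponds to a standard deviation of about 2.55 mm, or about 1.27 voxels at the 2-mm resolution. An isotropic three-dimensional Gaussian is separable, so the same one-dimensional weights can be applied along each axis in turn.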
Statistical analysis of the fMRI data was performed within a two-stage mixed-effects model. In the first level (i.e., subject-specific level), we built a general linear model (Henson, 2004) including one regressor per item type (i.e., uppercase nouns, lowercase nouns, uppercase nonnouns, lowercase nonnouns, uppercase pseudowords, lowercase pseudowords). The regressors consisted of the trial onsets of the corresponding item type modeled by a stick function convolved with a synthetic hemodynamic response function. Additionally, six covariates corresponding to the movement parameters (rotations and translations) were included. The functional imaging data in these first-level models were high-pass filtered with a cutoff of 128 s and corrected for autocorrelation by an AR(1) model (Friston et al., 2002). For each participant, we computed contrast images reflecting signal change for each item type relative to fixation baseline (i.e., ISIs and null trials). These images were then used for the second-level (i.e., group-level) random-effects analysis. For statistical comparisons on the group level, we used a voxelwise threshold of
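The construction of a first-level regressor described in this section (trial onsets as a stick function, convolved with a synthetic hemodynamic response function, then sampled once per volume) can be sketched as follows. This is a simplified illustration, not SPM8 code. The double-gamma parameters, the 0.1-s time grid, the 32-s kernel length, and the example onsets are assumptions, and all function names are hypothetical.

```python
from math import exp, gamma

DT = 0.1          # fine time grid in seconds (assumption)
TR = 2.0          # repetition time from the text, in seconds
N_VOLUMES = 340   # images per run from the text


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=1.0 / 6.0):
    """Double-gamma response at time t (s); parameter values are assumptions."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * exp(-t) / gamma(peak_shape)
    under = t ** (under_shape - 1) * exp(-t) / gamma(under_shape)
    return peak - ratio * under


def build_regressor(onsets_s, n_volumes=N_VOLUMES, tr=TR, dt=DT, hrf_length_s=32.0):
    """Convolve a stick function at the onsets with the HRF and sample once per volume."""
    n_fine = int(round(n_volumes * tr / dt))
    sticks = [0.0] * n_fine
    for onset in onsets_s:
        index = int(round(onset / dt))
        if 0 <= index < n_fine:
            sticks[index] += 1.0
    hrf = [double_gamma_hrf(k * dt) for k in range(int(round(hrf_length_s / dt)))]
    signal = [0.0] * n_fine
    for i, stick in enumerate(sticks):
        if stick == 0.0:
            continue
        for k, h in enumerate(hrf):
            if i + k < n_fine:
                signal[i + k] += stick * h
    step = int(round(tr / dt))
    return [signal[v * step] for v in range(n_volumes)]


# Hypothetical onsets (s) for one item type, placed on the 2.9-s event grid.
onsets = [0.0, 2.9 * 7, 2.9 * 15]
regressor = build_regressor(onsets)
```

One such column would be built for each of the six item types and entered into the design matrix together with the six movement covariates. High-pass filtering and the AR(1) autocorrelation correction are not part of this sketch.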
Results
Behavioral results
The present lexical decision task posed little difficulty for participants; there was an average of 95% correct responses across all trials. Participants were more accurate in correctly rejecting pseudowords (
Repeated measures 2 (word type: nouns vs. nonnouns) × 2 (case format: uppercase vs. lowercase) analyses of variance (ANOVAs) for the word items showed an interaction between the two factors for both accuracy,

Behavioral results. Mean accuracy (left) and response time (right) in the lexical decision task are shown as a function of word type and case format. Error bars denote ±1
fMRI results
Of main interest for our hypothesis was the identification of brain regions with a differing response to the case format of the initial letter between nouns and nonnouns. To this end, we performed a 2 (word type) × 2 (case format) ANOVA on brain activation for the word items. To avoid differences arising from deactivations, we masked the analysis with a words > fixation baseline contrast (

Functional MRI results. The brain map shows activation clusters identified by the Word Type (nouns vs. nonnouns) × Case Format (uppercase vs. lowercase) interaction for word items. The activation clusters, in left ventral occipitotemporal cortex (vOT) and left superior parietal lobule (SPL), are superimposed on a lateral view of the left hemisphere. The graphs show mean brain-activation estimates (in arbitrary units) as a function of word type and case format, separately for each cluster (peaks are given in Montreal Neurological Institute coordinates). Estimates were obtained relative to fixation baseline. Error bars denote ±1
Table 2. Functional MRI Results: Brain Regions Showing a Word Type (Nouns vs. Nonnouns) × Case Format (Uppercase vs. Lowercase) Interaction Effect
Note: The two rightmost columns show peak
Additional ANOVA findings were that nonnouns elicited higher activation than nouns in a cluster located in the right angular gyrus (peak:
In a separate analysis, we searched for brain regions exhibiting a case-format effect for the pseudoword items. As in the analysis of the word items, we identified higher activation for uppercase compared with lowercase pseudowords in the right lingual gyrus (peak:
Finally, we compared activation between pseudowords and words (pooled across nouns and nonnouns). As Table 3 shows, higher activation for pseudowords relative to words was found in the left precentral gyrus and in the supplementary motor area. Higher activation for words compared with pseudowords was found in the left angular gyrus and the right supramarginal gyrus. No significant differences between activations for words and pseudowords were found in left vOT or left SPL.
Table 3. Functional MRI Results: Brain Regions Identified by the Main Effect of Item Type (Words vs. Pseudowords)
Note:
Discussion
In the present study, we investigated the predominant assumption in neurocognitive research that visual word recognition rests on abstract neural representations for written letters and words in the VWFA in the left vOT (Dehaene & Cohen, 2011; Dehaene et al., 2005). Several previous findings had raised doubts about the abstractness of orthographic representations and suggested that they might still contain information about the visual format in which words are most often seen (e.g., Jacobs et al., 2008). The present study showed that a minor violation of the typical visual format of German words (i.e., presenting the initial letter in an unfamiliar case format) increased brain activation in a left vOT region corresponding to the classic localization of the VWFA (Cohen et al., 2000; Cohen et al., 2002). This finding stands in contrast to the view that activation in this region is invariant to the specific visual appearance of words (Dehaene & Cohen, 2011).
By manipulating the case format of the initial letter of both German nouns (always seen capitalized) and nonnouns (mostly seen in lowercase), we were able to investigate the effect of case-format familiarity on VWFA activation independent of visual factors (i.e., physical case format). This overcame the drawbacks of previous neuroimaging studies (Kronbichler et al., 2009; Xu et al., 2001) that found increased left vOT activation for unfamiliar mixed-case formats, which are also known to result in low-level visual difficulties (Mayall et al., 1997). In contrast to these findings, the present case-familiarity effect was restricted to the left vOT region corresponding to the VWFA and was not seen in more posterior regions. Therefore, the present effect in the VWFA cannot be interpreted as a downstream effect of high activation in occipital regions. We did, however, identify a right occipital region that exhibited higher activation for uppercase relative to lowercase letters. This finding is in line with previous research showing that physical characteristics of visual words, such as number of letters, affect activation in early visual regions (Mechelli, Humphreys, Mayall, Olson, & Price, 2000; Schurz et al., 2010).
Proponents of abstract representations in the VWFA have argued that increased activation in the region might be the result of top-down processes rather than a reflection of the nature of local representations (Dehaene & Cohen, 2011). For example, it could be argued that the unfamiliarity of the case format might be detected only after the instantiation of abstract representations in the VWFA, at the level of grammatical processing (i.e., on the basis of capitalization rules such as “if noun, then uppercase”), and that this detection leads to a top-down reactivation of the VWFA. However, if this were the case, unfamiliar case formats should also have increased activation in brain regions associated with higher language processes. The finding that no increased activation for the unfamiliar case formats was observed in any temporal or frontal brain regions associated with language processing (Price, 2012) speaks against the concern that the observed increased activation in the VWFA was driven by higher language processes.
Another possible concern with the present findings is that because fMRI integrates the brain signal over a long period of time, increased VWFA activation could also reflect greater processing time (Dehaene & Cohen, 2011). Critically, unfamiliar case formats of words resulted not only in increased VWFA activation but also in longer RTs relative to familiar formats. However, the RT difference between unfamiliar and familiar case formats for words (
In addition to the VWFA, a left SPL cluster (
Because the case format of the initial letter is a characteristic of whole words, the present findings support the view that the VWFA hosts representations for whole words (Glezer, Jiang, & Riesenhuber, 2009; Kronbichler et al., 2004; Ludersdorfer, Schurz, Richlan, Kronbichler, & Wimmer, 2013) and thus might serve as an orthographic lexicon, as posited by dual-route models of reading (e.g., Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001). Support for this view also comes from previous studies that found increased VWFA activation for unfamiliar compared with familiar spellings of the same phonological words (e.g.,
In conclusion, the findings of the present study suggest that neural representations of written words in the VWFA contain information about the visual format in which words are most frequently perceived. Such a grounding of memory representations in visual perception is denied by current neuroscientific models of visual word recognition (Dehaene et al., 2005), which assume that these representations are abstract and thus invariant to visual characteristics, such as font or case. However, the fact that visual word recognition is robust enough to deal with even very unfamiliar formats (e.g.,
Footnotes
Action Editor
Charles Hulme served as action editor for this article.
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
Funding
This research was supported by Austrian Science Foundation Grant No. FWF P-23916-B18 to M. Kronbichler. P. Ludersdorfer was supported by the Doctoral College “Imaging the Mind” of the Austrian Science Foundation (Grant No. FWF-W1233), and F. Richlan was supported by the Austrian Agency for International Cooperation in Education and Research (OeAD PL 11/2015).
