Abstract
When we use language, we combine sounds, signs, and letters into words that then form sentences, which together tell a story. Both language production and language comprehension rely on representations that need to be continuously and rapidly activated, selected, and combined. These representations are specific to language, but many processes that regulate their use, such as inhibition of competitors or updating of working memory, are domain-general abilities that apply across different kinds of tasks. Here, we provide an overview of the behavioral and neurophysiological evidence for domain-general abilities underpinning language skills and describe which aspects of production and comprehension draw on such cognitive resources. We discuss how this line of research reveals important similarities between production and comprehension and also helps establish links between language and other cognitive domains. Finally, we argue that studying how domain-general abilities are used in language leads to important insights into the highly dynamic communication between brain networks that is necessary to successfully go from sounds to stories.
Language is a cornerstone of human culture, allowing information to be shared across time and space. On a typical day, an adult speaker will produce around 15,000 words. Even producing just a single word requires several processing steps, such as preparing the concept that best conveys the intended message, selecting the appropriate word, and finding and arranging the sounds or sign constituents for which to prepare a motor plan. At the sentence level, the producer needs to sequence the words to express who is doing what to whom and to signal which information is new to the listener. In turn, comprehenders typically take in spoken, signed, or written words at a rate of two to three per second, and this rapid stream of information must be decoded, linked to stored knowledge, and used to build a message. The perceptual information is often unclear (“noisy”), words and phrases are ambiguous, and new information regularly prompts revision of the unfolding message. Thus, comprehending and producing language are among the most complex of human cognitive skills.
A long-standing view posits that humankind’s language capabilities are different from other cognitive abilities, requiring specialized processes, brain areas, and even genes (e.g., Pinker, 1994). An alternative view (e.g., Bates et al., 1996) emphasizes instead that identifying familiar stimuli and accessing their meaning (as in language comprehension), sequencing and planning actions (as in language production), and selecting, maintaining, and updating information over time (as in both) are processes that are used across a wide range of tasks (i.e., are
Getting Words Out
Even though speakers have tremendous experience with language production, because of its complexity and flexibility, this highly practiced skill is not fully automatic. In a seminal study, Ferreira and Pashler (2002) showed that producing even one word requires general cognitive resources. In their dual-task paradigm, participants named pictures while also discriminating simple tones. Picture-naming difficulty was varied, for instance, by making the pictures more or less predictable following a sentence context. Critically, making picture naming harder also slowed tone discrimination. Responding to the tone needed to be postponed as production planning was unfolding, indicating that these two tasks both draw from a domain-general set of resources.
Other methods, such as the study of individual differences, can reveal the specific kinds of domain-general abilities that are used during these production processes. For example, Shao, Roelofs, and Meyer (2012) had participants perform a picture-naming task and two standard tasks used to tap into domain-general abilities that do not rely on language: the operation span task, which measures the ability to maintain and manipulate (update) information in working memory, and the stop-signal task, which measures the ability to suppress prepotent responses (inhibition). Correlations between the tasks revealed that individuals with poorer updating and inhibition ability had a larger proportion of very slow naming trials. Jongman, Roelofs, and Meyer (2015) showed a similar relationship between picture naming and sustained attention, the ability to maintain focus over a prolonged period of time (as measured by a continuous performance test requiring participants to monitor for an infrequent visual target). Therefore, even for what appears to be a very simple production task, domain-general resources are required, and this relationship is most evident for slow trials—arguably those trials in which people failed to maintain attention, update, or exert inhibition.
To test the hypothesis that slow naming trials are, in part, due to a failure to maintain attention, Jongman, Roelofs, and Lewis (2020) used measures of brain electrical activity via the electroencephalogram (EEG), taking advantage of a common EEG finding from studies of perception of simple, nonlinguistic stimuli. Alpha power (the synchronized rhythmic activity at 8–12 Hz of a large population of neurons) measured before a stimulus is presented predicts later performance, including response times. Alpha power is linked to attention because it reflects the regulation of information flow through the inhibition of irrelevant brain areas. Good performance is preceded by a decrease in alpha power over task-relevant brain regions and an increase in alpha over task-irrelevant areas (e.g., Haegens, Luther, & Jensen, 2012). Jongman et al. found that alpha over motor areas predicts picture-naming latencies, suggesting motor regions need to be inhibited while early stages of production planning unfold. Unsuccessful inhibition (less alpha), due to a lapse of attention, results in increased interference from premature articulation preparation and its corresponding motor activity, which, in turn, results in slower naming.
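As a rough illustration of what "alpha power" in a prestimulus window means computationally, the Python sketch below sums spectral power in the 8–12 Hz band of a single EEG segment using a plain discrete Fourier transform. The names band_power and sine, the 250 Hz sampling rate, and the synthetic sine-wave input are assumptions made for illustration only. The estimation method used in the studies cited above is not described here, and published analyses typically use tapered or wavelet-based estimates rather than a bare transform.

```python
import cmath
import math


def band_power(signal, sfreq, fmin=8.0, fmax=12.0):
    """Mean-square power of `signal` within [fmin, fmax] Hz (naive DFT).

    Illustrative only: no windowing or tapering is applied, and the band
    is assumed to lie below the Nyquist frequency.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    power = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * sfreq / n  # frequency of DFT bin k in Hz
        if fmin <= freq <= fmax:
            coef = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(centered)
            )
            # Factor 2 folds in the matching negative-frequency bin.
            power += 2.0 * abs(coef) ** 2 / n ** 2
    return power


def sine(freq, amplitude, sfreq, n):
    """Synthetic test signal: a pure sine wave (hypothetical stand-in for EEG)."""
    return [amplitude * math.sin(2 * math.pi * freq * t / sfreq) for t in range(n)]


if __name__ == "__main__":
    sfreq, n = 250.0, 250  # one second of data at an assumed 250 Hz
    print(band_power(sine(10.0, 2.0, sfreq, n), sfreq))  # 10 Hz falls in the alpha band
    print(band_power(sine(20.0, 2.0, sfreq, n), sfreq))  # 20 Hz falls outside it
```

In a single-trial analysis of the kind described above, a value like this would be computed for the interval before each picture appears and then related to that trial's naming latency.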
Other measures of domain-general processes can be obtained from examining event-related potentials (ERPs), EEG signals that are yoked in time to events of interest. Shao, Roelofs, Acheson, and Meyer (2014) used the N2 to test whether inhibition is involved in word production. The N2, which occurs 200 ms to 350 ms after stimulus onset, has been characterized in a number of nonlinguistic tasks and is thought to reflect inhibition. N2s are observed, for example, when people need to withhold motor responses to stimuli in go/no-go tasks (for a review on the N2, see Folstein & Van Petten, 2008). In Shao et al.’s study, pictures with multiple possible names (e.g., “COUCH” vs. “SOFA”) not only were named more slowly but also produced a larger N2 compared with those with a single likely name. In other words, to produce a single response, participants must select one word and inhibit others. Relatedly, the N2 is increased when people need to switch between short and long descriptions (“the fork” to “the green fork”), and thus suppress the alternative structure, compared with when they use the same phrase type (Sikora, Roelofs, & Hermans, 2016). Thus, selection, and associated inhibition, also applies at the level of choosing the type of multiword structure to produce. Overall, although using EEG and ERP measures to study domain-general processes in production is a relatively recent approach, studies such as these add to the growing body of evidence that language production critically depends on the brain’s ability to coordinate the flow of information over time.
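To make the idea of a time-locked measure concrete, the following Python sketch averages stimulus-locked EEG epochs into an ERP and then takes the mean amplitude in a latency window such as the 200–350 ms range given above for the N2. The function names, the list-of-lists data layout, and the toy numbers are assumptions for illustration. It is a minimal sketch, not the analysis pipeline of any study cited here, and it omits filtering, baseline correction, and artifact rejection.

```python
def average_erp(epochs):
    """Average equal-length epochs (one list of samples per trial) point by point."""
    n_trials = len(epochs)
    return [sum(samples) / n_trials for samples in zip(*epochs)]


def mean_amplitude(erp, sfreq, tmin, tmax, epoch_start=0.0):
    """Mean amplitude of `erp` between tmin and tmax (seconds, inclusive).

    `epoch_start` is the time of the first sample relative to stimulus onset.
    """
    first = round((tmin - epoch_start) * sfreq)
    last = round((tmax - epoch_start) * sfreq)
    window = erp[first:last + 1]
    return sum(window) / len(window)


if __name__ == "__main__":
    # Two toy trials sampled at an assumed 10 Hz, starting at stimulus onset.
    epochs = [
        [0.0, 0.0, -2.0, -2.0, 0.0],
        [0.0, 0.0, -4.0, -4.0, 0.0],
    ]
    erp = average_erp(epochs)
    print(erp)
    print(mean_amplitude(erp, sfreq=10.0, tmin=0.2, tmax=0.3))
```

Comparing this window average across conditions (for example, pictures with one likely name versus several) is one simple way an effect such as a "larger N2" can be quantified.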
Getting Words In
At its heart, understanding language involves deriving meaning from sequences of words—that is, linking the perceptual information that makes up a word in some language to knowledge stored in long-term memory. Because comprehension is inherently an internal state, not correlated with any particular behavior, EEG and ERP studies have provided a particularly important view of the processing involved. Accessing meaning has been linked to the N400, an ERP response peaking around 400 ms. The size (but not the timing) of the N400 is reduced to the extent that an input, such as a word, fits the context in which it is encountered and is thus easier to comprehend. Studies using the N400 have revealed that meaning access does not wait until a stimulus is individuated and identified. Instead, when processing a word such as
Importantly, the N400 is not specific to words but is observed whenever people encounter any kind of meaningful stimulus, including pictures, faces, environmental sounds, and gestures (Kutas & Federmeier, 2011). Thus, N400 studies of word comprehension have important implications for theories of object and face processing, which have tended to assume that meaning access should take place at different times for different types of stimuli, depending on when competition between similar perceptual forms has been resolved. Instead, the N400 shows that to comprehend meaning, whether within or outside of language, the brain uses time to coordinate complex patterns of activation across the widespread brain network that makes up semantic memory (Federmeier, Kutas, & Dickson, 2016).
The relatively automatic but also noisy process of accessing meaning often seems to be sufficient to allow us to make links between perception and knowledge. However, in some cases, additional mechanisms are needed to help focus in on and select the right meaning information, and there has long been interest in how this ability may draw on domain-general mechanisms (e.g., Balota, Cortese, & Wenke, 2001; Gernsbacher & Faust, 1991). Ambiguous words such as
Predicting and Adapting
Although comprehension can be passive, evidence has accumulated that comprehenders sometimes draw on processes similar to those used in language production. That is, the brain predicts (covertly produces?) what is likely to come up next—which concepts or words and what they might look or sound like. In ERP measures, prediction is seen in a pattern of effects distributed over time, which build up as sentences unfold, making words increasingly easier to perceive and understand (Brothers, Swaab, & Traxler, 2015; Payne, Lee, & Federmeier, 2015) as well as affecting what people remember (Hubbard, Rommers, Jacobs, & Federmeier, 2019; Rommers & Federmeier, 2018). As observed in production studies, prediction efficacy is influenced by individual differences, including age, literacy, and fluency (Federmeier, Kutas, & Schul, 2010; Ng, Payne, Stine-Morrow, & Federmeier, 2018; Wlotko, Federmeier, & Kutas, 2012). Moreover, regulation of attention again seems to play an important role. Payne, Federmeier, and Stine-Morrow (2020) found that eye-movement patterns among poorer readers showed a higher proportion of very slow fixation times that were not driven by properties (e.g., length or frequency) of the words themselves—a pattern suggestive of attentional lapses. In addition, like production, successful prediction has been linked to prestimulus alpha-power decreases in EEG activity (Rommers, Dickson, Norton, Wlotko, & Federmeier, 2017), again highlighting the necessity of regulation of information flow. Thus, the kind of domain-general resources used to successfully produce language may also allow active engagement during comprehension, doing preparatory work that can ease the difficulty of keeping up with a rapid stream of information.
Language comprehension also critically involves being able to deal with uncertainty and engaging in rapid and flexible updating. Words that are related to the current context (and the events that are being described; e.g., Metusalem et al., 2012) are known to be activated in semantic memory even before they are encountered. Across a series of studies, Szewczyk and colleagues (Szewczyk & Schriefers, 2013; Szewczyk & Wodniecka, 2020) have shown that such activations can be rapidly updated, before the critical word appears. For example, the sentence context “The other driver was so angry he threatened him with a . . . ” is compatible with several different continuations, including “gun” or “lawsuit.” After next reading an adjective such as

Cloze probabilities and event-related potentials (ERPs). Mean cloze probability (the proportion of people who produce a given word as a continuation of the sentence) of nouns (“gun” or “lawsuit”) is shown (a) following one of two adjectives (“loaded” or “civil”) or the sentence without any adjective. Cloze probabilities are a good index of words’ predictability. As can be seen, the adjectives altered the predictability of the nouns in both directions, making nouns both more and less likely to be produced. To assess whether the actual expectations that readers have in the moment are updated by the adjectives, we asked participants to read for comprehension while their ERPs were recorded. ERPs elicited by the noun are shown (b) as a function of whether it was preceded by the adjective “loaded” or “civil” or no adjective at all. The modulations of ERP waveforms in the 300-ms to 500-ms time window reflect the N400 component, an index of semantic-memory access. The results show that adjectives update predictions by increasing or decreasing the activation of the noun, which is reflected by the N400 amplitude to the noun (the more the waveform goes in the negative direction, the less activated the noun). Note that following the convention in the ERP literature, the
In other cases, comprehenders need to get ready for entirely new kinds of information after encountering a particular word. For example, after reading “He always . . . ,” the comprehension system should expect various verbs. Once a particular verb (say, “hid”) has been obtained, however, the other verbs that might have been possible are no longer useful, and the comprehension system has to refocus on getting ready for a new set of possible continuations, concerning things that could be hidden. There is evidence that this kind of updating unfolds differently. Words that signal that a completely new set of predictions needs to be generated elicit an N400 effect (Szewczyk & Wodniecka, 2020), which is larger for words that allow the generation of more specific predictions (Maess, Mamashli, Obleser, Helle, & Friederici, 2016).
Impressively, both types of updating can be elicited not only by the meaning of words but also by much subtler cues: words’ grammatical markers. For example, in many languages, a determiner or a marker on an adjective indicates the grammatical gender of the upcoming noun, which can match the gender of all, some, or none of the already predicted nouns. Markers that do not match any of the predictable words elicit ERP effects (Szewczyk & Schriefers, 2013; Van Berkum, Brown, Zwitserlood, Kooijman, & Hagoort, 2005; Wicha, Moreno, & Kutas, 2004). For instance, the Spanish version of the sentence fragment, “The story of Excalibur says that the young King Arthur removed from a large stone a [masculine] . . . ,” cannot end with the most predictable word “sword” because it carries the feminine gender and would thus mismatch the preceding masculine determiner (Wicha et al., 2004). Readers’ brains are sensitive to this information, and they use it to update their expectations about the upcoming noun (Szewczyk & Wodniecka, 2020). Thus, it seems that the brain takes into account every available cue, be it semantic or morphological, to update its predictions. Crucially, age-related changes in attention and other domain-general skills reduce the ability to engage in rapid updating (DeLong, Groppe, Urbach, & Kutas, 2012). Moreover, some aspects of updating (for example, processing a plausible unexpected word when a different word was predicted) have been linked to increased EEG activity in the theta band (3–8 Hz) over frontal brain areas (Rommers et al., 2017). Similar frontal theta increases have been observed in a wide variety of nonlanguage tasks in response to unexpected or conflicting information (Cavanagh & Frank, 2014), suggesting that unexpected language input recruits these same general mechanisms.
Conclusions
Humans’ ability to produce and comprehend language requires that the brain be able to plan ahead, get information active in a timely manner, and select among active representations (keeping some and suppressing others). It also needs to continuously monitor and update processing at multiple levels—from the sounds that make up a single word to the message in the context of the current conversation. Here, we have presented an overview of how these language skills are critically scaffolded by domain-general abilities, such as regulating information flow, sustaining attention, and exerting inhibition, among others (see Table 1). Studies of language are thus highlighting the dynamic and flexible way that large-scale processing networks, which allow complex cognition, are created through coordinated interactions across widespread regions of the brain (cf. Gratton, Laumann, Gordon, Adeyemo, & Petersen, 2016). This flexibility, in turn, is what affords the ability to fluently speak and effectively comprehend even as cognitive abilities and resources differ across people and change over both short (e.g., fatigue) and long (e.g., aging) time scales.
For researchers, findings such as these illuminate how language provides a natural environment for the study of complex processes that are of broad interest, such as prediction (cf. Hutchinson & Barrett, 2019). More generally, this kind of work promises to reveal how the brain achieves the combination that makes language so powerful and unique: the creativity that allows people to regularly produce and comprehend new utterances, expressing both new and old ideas, alongside the consistency that permits those utterances to create similar mental states in other humans, even those with very different backgrounds and experiences.
Table 1. Measures Used to Assess Domain-General Processes That Have Been Linked to Language
Note: RT = reaction time, EEG = electroencephalogram, ERP = event-related potential.
The tau parameter indexes the right tail of a distribution: A large tau reflects a large proportion of extremely slow trials.
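The note above does not say how tau is obtained. Assuming it refers to the exponential component of an ex-Gaussian distribution (a Gaussian with mean mu and standard deviation sigma plus an exponential with mean tau), which is the usual setting for this parameter in reaction-time research, the relation between tau and the right tail can be written as follows. The overall mean is mu + tau, the variance is sigma^2 + tau^2, and the skewness is 2 tau^3 / (sigma^2 + tau^2)^(3/2), so tau can be recovered from the sample standard deviation and skewness. The Python sketch below implements this method-of-moments estimate. The function names and simulated values are hypothetical, and published analyses more commonly use maximum-likelihood fitting.

```python
import random
import statistics


def ex_gaussian_moments(rts):
    """Method-of-moments estimates (mu, sigma, tau) for an ex-Gaussian.

    Assumes the reaction times in `rts` are not all identical.
    """
    n = len(rts)
    mean = statistics.fmean(rts)
    sd = statistics.pstdev(rts)
    # Sample skewness: third central moment divided by sd cubed.
    skew = sum((x - mean) ** 3 for x in rts) / (n * sd ** 3)
    # Ex-Gaussian skewness is 2 * tau^3 / sd^3; negative skew is clipped to 0.
    tau = sd * (max(skew, 0.0) / 2.0) ** (1.0 / 3.0)
    mu = mean - tau
    sigma = max(sd ** 2 - tau ** 2, 0.0) ** 0.5
    return mu, sigma, tau


def simulate_rts(n, mu, sigma, tau, seed=0):
    """Simulated reaction times in ms: Gaussian part plus exponential tail."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


if __name__ == "__main__":
    # Two hypothetical participants who differ only in the size of the slow tail.
    few_slow = simulate_rts(20000, mu=600.0, sigma=50.0, tau=100.0, seed=1)
    many_slow = simulate_rts(20000, mu=600.0, sigma=50.0, tau=200.0, seed=2)
    print(ex_gaussian_moments(few_slow))
    print(ex_gaussian_moments(many_slow))
```

In this simulation the two participants share the same Gaussian component, so the difference between them shows up mainly in tau, which is the sense in which a large tau reflects a larger proportion of very slow trials.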
