Abstract
Inspired by studies on a potential bilingual advantage in executive functions (EFs), evidence-based empirical research over the past decade has investigated whether interpreters (in particular simultaneous interpreters) possess an interpreter advantage in EFs. However, limited attempts have been made to set out a theoretical conceptualisation of the working mechanisms of EF components in the simultaneous interpreting (SI) process and the possible development of EFs in SI training. In an attempt to address this gap, this article first reviews the critical theoretical models of EFs in the field of psychology to explain the concept of EFs and their potential components. The possible working mechanisms of EF components in the SI process are then explained via comparison with the theoretical models of the three major components of EFs, namely inhibiting, shifting, and updating, in bilingual conversations and in the cognitive process of SI. Based on a review of the literature and a five-phase training model, a basic mapping of EF development onto the phases of SI training is proposed.
Keywords
1. Introduction
Executive functions (EFs) refer to a set of higher cognitive processes responsible for regulating individual thoughts and behaviours when completing a goal-directed task (Diamond, 2013; Friedman & Miyake, 2017). Although the debate about exactly how EFs should be classified continues, the prevailing view holds that they are a collective name for a set of interconnected but independent sub-functions rather than one unitary structure (Diamond, 2013; Miyake et al., 2000; Timarová et al., 2014). Based on the agreed assumption that cognitive abilities are dynamic and could be shaped by an individual’s life experience (Babcock & Vallesi, 2017), researchers in the fields of psychology, neuropsychology, and psycholinguistics have sought to determine whether the experience of everyday bilingualism influences cognition and, specifically, EFs. However, empirical findings on this question have been inconsistent to date. Early research by Bialystok and colleagues (2004) suggested a bilingual advantage in relation to the core EF of inhibition: middle-aged and older bilinguals outperformed their monolingual peers on the Simon task. This result was soon replicated and extended to children, young adults, and older adults (Bialystok et al., 2005). Interestingly, while bilingual advantages emerged for children, middle-aged and older adults, no such effect was observed for young adults. Building on these findings, later studies tested whether bilingualism also conferred benefits in the other EF subcomponents, such as shifting. Evidence for a shifting advantage was reported in several studies using the Colour–Shape task (Barac & Bialystok, 2012; Prior & Gollan, 2011; Prior & MacWhinney, 2010; Wiseheart et al., 2016). Yet replication attempts raised doubts. Morton and Harper (2007, 2009), for example, found no bilingual advantage in children once immigrant and socioeconomic status were controlled. 
Such null or contradictory results sparked a wave of studies that likewise failed to find consistent bilingual advantages, particularly in relation to inhibition (Paap et al., 2014; Paap & Sawi, 2014; Prior & Gollan, 2013). Against this background, a debate has emerged with regard to the existence of a bilingual advantage. As this debate continues, methodological issues in studies that reported such advantages, such as small sample sizes, sampling biases, and inconsistent effects, have been identified (Paap, 2023). Despite their strong stance against the existence of a broad bilingual advantage in EFs, Paap and colleagues have acknowledged that if such an effect exists, it is unlikely to reflect a general enhancement of executive functions. Rather, it may occur only under “very specific and underdetermined circumstances” (Paap, 2023, p. 10). This perspective resonates with the Adaptive Control Model (Green & Abutalebi, 2013), which proposes that the degree and pattern of language control demands vary across distinct interactional contexts.
In this respect, simultaneous interpreting (SI) represents one special form of bilingualism. As one of the dominant modes of mainstream international conference interpreting, it requires interpreters to render target speeches in real time with a lag of just a few seconds behind the source speech (Setton & Dawrant, 2016a). Like other types of bilingualism, SI involves basic subtasks, such as analytical listening, comprehension, and language production (Macnamara et al., 2011). Yet the cognitive demands are unique in that SI requires the concurrent execution of multiple subtasks, the simultaneous activation of two language systems, and the management of input information whose content and delivery pace cannot be controlled (Setton, 1999). These unique demands have led researchers to hypothesise that interpreters may exhibit enhanced executive functions compared with non-interpreters.
Against this background, a growing body of empirical studies has shifted from debating the general bilingual advantage to examining a potential interpreter advantage in EFs, typically comparing interpreters’ performance on EF tasks with that of non-interpreters, including bilinguals without interpreting experience and monolinguals. However, findings in this area have also been mixed. For instance, a systematic review and meta-analysis by Hu and Fan (2021) found no evidence supporting an interpreter advantage with regard to inhibiting but reported substantial evidence for an advantage with regard to shifting. Their results on updating were more nuanced: although interpreting training appeared to enhance updating, no consistent advantage was observed when professional interpreters were compared with control groups.
Despite these mixed results, existing studies have largely been grounded in theories and inspired by empirical evidence developed in the broader bilingualism field. Little attention has been devoted to theorising how EFs operate specifically within the process of SI. Given that SI represents a fundamentally different type of bilingual experience, there is a clear need for theoretical models that conceptualise the relationship between SI and EFs. To this end, the current study does not seek to test either the interpreter advantage or the bilingual advantage; instead, it aims to explore the association between SI training and EFs in a theoretical manner, by mapping how EFs might develop across the various pedagogical phases and sequences of SI training, with reference to the SI training model established by Setton and Dawrant (2016a). This model—while hypothetical and requiring further empirical validation—was chosen for two main reasons. First, its five pedagogical phases provide a useful framework for correlating the development of EFs with the timing of SI training. Second, its detailed account of everyday classroom practices and exercises enables analysis of the ways in which specific SI training methods might foster the development of particular EF subcomponents.
To achieve the research objective, the analysis begins with an overview of the theoretical components of EFs and identifies the components that are most frequently examined in the empirical literature (Section 2). Given that research on the interpreter advantage has been largely inspired by and grounded in studies on the bilingual advantage, Section 3 first explores the rationale for and mechanisms by which the most frequently examined EF components are engaged in normal bilingual conversations. Building on this foundation and taking into account the similarities and differences between normal bilingual conversation and SI, the section then explains how these EF components function within the SI process. Finally, Section 4 proposes a theoretical framework that traces the process of SI skill acquisition and its potential link to the development of EFs.
2. EF components and definitions
A range of theoretical models have been developed to structure the components of EFs. Notable contributions include models proposed by Luriia (1973), Stuss and Benson (1986), Sohlberg and Mateer (1987, 2001), and Diamond (2013). These models collectively outline components such as anticipation, task initiation and planning, task execution and maintenance, task organisation, self-monitoring, response inhibition, creative thinking, working memory updating, task shifting (cognitive flexibility), and reasoning. Among these, inhibiting, shifting, and working memory updating (hereinafter referred to as “updating”) have been widely recognised as the core components and are most frequently examined in empirical research (Miyake et al., 2000).
Given the ongoing debate regarding the structure of EFs, Miyake et al. (2000) conducted a confirmatory factor analysis to investigate the interrelations among these three core components. Their findings suggested that the components are distinguishable but interrelated, leading to the conclusion that EFs show both unity and diversity. Although this model is not considered comprehensive in terms of encompassing all EFs, it provides evidence for the relationships between these three dissociable components. In this framework, inhibiting was defined as the ability to deliberately inhibit dominant, automatic, or prepotent responses and control one’s attention, behaviour, thoughts, and emotions (Diamond, 2013; Miyake et al., 2000); shifting was defined as the ability to switch between multiple tasks and mindsets, learn from mistakes, devise alternative strategies, divide attention, and process multiple sources of information simultaneously (Anderson, 2002; Monsell, 2003); and updating was defined as the monitoring and coding of incoming new and more relevant information to replace the old and no longer relevant information held temporarily in working memory (Miyake et al., 2000; Morris & Jones, 1990).
Given that shifting and inhibiting are particularly relevant to multilingual input and output (Keller et al., 2020), and that updating, which keeps information available in the mind, is central to SI, it is posited that these three components of EFs are shaped by bilingualism/multilingualism and SI experience. Empirical and theoretical research has sought to determine whether bilingual and SI experience contribute to enhanced performance on behavioural tasks assessing these three EF components. To illustrate which of these components function in the SI process (and how), these empirical and theoretical studies are reviewed in the following section.
3. EF working mechanism in SI: Beyond bilingualism
Motivated by both empirical findings and theoretical discussions on the bilingual advantage in EFs, some researchers have extended this line of inquiry to interpreters. However, as with the broader bilingualism literature, studies on the so-called “interpreter advantage” have yielded mixed and often inconclusive results, leaving open the question of whether interpreters systematically differ from non-interpreters in their EF performance. Importantly, many of these empirical works are grounded in theories originally developed in bilingualism research, with relatively little attention devoted to theorising the way in which EFs operate specifically within the process of SI. To advance this discussion, it is therefore critical to investigate how EFs are engaged within the SI process. This requires, first, an understanding of why the bilingual advantage paradigm provides a useful conceptual starting point (Section 3.1), and second, an exploration of how the unique features of SI, in contrast to general bilingualism, may shape EF engagement in unique ways (Section 3.2).
3.1 Executive functions in bilingualism
Bilingualism or multilingualism refers to the coexistence of two or more language systems within an individual’s mind in different interactional contexts (Hakuta, 2009; Verplaetse & Schmitt, 2010; Wei, 2006). To communicate efficiently, bilinguals and multilinguals must select a target language while simultaneously controlling interference from the non-target language. According to Babcock (2024), the mechanisms underlying this language control have been conceptualised in different ways. Early accounts proposed the existence of a mental “switch” that enables speakers to select the relevant language, namely the language in use (Penfield & Roberts, 1959), whereas more recent models emphasise that both languages remain coactivated (e.g., Spivey & Marian, 1999; van Hell & Dijkstra, 2002; van Heuven et al., 2008), thereby requiring bilinguals not only to manage their coexistence but also to inhibit cross-linguistic interference.
Among the proposed language control models, one of the earliest and most influential models is the Inhibitory Control (IC) model (Green, 1998). This model is built on three key assumptions. The first is control, according to which temporary speech disruptions are attributed to failures of control rather than to insufficient linguistic knowledge. The second is activation, which suggests that lexical candidates from different language systems vary in activation levels, with the most highly activated item being produced. The third is resources, which holds that when available cognitive resources are insufficient to meet control demands, bilingual language control becomes impaired. Importantly, Green (1998) argued that bilingual control depends primarily on suppressing the activation of the non-target language, rather than simply selecting the target one, making inhibition the central mechanism of the model. On this basis, it has been hypothesised that bilinguals and multilinguals, through their consistent and frequent engagement in such control processes, may strengthen their inhibiting, one subcomponent of domain-general EFs. This assumption provided the foundation for the early empirical investigations into the bilingual advantage in EFs, most notably those conducted by Bialystok et al. (2004) and Bialystok et al. (2005). While these studies offered influential evidence in support of the hypothesis, they have also attracted substantial criticism, particularly regarding consistency and generalisability of the findings (e.g., Paap, 2023).
Building on this work, Bialystok and Craik (2010) further emphasised that two languages remain coactivated and that bilinguals must intentionally select the target language while inhibiting interference from the alternative. Within this framework, two types of inhibiting exist: (1) suppression of prepotent and automatic responses and (2) resistance to interference. However, subsequent studies have yielded mixed results. For instance, when fMRI was used to characterise brain activation changes on a single task designed to examine activations related to interference suppression and response inhibiting, bilinguals were observed to have an advantage over monolinguals with regard to interference suppression, but not response inhibiting (Dong & Zhong, 2017). Similarly, a review of 31 experiments employing Simon and Flanker tasks, two tests measuring inhibiting skills (see Section 3.2.1), reported only partial support for a bilingual advantage in inhibiting (Hilchey & Klein, 2011). These inconsistencies highlight the ongoing debate over the existence of a bilingual advantage in EFs.
Despite their prominent critiques, Paap and colleagues, who are among the most influential opponents of this hypothesis, have also acknowledged that if such an advantage exists, it is unlikely to reflect a broad, general enhancement of EFs. Instead, they have argued that any potential advantage would be “restricted to very specific and underdetermined circumstances” (Paap, 2023, p. 10), such as particular populations or specific types of cognitive control. This perspective resonates with the Adaptive Control Model (Green & Abutalebi, 2013), which emphasises that the extent and nature of bilingual cognitive effects depend on the interactional contexts in which language use occurs. The model distinguishes three types of real-world interactional contexts: (1) single-language context, where one language is used exclusively in one environment, while the second language in another; (2) dual-language context, where two or more languages may be used within the same conversation but with different speakers; and (3) dense code-switching context, where two languages are interleaved within one utterance.
In a single-language context, interlocutors must inhibit interference from the non-target language, thereby strengthening an inhibiting process that may extend to domain-general tasks, a mechanism also highlighted in the IC model (Green, 1998). In a dual-language context, more than one interlocutor is involved in a conversation, and speakers may use different languages when addressing different interlocutors. As a consequence, language switching may occur. Extensive exposure to this condition may facilitate enhancement not only of inhibiting but also of shifting. In a dense code-switching context, however, both languages remain highly activated and are used together within utterances. Bilinguals under this condition, therefore, might not exhibit the advantage in inhibiting expected under the other two conditions. The Adaptive Control Model (Green & Abutalebi, 2013) thus conceptualises how differences in the degree and mechanisms of language control could shape an individual’s EFs, providing a more nuanced account of the potential sources of bilingual EF advantages.
SI is an instantiation of multitasking which involves the execution of a language comprehension task and a language production task simultaneously (Seeber, 2011). In this multitasking process, interpreters need to listen analytically to the source speech, keep the analysed information in the short-term memory, and produce it in the target language while monitoring that output at the same time. Although the basic subtasks in the SI process are similar to those in normal bilingual conversation where bilinguals communicate using their shared languages, these two processes are fundamentally different in five respects (Setton, 1999, p. 2): (1) Use of speech systems; (2) Goal orientation; (3) Pacing; (4) Sourcing; and (5) Input and output languages (see Table 1). These features of SI make its language processing mechanism more distinctive and cognitively demanding, and therefore capable of shaping EF skills in a different way.
Table 1. Differences Between Normal Bilingual Conversation and SI (Setton, 1999, p. 2).
To conceptualise how the experience of SI might affect EFs, the differences in task characteristics between SI and normal bilingual conversation are addressed in the following section.
3.2 Simultaneous interpreting as an extreme example of bilingualism
Two EF components, namely the inhibiting and shifting skills, which are often focused on in studies aiming to prove the existence of a bilingual advantage, have also been hypothesised to be positively affected by SI experience.
3.2.1 Inhibiting in the SI process
As the basic subtasks overlap with each other in the SI process, interpreters could encounter three types of interference which would hinder analytical listening to the source speech and therefore negatively affect target language production: (1) interference from environmental distractors, such as poor sound or an unclear voice caused by faulty SI equipment or by the speaker’s poor articulation; (2) interference from the interpreter’s own voice while producing and monitoring the rendition in the target language; and (3) interference from multiple information sources (e.g., SI with text). Early studies hypothesised that the frequent need to manage interference would strengthen interpreters’ inhibiting. However, empirical evidence has not supported this assumption. Across a range of studies using inhibiting tasks such as the Flanker task and the Attention Network Test, interpreters have not consistently outperformed comparison groups (e.g., Babcock et al., 2017; Babcock & Vallesi, 2017; Dong & Xie, 2014; Morales et al., 2015; Nour et al., 2020; Van der Linden et al., 2018).
When interpreters’ performances on SI tasks under two noisy conditions were compared with a no-noise condition, more errors and omissions were found in the former (Gerver, 1974), which indicated the existence of a negative noise effect on SI performance. To guarantee the quality of the target speech in this case, interpreters need to stay focused on the content of the source speech and the production of the target speech while inhibiting the interference of environmental distractors as much as possible. The interference inhibiting process in this condition is similar to most cases of normal bilingual conversation when the alternative language is viewed as a distractor, and its activation is inhibited (Green, 1998; Green & Abutalebi, 2013).
Inhibiting skills are often measured by the Flanker task (Eriksen & Eriksen, 1974) and the Attention Network Test (ANT) (Fan et al., 2002), in which individuals are presented with a series of visual stimuli containing content both relevant and irrelevant to the task goal. The essence of task completion is to achieve the task goal according to the relevant content while inhibiting interference from the irrelevant content. The general assumption is that rich experience in an activity that exercises particular cognitive mechanisms will be revealed by improvement in the executive functions involved, even when these are measured through domain-general behavioural tasks. On this basis, it has been assumed that interpreters may have better inhibiting skills than non-interpreters. However, results across studies showed no significant differences between interpreters and comparison groups (Babcock et al., 2017; Babcock & Vallesi, 2017; Dong & Xie, 2014; Morales et al., 2015; Van der Linden et al., 2018; Woumans et al., 2015). One possible reason for these results is the low likelihood of encountering the first type of interference in both SI practice and training contexts. Without frequent exercise of the cognitive mechanisms that require inhibiting skills, SI experience itself does not provide reliable benefits for inhibiting, and interpreters therefore could not be expected to possess stronger inhibiting skills than non-interpreters.
The other possible reason could be related to the modality of interference. As mentioned, the distractors in the Flanker task and ANT are content irrelevant to the task goal, which is presented to the individuals together with the relevant content in one visual stimulus. However, the environmental distractors that interpreters often encounter are primarily auditory noises. Therefore, the differences in the modality of the interference presentation could possibly lead to task insensitivity.
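The interference logic of the Flanker task described above can be illustrated with a minimal scoring sketch. It assumes one common convention, namely the difference in mean reaction time between incongruent and congruent trials on correct responses only; the cited studies may have scored the task differently. The field names (`condition`, `rt_ms`, `correct`) and the trial values are hypothetical and invented purely for illustration.

```python
from statistics import mean


def flanker_interference(trials):
    """Mean RT on incongruent trials minus mean RT on congruent trials.

    Only correct responses are scored (an assumed convention). A larger
    value indicates more interference from the irrelevant flankers.
    """
    congruent = [t["rt_ms"] for t in trials
                 if t["correct"] and t["condition"] == "congruent"]
    incongruent = [t["rt_ms"] for t in trials
                   if t["correct"] and t["condition"] == "incongruent"]
    if not congruent or not incongruent:
        raise ValueError("need at least one correct trial per condition")
    return mean(incongruent) - mean(congruent)


# Hypothetical trials, not data from any cited study.
trials = [
    {"condition": "congruent", "rt_ms": 420, "correct": True},
    {"condition": "congruent", "rt_ms": 440, "correct": True},
    {"condition": "incongruent", "rt_ms": 500, "correct": True},
    {"condition": "incongruent", "rt_ms": 520, "correct": True},
    {"condition": "incongruent", "rt_ms": 900, "correct": False},  # excluded
]

effect_ms = flanker_interference(trials)
print(f"Interference effect: {effect_ms} ms")
```

Under this convention, a group-level inhibiting advantage would appear as a smaller interference effect in one group than in the other, which is the kind of difference the studies cited above did not consistently find.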
Interpreters’ ability to control the interference of irrelevant auditory information proved to be better than that of non-interpreters in complex span tasks with articulatory suppression, which were originally designed to test an individual’s working memory capacity, especially the phonological loop. In these tasks, participants listened to a series of words and memorised as many as possible while overtly verbalising meaningless syllables, such as “bla.” It is suggested that the repetition of the meaningless syllables could disrupt the memorisation of the target words because the memorisation of verbal information depends on its rehearsal in the phonological loop. As a consequence, to minimise the interference created by the production of meaningless syllables, participants could either ignore what they verbalised, which relies on inhibiting, or engage in parallel processing of two auditory information streams, which is facilitated by rapid switching. This process resembles SI in the overlap between hearing the source speech delivered by the speaker and hearing one’s own target speech production (the second type of interference). If inhibiting were the true reason why interpreters perform better in complex span tasks with articulatory suppression, interpreters would be expected to ignore their own target language production. This is clearly different from what interpreters do in the real world, where they need to listen actively to the speaker in order to analyse the source speech while also monitoring their own rendition. This indicates that the apparent advantage found in tasks involving articulatory suppression cannot be straightforwardly interpreted as stronger inhibiting skills in interpreters.
Gerver (1975) conceptualised the output procedure of SI as a process of testing: the source speech is comprehended and stored in memory for comparison with the target speech. Interpreters could carry out the testing process either before or after uttering the target speech. If it happened before the utterance, it would result in a more satisfactory version of the target speech; if it happened after the utterance, it could be revealed by corrections of errors in the target speech. However, interpreters could also choose to leave errors in the output as they are and proceed to the next portion of information if they do not consider the errors to be critical. Therefore, errors that remain uncorrected could also be the result of a choice made by the interpreter after the testing process, rather than of the suppression or inhibiting of the produced target speech.
In normal bilingual conversations, when interlocutors are rendering and revising their speech, they do not simultaneously have to process new auditory input given that conversations are typically sequential with turn-taking and not simultaneous. In SI, by contrast, the listening task continues while interpreters deliver the target speech. The final version of the target speech is the product of a combination of listening to and comparing the source and target speech. This parallel language processing mechanism was part of what was named “coordination” (Gile, 2009) or viewed as an aspect of multitasking, an attentional control mechanism which can be realised through divided and selective attention. Divided attention, also named “split attention” by Dong and Li (2020) and Kahneman (1973), was defined as the allocation of attention to two or more channels of information at the same time. The information channels could use either one sense (e.g., hearing) or two or more senses (e.g., hearing and vision; VandenBos, 2015). Selective attention refers to the selection of the most important channel or channels to attend to by inhibiting the peripheral or incidental information channels or increasing the intensity of the important ones (Kahneman, 1973).
Listening to the source and target speeches simultaneously requires divided attention, whereas selecting which of them to respond to is determined by the relative importance of the two information channels, which varies according to the situation and task goals. For instance, if errors are not detected in the target speech or are considered non-critical by the interpreter, the source speech is assigned a higher level of importance and is processed and translated into the next portion of the target speech. However, if the errors are considered critical and need to be revised without delay, the importance of the target speech channel is increased. In that case, interpreters either revise the error immediately and store the incoming new information in working memory for a later response, or carry on rendering the incoming new information and store the pending corrections in working memory so as to make them later, such as at the next pause in the source speech. As the importance levels of the two information channels keep changing, interpreters need to shift their attention between the two channels frequently. As a consequence, the interference control required by the overlapping of listening and speaking in the SI process is conceivably realised mainly through shifting rather than inhibiting.
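The channel-prioritisation account above can be sketched as a toy model. This is a deliberate simplification and not a validated model of SI: it assumes that importance is binary rather than graded, and that the target channel is prioritised only when an error is both detected and judged critical. All names (`Channel`, `prioritised_channel`, `count_shifts`) and the event sequence are hypothetical.

```python
from enum import Enum


class Channel(Enum):
    SOURCE = "source speech"
    TARGET = "target speech (own output)"


def prioritised_channel(error_detected: bool, error_critical: bool) -> Channel:
    """Return the channel given selective attention for the next segment.

    Assumption: only a detected, critical error raises the importance of
    the target channel; otherwise the source speech keeps priority.
    """
    if error_detected and error_critical:
        return Channel.TARGET
    return Channel.SOURCE


def count_shifts(events):
    """Count changes of the prioritised channel across a run of segments.

    `events` is a sequence of (error_detected, error_critical) pairs,
    one per segment of the interpreter's output.
    """
    shifts = 0
    previous = None
    for detected, critical in events:
        current = prioritised_channel(detected, critical)
        if previous is not None and current is not previous:
            shifts += 1
        previous = current
    return shifts


# Hypothetical monitoring outcomes for five consecutive segments.
events = [(False, False), (True, False), (True, True), (False, False), (True, True)]
print(count_shifts(events))
```

In this sketch both channels are monitored on every segment (divided attention), and control is exercised only by reassigning priority between them (selective attention). No channel is ever suppressed, which mirrors the argument that the process relies on shifting rather than inhibiting.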
Simultaneous interpreting with text, known as SIMTXT (Lambert, 2004) or SI with text (Seeber & Carmen, 2020), refers to a specific SI scenario in which interpreters are presented not only with the source speech delivered through the headphones but also with the visual input of written manuscripts. As interpreters are faced with the situation of exploiting “dual input (auditory and visual)” in this scenario, it has become one of the concerns of interpreting practitioners, trainers and researchers (e.g., Seeber, 2017). Although some researchers believe that visual input assists interpreters given that it could compensate for aspects that lack clarity in the audio source speech, others deem it to create an additional level of difficulty. This is because the increased demand to process more information channels would likely result in interference between the information channels and therefore an increased cognitive load during the process (Cammoun-Claveria et al., 2009). This kind of interference is the third interference type in the current study.
According to the results of a survey on 50 AIIC professional interpreters (Cammoun-Claveria et al., 2009), when the additional visual information channel is aligned with the major auditory information channel (speaker’s live speech), interpreters can store the content as a complement to the major auditory information channel. In this situation, interpreters’ attention will be divided between the information channels with a particular focus on the major auditory information channel. When the major auditory channel is not sufficiently clear, interpreters would likely shift their attention from the auditory input to the visual input by downgrading the importance level of the former and upgrading the importance level of the latter. When the additional information channel contradicts the major auditory information channel, interpreters would simply ignore the written text without processing it. As a consequence, divided attention and selective attention (enabled by shifting) could conceivably account for the way interpreters deal with dual input situations.
3.2.2 Shifting in the SI process
The activation of two languages simultaneously in the process of SI not only requires interpreters to control any interference but also to frequently and regularly switch between the two languages since the listening, comprehension, and production processes are executed concurrently in a continuous flow and within a limited timeframe. Compared with bilinguals who generally use one language and switch to another when that language is no longer efficient or practicable (Green & Abutalebi, 2013), interpreters must switch between two languages, using one of them for comprehension and the other for production (Togato & Macizo, 2023). As a consequence, the frequency of language switching and the cognitive demands are higher in the SI process, leading to an assumption that simultaneous interpreters have better shifting skills than bilinguals without interpreting experience.
This assumption is supported by both cross-sectional and longitudinal research in recent years, which has examined the interpreter advantage in shifting as well as the effects of interpreting training or practice. These studies typically compared the performance of interpreter groups on shifting tasks with that of non-interpreter control groups, or investigated the differences between interpreting students’ pre-training and post-training performance on shifting tasks. For instance, the most recent meta-analysis conducted by Hu and Fan (2021) reported that only a limited number of studies (4 of 18) found no significant group differences in shifting task performance between interpreters and non-interpreter control groups. By contrast, six longitudinal studies reviewed demonstrated significant training or practice effects. Among the studies included in Hu and Fan’s (2021) analysis, most that reported a significant interpreter advantage employed the Wisconsin Card Sorting Test. The four studies that did not find such an advantage used different measures: one employed the Plus-minus Task, two used a task-switching paradigm, and one used the Number-letter Task. This pattern suggests that the inconsistencies may, at least in part, stem from the choice of measurement tool. According to studies of training-induced cognitive change, transfer effects are most likely to occur when the training activity and the outcome task share overlapping cognitive processes or underlying mechanisms (Dahlin et al., 2008). In the context of SI, this means that only tasks that resemble the demands of SI are expected to reveal measurable advantages.
Therefore, although one cannot entirely rule out the possibility that interpreters do not systematically possess a shifting advantage, the balance of evidence is more consistent with the interpretation that SI experience enhances shifting, leading to an interpreter advantage when measured by tasks closely aligned with the shifting processes involved in SI.
In addition to the language-switching process, shifting skills may also contribute to the multitasking processes of SI in which interpreters are required to simultaneously execute different tasks, including analytical listening, information memorisation, and production. As mentioned above, multitasking is realised through the combination of divided attention and selective attention. Divided attention is the foundation of this multitasking process, as attention needs to be allocated to all the necessary subtasks of SI first to determine which of them is more important. After that, selective attention comes into play by intensifying attention to the critical task so as to prioritise its response. However, the level of importance of the subtasks might vary according to situations and task goals. Attention is therefore shifted between these tasks at a high speed and frequency, which renders the execution of all the subtasks almost simultaneous.
As the task-shifting mechanism demands a high level of cognitive flexibility, shifting skills are also called “cognitive flexibility,” a concept often discussed in interpreting aptitude studies in which researchers strive to identify potential cognitive aptitude (Timarová & Salaets, 2011). Although cognitive flexibility is not directly linked to interpreting or language processing, it is an essential component of the abilities which are considered significant in interpreting, namely finding innovative solutions to problems, and adaptive expertise (Timarová & Salaets, 2011). As a result, it was regarded as part of the hard skills which have potential for selecting appropriate interpreting students before they start their learning. Timarová and Salaets (2011) compared the initial performances of interpreting students and control groups, and the interpreting students who passed and failed the final examination, on the Wisconsin Card Sorting Test and found significant group differences, which suggests the existence of self-selection and predictability in terms of cognitive flexibility. This result is in alignment with the existing studies on interpreter advantage and the training effect (Babcock et al., 2017; Dong & Liu, 2016; Dong & Zhong, 2017; Macnamara & Conway, 2014).
3.2.3 Updating in the SI process
Updating information stored in the working memory system is a concept that is closely linked to and even used interchangeably with working memory (Karr et al., 2018). Updating is further claimed to be a prerequisite for successful SI (Mellinger & Hanson, 2019). Working memory is defined as the short-term maintenance and manipulation of information necessary for conducting complex cognitive processes (Baddeley, 1986). One of the variables that concerned researchers when measuring interpreters’ working memory capacity is the distinction between working memory and short-term memory (Mellinger & Hanson, 2019). As working memory was regarded as a combination of retaining (or holding in mind) and processing components, and short-term memory was solely regarded as retaining material in mind, short-term memory was classified by some as a subset of working memory (Engle et al., 1999). Although the updating process in EFs requires keeping track of old and new information to make a comparison between them and replace the old with the new, it goes beyond mere passive retention to encompass the active manipulation of information (Miyake et al., 2000).
Updating in the SI process could, therefore, play a more important role in processing the new incoming information than in simply retaining it, and can work in the SI process through the following two approaches: (1) updating the new information delivered by the speaker through analytical listening and memorisation of the source speech and (2) updating the processed information delivered by the interpreter, comparing it with the source speech information, and then deciding on the final version of the target speech. This process distinguishes itself in the following three aspects: (1) the interpreter has no control over the content, form, and speed of the source speech when updating its information (Seeber, 2015); (2) two information channels must be processed; and (3) after the final version of the target speech is rendered, attention to the new incoming source language is enhanced while the processed information is “forgotten,” which could be seen as a process of “flushing” the old information to make room for new information (Timarová et al., 2014).
Given the close link between working memory and updating, tasks measuring working memory capacity were often claimed to be capable of testing updating skills (García, 2014; Hu & Fan, 2021; Nour et al., 2020). These include the Letter Memory task (Morris & Jones, 1990), the N-back task (Jonides & Smith, 1997), and complex working memory span tasks (Bayliss et al., 2003). Among them, the N-back task is the most often used as a measure to test interpreters’ updating skills. Based on existing empirical studies, several meta-analyses and systematic reviews were employed to examine the relationships between EFs and interpreting (García, 2014; Hu & Fan, 2021; Nour et al., 2020). As the number of existing studies which were labelled as testing interpreters’ updating skills is limited, studies which used complex working memory span tasks and were not labelled as investigations of interpreters’ working memory capacity were also included as complementary data in these meta-analyses and systematic reviews (García, 2014; Hu & Fan, 2021; Nour et al., 2020).
The N-back task and complex span tasks were often administered either longitudinally, to examine the improvement in task performance during interpreting training, or cross-sectionally, to compare performances between interpreters and control groups. In N-back tasks, individuals are asked to decide whether the stimulus that appears on the screen matches the one that appeared n items earlier. Complex span tasks combine a memory task, which requires participants to memorise a set of items in the correct order, with a processing task, which involves judging whether equations are correct or whether sentences make sense. Although a meta-analysis investigating the relationship between working memory and interpreting (Mellinger & Hanson, 2019) found largely consistent results showing that professional interpreters outperform comparison groups on complex span tasks, inconsistent results were found when an N-back task was involved.
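The matching rule of the N-back task described above can be made concrete with a short sketch. It is offered only as an illustration under stated assumptions, not as the procedure used in any of the cited studies: the function names `n_back_targets` and `score_responses` are hypothetical, stimuli are represented as single letters, and accuracy is computed only over trials that have an item n positions earlier.

```python
from typing import List, Sequence


def n_back_targets(stimuli: Sequence[str], n: int) -> List[bool]:
    """Mark each trial True when it matches the stimulus shown n items earlier.

    The first n trials can never be targets because there is nothing
    n positions back to compare them with.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


def score_responses(stimuli: Sequence[str], responses: Sequence[bool], n: int) -> float:
    """Return the proportion of correct match / no-match decisions.

    Assumption for this sketch: only trials from position n onwards are scored.
    """
    targets = n_back_targets(stimuli, n)
    scored = list(zip(targets, responses))[n:]
    if not scored:
        return 0.0
    return sum(target == response for target, response in scored) / len(scored)


# Hypothetical 2-back sequence: the third and fifth letters repeat the
# letter shown two positions earlier.
sequence = list("ABACA")
targets = n_back_targets(sequence, 2)
accuracy = score_responses(sequence, [False, False, True, True, True], 2)
```

In this sketch a participant must hold the last n items in mind and replace the oldest one on every trial, which is why the task is treated as a measure of updating rather than of passive retention.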
When comparing the performances on the single N-back task (with only one source of information: visually presented stimuli) between interpreters and control groups across a 1-year training period, a significant improvement in average response time for the N-back task was observed in the interpreting group. By contrast, only a marginal improvement was noted in the translation group, and no statistical improvement was found in the English major group (Dong & Liu, 2016). In addition, a positive correlation was also detected when performances on single N-back tasks were related to the scores of interpreting in terms of correctly interpreted numbers (Timarová et al., 2014). However, no group differences between interpreters and comparison groups were found by Henrard and Van Daele (2017) or Van der Linden et al. (2018). Meanwhile, when the usage of extensive vocabulary in an interpreting task was related to performances on the single N-back task, a negative correlation was detected (Timarová et al., 2014). When the dual N-back (two information sources: one visual stimulus and one audio stimulus presented simultaneously) was employed, significant group differences were reported (Attanak et al., 2019; Henrard & Van Daele, 2017). Two factors might contribute to the inconsistencies of the results: (1) a dual N-back task resembles the SI process more than a single N-back task as it involves the updating of auditorily presented information and (2) a dual N-back task is more cognitively demanding due to its requirement for simultaneously handling two information sources and increased practice in multitasking could transfer into an advantage of updating information from multiple sources. However, it is also undeniable that these mixed results could suggest SI may not systematically benefit the process of updating, and that interpreters might not possess an advantage in this subcomponent of EFs compared with other bilinguals or control groups.
In summary, inhibiting, shifting, and updating are theoretically all involved in the SI process. However, the extent to which SI experience enhances these components and leads to an interpreter advantage remains inconsistent across studies:
Inhibiting: suppression of interference by environmental distractors. Unlike bilingual conversational contexts, professional interpreters and interpreting students only occasionally engage inhibiting processes (as explained in Section 3.2.1). This limited and less consistent demand may result in insufficient practice of inhibiting skills, making it possible that SI experience does not yield measurable benefits in relation to inhibiting. Consequently, interpreters may not necessarily demonstrate superior inhibiting skills compared with non-interpreters.
Shifting: switching between the source language and the target language, and between different task sets, including listening and analysis, understanding, reformulation, and speaking (Setton & Dawrant, 2016a, p. 256). Evidence suggests that SI experience is more likely to benefit shifting than not. In line with other studies on training-induced cognitive transfer (Dahlin et al., 2008), advantages are particularly observed in tasks that closely resemble the shifting mechanisms engaged during SI, indicating that interpreters plausibly hold a shifting advantage within the context of interpreting.
Updating: updating information delivered by the speaker (auditory and visual) and monitoring the target speech production to generate creative solutions to problems encountered in this process. While some studies report advantages for interpreters, the findings are inconsistent. This suggests that SI may facilitate updating under certain conditions (especially when tasks involve multiple information sources), yet one cannot entirely exclude the possibility that SI experience does not systematically enhance the updating process.
Based on the discussions above, a theoretical framework mapping the SI training process onto the development trajectory of EFs is proposed in the next section. However, as existing literature suggests that SI experience is unlikely to enhance inhibiting, this subcomponent of EFs is excluded from the theoretical framework.
4. The development of EFs in SI training
With increasing empirical evidence confirming the involvement of EFs, or at least of two components of EFs, namely shifting and updating, in the SI process, researchers have also been keen to investigate the training effect of interpreting on EFs. They have questioned whether SI training, which is aimed at equipping future interpreters with the necessary SI skills, could lead to EF enrichment. To answer this question, longitudinal studies were carried out to compare interpreting students’ EFs before and after training (Attanak et al., 2019; Babcock et al., 2017; Dong & Liu, 2016; Dong & Zhong, 2017; Macnamara & Conway, 2014). Comparing pre-test and post-test scores on an N-back task and complex span tasks, which measure the updating skills of interpreting students, Attanak et al. (2019), Dong et al. (2018), and Dong and Liu (2016) found significant differences, with better performance in the post-test. However, Macnamara and Conway (2014), using the complex span task, failed to find such a result. When the shifting skill is the focus, the results are consistent, with evidence supporting improved shifting skills after interpreting training (Babcock & Vallesi, 2017; Dong & Liu, 2016; Macnamara & Conway, 2014). For the inhibiting skill, which was measured by a Stroop task (Dong & Liu, 2016) and the ANT (Babcock & Vallesi, 2017), longitudinal studies failed to find a training effect of interpreting. It could be inferred, therefore, that the shifting and updating components of EFs could be enhanced through interpreting training, while inhibiting could not.
Despite positive findings supporting the potential impact of SI training on shifting and updating of EFs, the underlying mechanisms remain unclear. It is yet to be clarified whether the enhancement in shifting and updating after SI training stems solely from frequent practice of SI tasks or results from a combination of various practices aimed at developing SI competencies. In addition, the timing at which the enhancement of EFs becomes evident in domain-general EF tasks requires further investigation. Consequently, there is a need to identify theoretical frameworks linking SI training and EF development, as well as empirical evidence in this regard. To address the former concern, this section begins with an introduction to cognitive enrichment theory, which provides the theoretical basis for understanding how SI training could potentially lead to EF enhancement. This is followed by a review of existing models or handbooks of SI training. Based on this review, the SI training model proposed by Setton and Dawrant (2016a) is selected as the most comprehensive and pedagogical model, which can be used here to map how different training exercises and phases might be related to the development of EFs. Importantly, this article does not aim to prescribe how SI training should be tailored to enhance EFs; rather, as part of a broader longitudinal study examining the relationship between EFs and SI performance before and after training, its purpose is to establish the theoretical foundation for such an empirical investigation.
4.1 Cognitive enrichment theory
Cognitive enrichment theory (CET) posits that engaging in cognitively demanding activities can enhance cognitive functioning by strengthening underlying cognitive control processes and promoting neuroplasticity (Hertzog, 2009). It was first developed in the context of ageing research to explain why individuals who engage in mentally demanding activities, such as education, complex occupations, leisure activities, or bilingualism, tend to show slower rates of cognitive decline (e.g., Bialystok et al., 2004, 2008; Craik et al., 2010; Schmiedek et al., 2010). This line of research emphasised the protective role of sustained cognitive engagement, suggesting that such activities strengthen underlying control processes and contribute to the cognitive reserve. Building on these insights, subsequent work expanded into the domain of cognitive interventions and training-induced changes in brain and behaviour. For instance, young adults were shown to improve significantly in trained perceptual or motor tasks, such as visual motion discrimination (Ball & Sekuler, 1982), motor adaptation (Brashers-Krug et al., 1996), grating discrimination (Fiorentini & Berardi, 1980), and arm-reaching under altered dynamics (Gandolfo et al., 1996). However, these studies consistently revealed a striking specificity of learning: improvements were typically confined to the exact task that was practised, with little or no transfer to even highly similar untrained tasks. Indeed, large-scale training studies and meta-analyses have confirmed the lack of far transfer, showing that training gains rarely generalise to broader cognitive functions (Melby-Lervåg & Hulme, 2013; Owen et al., 2010; Redick et al., 2013; Sala & Gobet, 2017). Such findings suggest that transfer only occurs when the trained and transferred tasks engage overlapping cognitive processes and neural substrates (Dahlin et al., 2008).
This consistent evidence for the specificity of learning raises an important theoretical question: under what conditions does training lead to narrow, task-bound improvements versus broader, transferable gains? One influential framework that addresses this issue is reverse hierarchy theory (RHT), initially proposed by Ahissar and Hochstein (2004). RHT proposes that learning occurs at the highest level of representation sufficient for successful performance. When training targets higher-order and abstract representations, improvements are more likely to generalise across tasks because such training engages flexible, domain-general control processes. By contrast, when training depends on lower-level or highly specific representations, learning remains narrow and task-specific, with limited potential for transfer. This framework aligns with findings that training paradigms engaging EFs such as working memory updating and task-switching produce broader transfer effects than training confined to simple perceptual or motor routines (Lustig et al., 2009; Noack et al., 2009).
Building on this theoretical account, reverse hierarchy theory provides a useful framework for mapping the relationship between SI training and EF development, given that SI is widely recognised as a highly cognitively demanding activity that intensively engages core executive functions. As discussed in Section 3.2, the working mechanisms of SI inherently draw on two EF subcomponents, shifting and updating. From the perspective of CET, repeated engagement in such high-demand processes is likely to strengthen these functions over time. Moreover, because SI skills are acquired progressively rather than in a once-and-for-all manner and are cultivated through both holistic exercises and targeted drills (Seeber & Arbona, 2020), the RHT framework suggests that much of this learning will occur at higher representational levels, thereby creating greater potential for generalisation and transfer. Consequently, examining how phases of SI training map onto EF development offers a theoretically grounded way of understanding how the acquisition of SI skills may generate measurable cognitive benefits. The next section therefore introduces key SI training models and identifies the most suitable framework for aligning training phases with EF enhancement.
4.2 SI training models
The earliest SI training effort can be dated back to the Nuremberg Trials after the Second World War, when SI was used for the first time in an international multilingual trial. In the early stages, as no SI training method had been developed, the principle adopted in training was “learning by doing” (Kalina, 2015; Kalina & Barranco-Droege, 2022). Later, with the increasing demand for professional conference interpreters, universities began offering highly specialised training courses, and handbooks based on professional interpreters’ working experiences were written. However, as SI practices were initially limited, early handbooks introduced general training methods applicable to all interpreting modes or focused primarily on the consecutive mode.
The development of specific training methods for the SI mode only gained traction when simultaneous interpretation became a crucial component of conference interpreting (Kalina, 2015). One of the goals of SI training is to help students acquire SI mode-specific skills. Given the intricate nature of interpreting, it is agreed that interpreting, including SI, is composed of various interdependent subskills (Kalina, 1992; Kurz, 1992; Moser-Mercer et al., 1997; Setton & Dawrant, 2016b) which should be taught in a progressive manner, namely from easy to difficult. Nevertheless, perspectives on how to teach these subskills vary, with some supporting an integrative approach while others favour a deconstructive approach.
The integrative approach, as exemplified by the Interpretive Theory of Translation (Seleskovitch & Lederer, 1989, 2002), the curriculum-based SI training model by Sawyer (2004), and the incremental SI training model by Setton and Dawrant (2016a), advocates practice on tasks resembling the whole SI process while adjusting the emphasis to tackle specific challenges which students could encounter in the SI process. For instance, one exercise within this approach is “spoonfeeding,” wherein the instructor delivers the source speech as the speaker but pauses slightly after each sense unit within the sentence and again between sentences (Setton & Dawrant, 2016a). This type of exercise allows students to ease into real SI tasks while practising all the basic SI techniques.
By contrast, the deconstructive approach, as presented by Kalina (1992, 2000), Kurz (1992), and Moser-Mercer et al. (1997), argues that prior to engaging in actual SI tasks, SI training should focus on the subcomponents or subskills of the intricate SI process. These components can be effectively trained through specific exercises independently. For instance, Kalina (1992) suggested four blocks of exercises to equip students for real SI tasks: (1) general preparatory exercises like sight translation to enhance discourse processing skills under adverse conditions; (2) anticipation tasks to develop an understanding of the interplay between bottom-up and top-down processes in discourse processing; (3) tasks addressing language signposts to train language production monitoring and flexibility of expression; and (4) easy SI tasks involving simplified foreign texts with short sentences and linear constructions, covering information already addressed in earlier stages. Building on this atomistic perspective, Gile’s (1995, 2009) Effort Models conceptualised SI as the concurrency of four efforts, analytical listening, memory, production, and coordination, arguing that interpreters often work near cognitive saturation. This “tightrope” view has informed exercises such as stress-testing under speed or noise and self-diagnosis tasks to identify which effort has broken down. More practically, Gillies (2013) advanced a deconstructive philosophy with a repertoire of drills targeting core subskills such as anticipation, lag management, and reformulation, designed as flexible resources for trainers and students. 
Although this approach allows SI training to isolate problems, to focus on one variable (subskill) at a time, and to combine the separated variables into progressively more intricate structures (Kurz, 1992), the sequencing of the content and skills in training remains a problem to be addressed: should the separate subskills be taught in parallel or in sequence and, if in sequence, in what order? (Setton & Dawrant, 2016b).
The incremental and progressive SI training model proposed by Setton and Dawrant (2016a), recognised by a number of authors as the best-argued, most comprehensive and thoroughly explained SI training model (Aguirre Fernández Bravo, 2017; Bin, 2017; Kalina, 2017; Seeber, 2016), was regarded as a potential solution to the above-mentioned questions given its detailed description of day-to-day training classes, and their content (Kalina & Barranco-Droege, 2022). As a consequence, the model was considered suitable for use in an attempt to map the development of EFs in the process of SI training.
Exercises suggested in this model to strengthen specific skills of SI are fully explained from the perspective of their training purposes, operational guidelines, and expected outcomes after practising them. This detailed explanation provides an opportunity to deeply analyse the working mechanism of each exercise and investigate the possible interrelationships between the working mechanism of the exercise and the EFs, thereby supporting insight into the ways in which training in specific SI skills could affect the EFs. In addition, the model proposed by Setton and Dawrant (2016a) has divided the SI training process into five pedagogical phases, sequenced and scaffolded according to the level of difficulty. Exercises suggested to enhance specific SI skills are also distributed across the five pedagogical phases based on their training purpose and difficulty level. The linking between exercises and SI training phases enables the present authors to consider the SI acquisition process as well as the impact of SI training exercises on EFs in a chronological order, which facilitates the mapping of EF development onto the SI training process.
It is worth mentioning that this theoretical mapping of the development of EFs across the SI training process is hypothetical and is part of a larger project. In subsequent work, empirical evidence of this mapping will be gathered in the context of a dedicated Master’s level course in conference interpreting that aligns with the model proposed by Setton and Dawrant (2016a). In the sample curriculum, students develop their competence and skills in SI incrementally. SI practices are initially introduced in simple dialogic communication situations, after trainees have developed solid skills in both short and long consecutive interpreting, and are then further developed and taught in different contexts and scenarios appropriate to conference interpreting settings.
4.3 Mapping the development of EFs skills on the basis of the five pedagogical-phase training model of SI proposed by Setton and Dawrant (2016a)
4.3.1 Phase 1: initiation—updating enhancement and shifting initiation
The major objectives of this phase include (1) improving reformulation skills through source-text-based exercises and (2) experiencing split attention for the first time.
Although reformulation refers to the ability to use the target language correctly and idiomatically to convey messages intended in the source speech, the foundation to achieve this goal is the full understanding of the source speech, the core of the exercises designed for the first training phase. For instance, smart shadowing requires students to shadow the meaning rather than the form of the source speech through substituting words and phrases of the original speech. Although interlingual switching is not practised, deeper processing of the source speech is realised through intra-language switching. On-line cloze, which asks students to shadow speeches with missing words and phrases, easy SI, as well as frozen SI and spoonfeeding, which control how source speech is delivered, all require students to hear and understand everything in the source speech to make sense in the target speech. These exercises require accurate updating, monitoring, and comprehension of the incoming new information delivered in the source speech. Furthermore, creative solutions to relevant problems in this process not only benefit students in terms of being able to more deeply process the source speech but could also develop their domain-general updating skills.
Besides the improvement of reformulation skills, getting to understand split attention is another objective of the first phase. Through exercises designed for the first phase, students are asked to focus on the meaning rather than the form of the language. As a result, their attention shifts from form to meaning through either interlingual switching or intralingual switching. Although the demand for mindset shifting in each exercise is relatively low, students can gain a first taste of split attention during this phase, which could benefit non-linguistic shifting skills. Comparing the working mechanisms of updating and shifting in EFs with the exercises designed to polish the skill of source speech comprehension and to begin to manage split attention, it is reasonable to argue that through the first phase of training, the updating skill might be enhanced significantly while the development of the shifting skill might be initiated.
4.3.2 Phase 2: coordination—shifting enhancement
The objective of the second training phase is to further enhance split attention skills through learning to combine listening, thinking, and speaking in real time (Setton & Dawrant, 2016a). In the exercises designed for this phase, some aspects of the SI process are emphasised while others are not involved, which facilitates the gradual development of skills to coordinate different sub-processes of SI. For instance, faster spoonfeeding, which slows the speed and breaks the simultaneity through deliberately adding pauses to the source speech according to students’ performance, introduces students to the coordination of the listening, thinking, and speaking efforts. Two specific tasks are proposed to the students.
In the SI with Training Wheels exercise (Setton & Dawrant, 2016a, p. 261), the instructor or an invited speaker delivers a speech based on notes, and students are asked to interpret it consecutively. After that, a collective discussion of students’ consecutive interpreting performances helps to identify the problems of comprehension, reformulation/translation, and expression they encountered. Then, the speaker delivers the speech again, and students are asked to interpret it simultaneously.
In the Simultaneous-Consecutive exercise (Setton & Dawrant, 2016a, p. 261), students need to take notes similar to those taken in consecutive interpreting while listening to the source speech for the first time. However, instead of consecutively interpreting the source speech, the source speech is played to the students a second time so that they interpret it simultaneously, using the notes taken as a support, if needed.
These two tasks provide students with opportunities to listen twice to the source speech, which relieves the effort of analytical listening and reformulation, and facilitates better understanding of the SI components, reflection on their performance, and the adoption of proper SI techniques (Orlando & Hlavac, 2020). As students already know the content of the source text and have already reformulated it in consecutive interpreting, they can focus on SI skills (monitoring of output, fluency of delivery, tone, etc.). Unlike other exercises in this and the previous phase, these tasks impose time pressure and simultaneity. To realise the simultaneity of multiple tasks within limited time, students need to temporarily focus their attention on one or more of the SI task components which are considered more critical at that point and rapidly switch to the others every now and then. Extensive practice on these tasks, which require students to shift mindsets between SI subtasks, could result in better domain-general task-shifting ability.
4.3.3 Phase 3: experimentation—shifting enhancement
With the foundation laid during the previous phase, students are expected to be capable of adapting to more challenging situations. Therefore, the objective of the third phase is to stabilise the skills acquired in the previous two phases by practising materials of intermediate difficulty, which differ in genre, topic, and speaker style. Three outcomes can be predicted after training: (1) strengthened ability to anticipate incoming new information through familiarisation with the differences between two language systems and the potential problems which might be encountered; (2) enhanced cognitive flexibility in target language production as students would be more open to selecting the appropriate words, phrases and sentence structures to render their target language; and (3) enhanced cognitive flexibility in delivering the target language through better control over the rhythm, fluency, and EVS of the target speech production.
As frequent language control experience requires flexible cognition which could facilitate the selection of certain expressions within a language and shifting between languages, it would in turn benefit the shifting skill or cognitive flexibility (Dong & Li, 2020). Therefore, we can posit that with rich experience in language selection and target speech delivery in the process of SI, trainees’ domain-general shifting ability could be further enhanced.
4.3.4 Phase 4: consolidation—updating and shifting enhancement
The fourth SI training phase revolves around two major aims: (1) consolidating SI mode-specific skills and (2) consolidating the product (Setton & Dawrant, 2016a, p. 293). The first aim is realised through deliberate and extensive practice on more difficult, authentic, and formal materials drawn from different domains and settings, such as international organisations, national institutions and diplomacy, and the private market. As students acquire more knowledge of these domains and settings, the processing efficiency of working memory will increase and the demands on working memory will lessen. Students can thereby allocate more attention to listening to and comprehending the new information provided by the speaker, shifting mindsets between different tasks, and monitoring their own production.
Although self-monitoring is essential throughout interpreter training, it is particularly emphasised in this phase, when students’ attention can be freed up. Effective self-monitoring requires students to ensure the accuracy of their target output while concurrently listening to and comprehending the source speech. In this process, the act of producing speech in the target language introduces interference which students must control. As discussed in Section 3.2.1, this interference is mainly controlled through frequent shifts of attention between two language systems. Therefore, it is reasonable to anticipate that training in this phase will enhance students’ abilities in both new information processing and target speech interference control, thereby contributing to an improvement in updating and shifting skills.
4.3.5 Phase 5: reality—updating and shifting enhancement
The last phase of SI training, also called the “Last Mile” (Setton & Dawrant, 2016, p. 264), aims to prepare students for real SI work. This phase includes three major components: (1) user-orientation, the ability to deliver a precise and polished product which is easy for listeners to follow; (2) advanced on-line skills for more challenging input; and (3) survival skills for coping with extreme conditions and unrealistic demands.
To produce a user-friendly interpretation, students should monitor their own output more attentively, paying attention to the communicative situation, responding to surprises, and weighing risks against benefits. To complete the skillset necessary for professional practice, students will be asked to practise tasks common in everyday interpreting, such as SI-with-text and relay SI, during the first half of this training phase. In the second half, they will develop the ability to cope with extreme conditions, for example, by practising tasks which seem impossible to complete. Together, these training exercises are aimed at polishing and optimising all the SI mode-specific skills by using authentic or near-authentic materials which are considerably more difficult than the exercises in the previous phases. With the increasing cognitive demands that result from the greater difficulty of the exercises in this phase, EF skills could be expected to undergo further development.
4.3.6 Summary
In summary, it is assumed that EF skills could gradually develop along with the SI mode-specific skills across all the training phases proposed by Setton and Dawrant (2016a; see Figure 1). This assumption is based on the general hypothesis that intensive experience of a specific profession is likely to bring changes to an individual’s cognitive skills and specifically alter their EF skills (Hertzog, 2009). After intensive practice in deep processing of the source speech, and an introduction to mind shifting in the first phase, students’ domain-general updating skills are expected to be enhanced while the shifting skill is expected to be initiated. In the second phase, with a training objective focused on enhancing students’ coordination skills, students are expected to be better at coordinating subtasks embedded in the SI process, including analytical listening, memorisation, reformulation and production, the experience of which would further transfer into better task-shifting skills. Through exercises in the third phase, students are expected to be more flexible and adaptable in their rendition in the target language, finding creative solutions to keep up with the speaker, adjust their EVS, and maintain fluency, changes which have good potential to transfer into better domain-general shifting skills. Through the fourth phase, with the increasing difficulty of the practice materials and special emphasis on the monitoring of target speech, updating and shifting skills are further enhanced. In the last phase, additional difficulties are incorporated into students’ daily exercises which allow them to gain insight into the demands of professional practice. In this phase, students need not only to polish the skills acquired in the first four phases, but also to be exposed to more surprises, challenges, and even extreme conditions, which would result in the further strengthening of all the domain-general EF skills.

Figure 1. Development of executive functions in simultaneous interpreting training.
Although no research to date has directly examined the specific impact of each SI training phase on EFs, longitudinal studies have provided preliminary evidence supporting the theoretical hypothesis proposed in this article. For instance, improvements in the shifting and updating components of EFs have been observed following a period of interpreting training (e.g., Dong & Liu, 2016), offering a general empirical foundation.
5. Conclusion
This article proposes a theoretical map to account for the gradual development of two EF components, namely updating and shifting, in an SI training process based on three elements:
(1) The theoretical frameworks of EFs in the field of psychology, which allow us to understand what the concept of EFs is and what specific skills it comprises.
(2) The empirical and theoretical evidence on the bilingual advantage in EFs. Given the close relationship between SI and bilingualism, this evidence allows us to examine the working mechanisms of EFs in the SI process.
(3) A five-phase SI training framework based on Setton and Dawrant (2016a), which explains training objectives and methods in detail and enables us to link each of the EF components to the specific skills of SI, and to postulate a trajectory for the process of EF development.
Although theoretically promising, the hypothesis proposed here has yet to be empirically validated. To validate it, future research should address the following aspects: (1) establishing the links between EF task performance and SI performance at various stages of training, as well as upon completion of the entire SI training programme; (2) conducting repeated assessments of EF task performance and SI performance after each training stage to identify the specific phases in which significant EF development occurs and to determine whether enhanced EFs contribute to improved SI performance; and (3) examining the interrelationships between the SI training exercises and EF performance, such as the predictive value of different SI training exercises for EF task performance, with SI performance considered as a potential mediator.
Serving as the theoretical foundation for a broader longitudinal programme, the present research sets the stage for a series of empirical investigations that examine the dynamic interplay between EFs and SI training. These studies will track how baseline EFs predict SI performance prior to and after training, how EFs develop over the course of 1 year of intensive SI training, whether gains in SI proficiency are mirrored by measurable changes in EF performance, as well as whether EFs after training predict SI performance after training. Beyond establishing causal links, this design also allows for correlational analyses that capture the degree of reciprocity between EFs and the SI acquisition process. Such empirical investigations would provide insights into which SI training exercises are most effective for enhancing specific components of EFs, thereby informing evidence-based curriculum design for SI training programmes.
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Ethical approval and informed consent statements
This research was approved by The Human Sciences Subcommittee, Macquarie University (Approval No.: 520231283046577). All procedures were conducted in accordance with the requirements set out in the National Statement on Ethical Conduct in Human Research 2007 (updated July 2018). Informed consent was obtained from all participants prior to their participation in the study. Participants were informed about the purpose of the research, their right to withdraw at any time, and the measures taken to ensure confidentiality and anonymity.
