Abstract
Research on cognitive processing and performance optimization under time pressure in sight translation (ST) remains limited, despite ST’s growing significance in professional interpreting practice. Traditional ST studies using static texts fail to reflect the real-world challenges interpreters face. This study introduces a novel approach to ST research by examining the effects of dynamic text presentation rates on interpreting performance. The primary aim was to investigate how controlled time pressure affects interpreter cognition and performance quality through accuracy and fluency measures. Using Microsoft PowerPoint for text presentation and BB Flashback Pro for screen and audio recording, 18 master’s students performed English-to-Chinese ST tasks at three different presentation rates (90, 120, and 150 words per minute). Performance was evaluated through standardized assessment rubrics (0–100%) for accuracy and acoustic measurements for fluency patterns. The analysis yielded two main findings: (1) increased presentation rates enhanced performance by triggering more efficient cognitive processing and adaptive coping tactics, challenging conventional assumptions about time pressure effects; and (2) controlled time constraints improved information selection and processing efficiency. These findings advance our understanding of interpreter cognition and coping behaviors under time pressure and provide empirical support for incorporating dynamic text presentation in interpreter training programs. The study’s implications extend beyond pedagogy to professional practice, offering insights into optimizing interpreter performance under real-world time constraints.
Plain Language Summary
This study looked at how different speeds of English text presentation affect sight translation performance in Chinese student interpreters. Eighteen master’s students completed sight translation tasks with texts shown at speeds of 90, 120, and 150 words per minute. The researchers measured accuracy and fluency, and also gathered feedback through interviews and a survey. Surprisingly, faster text speeds improved accuracy and fluency by limiting constant access to the text and encouraging students to focus on key information. Participants used various strategies like amplification, conversion, repetition, and generalization, with their effectiveness varying based on each student’s skill level and prior experience. The findings highlight a complex link between text speed, cognitive load, and interpreter performance, suggesting that interpreter training should be more adaptable. These results are important for improving training methods and developing practical skills in sight translation.
Keywords
Introduction
In an increasingly globalized and digitalized world, sight translation (ST) has become a fundamental skill for professional interpreters. Its applications extend beyond traditional conference, legal, and medical settings to emerging fields such as live streaming events, virtual conferences, remote interpreting platforms, and real-time subtitle translation, where immediate oral rendering of written texts is frequently required under varying temporal constraints. ST, defined as the oral rendering of a written text from one language into another (Agrifoglio, 2004; Lambert, 2004; Setton & Motta, 2007), demands complex cognitive processing as interpreters simultaneously read, comprehend, and produce oral output in real-time.
Research on ST has evolved significantly over the past two decades, with studies employing various methodological approaches to understand its underlying processes. Eye-tracking studies have provided valuable insights into visual attention patterns and cognitive processing during ST (Chmiel et al., 2020; Ma, 2021; Seeber et al., 2020; Su, 2020; Su & Li, 2020; Zou et al., 2023). Other researchers have investigated strategic behaviors (Agrifoglio, 2004; Li, 2014), quality assessment (Liu & Xu, 2017), and pedagogical approaches (Ho et al., 2020). However, these studies share a common limitation: they predominantly examine ST under static text conditions, where interpreters have unrestricted access to complete source text.
This methodological approach, while valuable, fails to reflect the dynamic nature of many real-world interpreting scenarios. With the rapid advancement of digital technology, professional interpreters increasingly encounter diverse text presentation formats, including digital displays with scrolling text, teleprompters, AI-generated real-time transcripts, and time-constrained document reading. The cognitive demands and performance implications of these dynamic presentation conditions remain largely unexplored. While studies have examined time pressure effects in translation (Weng et al., 2022), consecutive interpreting (Zou & Guo, 2024), and particularly in simultaneous interpreting (Barranco-Droege, 2015; Gumul & Łyda, 2007; Lijewska et al., 2022; Ma, 2017), their findings cannot be directly applied to ST due to its unique visual-verbal processing requirements.
Furthermore, existing research has not adequately addressed how interpreters adapt their strategies when faced with varying input rates—a critical skill in professional practice. Understanding these adaptations is essential for developing more effective training approaches and preparing interpreters for real-world challenges. Despite the increasing emphasis on ST in interpreter training programs worldwide and its growing importance in professional settings, empirical research on dynamic processing remains limited, particularly regarding the cognitive adaptations and strategic behaviors required under different temporal constraints.
The present study addresses these research gaps by examining how controlled text presentation rates affect ST performance among student interpreters. Through systematic investigation of performance under dynamic text presentation conditions, this study aims to answer the following research question: How do varying text presentation rates affect ST performance, specifically concerning (1) translation accuracy, (2) output fluency, and (3) strategic adaptations employed by student interpreters?
Literature Review
Defining Sight Translation and Its Cognitive Demands
Sight translation (ST) occupies a distinctive position within the spectrum of language mediation activities, conceptualized as the oral rendering of written text from one language into another (Agrifoglio, 2004; Lambert, 2004; Setton & Motta, 2007). It functions both as an essential skill in its own right and as a preparatory exercise for simultaneous interpreting with text. While ST shares the immediacy of output characteristic of interpreting, it diverges fundamentally in the persistent availability of the source text and the modality discordance between written input and oral output (Chmiel et al., 2020; Lee, 2012; Ma, 2021).
The cognitive architecture underpinning ST is intricate, bridging the domains of written translation and oral interpreting (Agrifoglio, 2004; Lambert, 2004; Liu & Xu, 2017; Setton & Motta, 2007; Zou & Chen, 2023). This modality attenuates auditory processing demands while isolating rate effects on linguo-cognitive processes (Lee, 2012; Ma & Li, 2021). Notwithstanding its deviation from mainstream interpreting modes, ST remains firmly situated within the realm of interpreter-mediated transfer, sharing core cognitive processes with other forms of interpreting (Agrifoglio, 2004; Dragsted & Hansen, 2009; Gile, 2004; Li, 2014; Zou & Chen, 2023). Having established the unique position and cognitive demands of ST, it is crucial to examine the specific cognitive processes and strategies employed by interpreters in this modality.
Cognitive Processes and Strategies in Sight Translation
The cognitive processes inherent in ST are multifaceted and interconnected. The presence of static text significantly influences ST quality, with strategies such as preparation, prediction, and reading proficiency playing pivotal roles (Deng, 2017; Li, 2014; Su & Li, 2021; Yang et al., 2020). However, the sustained textual access unique to ST imposes considerable strain on eye-voice coordination faculties. Researchers elucidate that the persistent visual presence of the source text engenders a distinctive challenge, impeding target language production and complicating reading-speaking synchronization in ways not observed in other forms of interpreting (Nilsen & Monsrud, 2015; Seeber et al., 2020; Shreve et al., 2010; Song, 2010; Zou & Chen, 2023).
The cognitive load in ST is further exacerbated by the necessity for interpreters to selectively filter input, mitigating interference while coordinating analysis and output (Agrifoglio, 2004; Ferreira et al., 2020; Zhou et al., 2021; Zhu & Aryadoust, 2022). In contrast to simultaneous interpreting, where the ephemeral nature of aural input necessitates immediate processing, the pre-established structure of written text in ST resists dynamic meaning-based reorganization. This phenomenon imposes greater deverbalization demands on the interpreter (Chmiel & Mazur, 2013; Lambert, 2004; Setton & Dawrant, 2016; Yan & Song, 2022). Consequently, there exists an elevated risk of interpreters adhering excessively to surface-level features, potentially neglecting underlying semantics and resulting in translation errors, omissions, and delayed output.
Strategic behaviors in ST have been extensively documented in empirical studies, revealing a complex interplay of cognitive and linguistic processes. To manage the simultaneous demands of visual input processing, oral output generation, and cross-linguistic conversion, translators employ various strategies including text segmentation, advance reading, syntactic restructuring, and selective attention (Agrifoglio, 2004; Chernovaty et al., 2023; Hussein & Najim, 2023; Li, 2014; Shreve et al., 2010; Yan & Song, 2021, 2022). These strategic behaviors manifest differently between expert and novice practitioners: experienced sight translators demonstrate sophisticated approaches such as parallel text processing and proactive reformulation (Chmiel, 2015; He & Wang, 2021; Korpal & Stachowiak-Szymczak, 2018; Touil & Benameur, 2024), while novice translators typically rely on more basic strategies like linear reading and word-level processing (Dragsted & Hansen, 2009; Fang & Wang, 2022; Fauzia, 2022; Jakobsen & Jensen, 2008). The selection and implementation of these strategies are influenced by both internal factors such as translator expertise and cognitive load management (Lee, 2012; Seeber, 2015), and external factors including text complexity, time pressure, and working environment conditions (Chmiel et al., 2020; Fang et al., 2022; Ma, 2021). Understanding these influences is therefore crucial for developing effective training approaches and improving performance.
Input Rate Effects on Interpreting Performance
A critical challenge in ST lies in managing temporal constraints, particularly in the transition from training to professional practice. While interpreters can theoretically control their reading pace during training, real-world scenarios often impose external temporal demands through various constraints, such as speaker coordination in simultaneous-with-text interpreting or time pressure in conference settings. This discrepancy between self-paced training and externally-paced professional practice can lead to a significant misalignment between cultivated cognitive routines and the high-pressure demands of real-time interpreting.
Research on input rate effects in interpreting has yielded diverse and sometimes contradictory results, reflecting the complexity of the issue. Gerver’s (1969/2002) seminal study found decreased accuracy and increased omissions as speeds increased from 95 to 164 words per minute (wpm), suggesting a clear negative correlation between input rate and performance quality. This finding was later supported by Barghout et al. (2015), who observed that interpreters tend to produce more omissions when facing fast speeches, though these omissions were often strategic, primarily affecting redundant information. Similarly, Pio (2003) conducted experiments at 108 and 145 wpm, analyzing the strategies and outputs of both student and professional interpreters. The results revealed reduced fluency at faster speeds, particularly among students, highlighting the role of experience in managing increased input rates. This experience factor was further confirmed by Korpal and Stachowiak-Szymczak (2019), who found that professional interpreters consistently produced more accurate interpretations regardless of source text speed.
However, the relationship between input rate and interpreting quality is not uniformly linear. Shlesinger (2003) noted that moderate increases in delivery rate (from 120 to 140 wpm) occasionally led to improved performance, particularly in the rendition of long left-branching noun phrases from English to Hebrew. More intriguingly, Vančura (2013) found no positive correlation between source text speech rate and overall interpretation quality in English-Croatian interpretation, suggesting that other factors may play equally important roles in determining interpretation success.
Recent studies have provided additional insights into specific aspects of high-speed interpreting. Plevoets and Defrancq (2016) identified that fast source speech rates, combined with high lexical density, significantly influence the occurrence of filled pauses in simultaneous interpretation. Dose (2020), examining European Parliament’s plenary debates, confirmed that interpreters’ use of omissions increases with rising source speech delivery rates.
The challenges posed by varying input rates in interpreting have led researchers and educators to explore innovative training methods. Arum (2022) suggests that interpreter training institutions should pay particular attention to the learning and acquisition of coping tactics in their courses, especially for challenging input rates. This recommendation is particularly relevant given the evidence that professional experience significantly improves interpreters’ ability to handle faster speech rates (Korpal & Stachowiak-Szymczak, 2019; Pio, 2003).
Dynamic Text Presentation in Sight Translation
The incorporation of dynamic text presentation within ST training represents a significant evolution in interpreter education, addressing the limitations of traditional pedagogical approaches in preparing interpreters for the complex cognitive demands of professional practice. This innovative methodology aligns more closely with the demonstrated cognitive processes of simultaneous interpreting and responds to the multifaceted competencies required in contemporary interpreting contexts (Gile, 2009; Setton & Dawrant, 2016; Zou & Zhang, 2023).
Conventional ST methods, characterized by static text presentation, while foundational, often fail to adequately simulate the temporal pressures and cognitive load experienced in real-time interpreting scenarios (Li, 2014; Shreve et al., 2010; Song, 2010; Zou & Chen, 2023). The integration of dynamic text presentation addresses this shortcoming by introducing a temporal dimension that more accurately reflects the cognitive demands of professional interpreting. This approach necessitates the development of critical skills, including optimizing eye-voice span through regulated input control based on task demands and processing stages (Agrifoglio, 2004; Chmiel & Lijewska, 2023; Zhang et al., 2023), enhancing effort balancing across multiple cognitive tasks (Chmiel & Lijewska, 2023; Gile, 2009; Havnen, 2019, 2021; Ho et al., 2020), facilitating rapid adaptation to text-induced interference (Chernovaty et al., 2023; Setton & Dawrant, 2016), and improving anticipation and prediction skills (Chmiel & Lijewska, 2019; Yan & Song, 2022; Zou & Chen, 2023).
The implementation of dynamic text presentation in ST training has been facilitated by technological advancements in computer-assisted interpreter training (CAIT) tools, which offer features that allow for controlled text reveal rates, simulating the pace of spoken discourse (Kalina, 2000; Sandrelli & de Manuel Jerez, 2007; Song, 2010; Zou & Chen, 2023). These innovations enable a more nuanced and tailored approach to interpreter training, allowing for gradual increases in difficulty and speed of text presentation.
The integration of dynamic text presentation in ST training represents a paradigm shift in interpreter education, with far-reaching implications for curriculum design (Sawyer, 2004), assessment methodologies (Angelelli & Jacobson, 2009), and continuing professional development (Kurz, 2003; Moser-Mercer, 2008; Seeber, 2011; Timarová et al., 2014). This approach, by authentically replicating professional interpreting demands, bridges the gap between traditional ST exercises and simultaneous interpreting practice, thereby enhancing interpreting competencies across various domains. While the benefits are evident, the methodology also underscores the importance of balancing input processing and output production during ST training.
Research Gaps and Future Directions
While existing research provides valuable insights into the cognitive processes and challenges of ST, there remains a significant need for studies that specifically address the impact of dynamic text presentation on ST performance. As demonstrated in this review, three critical gaps emerge in the current literature.
First, a methodological gap exists in the systematic investigation of text presentation rates and ST performance metrics. While studies have examined input rates in simultaneous interpreting (Gerver, 1969/2002; Pio, 2003), similar controlled studies in ST contexts are notably lacking. This gap is particularly significant given the increasing use of dynamic text presentation in professional settings.
Second, current theoretical frameworks inadequately account for the temporal dynamics of ST processing. Existing models have not fully explored how interpreters adapt to varying presentation rates, how cognitive load fluctuates under different temporal constraints, or how expertise mediates these processes. This theoretical void limits our understanding of the underlying mechanisms in ST performance.
Third, despite ST’s widespread use in interpreter training, there is insufficient empirical evidence to guide pedagogical practices, particularly regarding optimal presentation rates for different proficiency levels and progressive rate adjustment strategies in training programs.
Based on these identified gaps and the theoretical framework discussed above, this study aims to address the following research questions:
RQ1: How does the dynamic text presentation rate affect the accuracy of student interpreters’ sight translation output? What factors contribute to changes in accuracy across different presentation rates?
RQ2: What is the impact of varying text presentation rates on the fluency of student interpreters’ sight translation output? What underlying factors influence fluency changes as presentation rates increase?
RQ3: What strategies do student interpreters employ to manage different text presentation rates during sight translation? How do these strategies correlate with the accuracy and fluency of their output at different presentation rates?
To systematically investigate these research questions, the following hypotheses are proposed:
H1a: Lower dynamic presentation rates will lead to higher accuracy in ST performance
H1b: Higher dynamic presentation rates will result in increased omissions and errors
H2a: Higher presentation rates will lead to increased pause frequency and duration
H2b: Higher presentation rates will result in decreased speech fluency measures
H3a: Student interpreters will employ more diverse coping strategies as presentation rates increase
H3b: The effectiveness of coping strategies will correlate with performance outcomes
These hypotheses address the identified gaps by (1) examining the relationship between presentation rates and accuracy; (2) investigating the impact of presentation rates on output fluency; and (3) analyzing the strategic adaptations of student interpreters under different temporal constraints.
Through testing these hypotheses, this study aims to contribute to both theoretical understanding and practical applications in ST training and practice. The findings could inform the development of more effective training methodologies and support tools for interpreters, ultimately enhancing professional practice in an increasingly dynamic interpreting landscape.
Methodology
Research Design
This study employs a mixed-methods approach to investigate the impact of varying text presentation rates on English-to-Chinese ST performance among student interpreters. Based on Gerver’s (1969/2002) finding that 100 to 120 wpm is optimal for English speeches in simultaneous interpreting, three presentation rates were selected: 90 wpm (below optimal), 120 wpm (optimal), and 150 wpm (above optimal). This range allows for examination of performance across different cognitive load conditions. The experimental design combines four elements: (1) quantitative measures of accuracy and fluency at each presentation rate; (2) qualitative analysis of interview data and recall protocols to examine performance differences and coping strategies; (3) a within-subjects design to control for individual differences; and (4) professional-like conditions and tools to support ecological validity. The following sections detail the research instruments and experimental procedures developed to implement this design.
Research Instruments and Materials
The experimental text was derived from a 14-min TED talk, condensed into 1,333 words across three segments of similar word counts. The text selection and presentation were designed to align with professional working conditions, enhancing experimental validity (Albright & Malloy, 2000; Kenny, 2019). The text complexity was quantified through several measures, including a lexical density of 0.56 indicating a high proportion of content words, a type-token ratio (TTR) of 0.42 suggesting a diverse vocabulary range, and 18% of words falling outside the 3,000 most frequent word families. The text maintained an average sentence length of 22.5 words (range: 12–35 words), with 2.3 clauses per sentence on average (40% containing three or more clauses). The Flesch (1948) readability score of 61 indicated a standard difficulty level appropriate for testing interpreter skills.
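The complexity measures reported above follow standard definitions, which can be sketched as below. This is a minimal illustration rather than the tooling used in the study: the function names, the simple regex tokenizer, and the sample sentence are assumptions, and the content-word count would have to come from a part-of-speech tagger that is not shown. The Flesch formula is the standard Reading Ease equation (206.835 minus 1.015 times words per sentence minus 84.6 times syllables per word).

```python
import re


def tokenize(text):
    """Lowercase word tokens from a simple regex (an assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Distinct word forms (types) divided by running words (tokens)."""
    return len(set(tokens)) / len(tokens)


def lexical_density(content_word_count, total_word_count):
    """Share of content words (nouns, verbs, adjectives, adverbs) in the text."""
    return content_word_count / total_word_count


def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease computed from raw counts."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Illustrative sentence only; it is not taken from the experimental text.
tokens = tokenize("The cat sat on the mat and the dog sat too")
print(round(type_token_ratio(tokens), 2))  # 8 types / 11 tokens -> 0.73
```

Because TTR tends to fall as a text gets longer, values are only comparable across texts of similar length, which is one reason to report the segment word counts alongside it.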
The text presentation system utilized Microsoft PowerPoint with timed slide transitions to achieve precise control over presentation rates. The setup included standardized 18pt font, triple line spacing, and block margins to ensure consistent visual presentation across all conditions. The presentation rates of 90, 120, and 150 wpm were calculated based on the total number of words in each segment divided by the duration of its display. Timed transitions enabled gradual speed changes to isolate effects on cognitive load. This configuration precisely regulated scroll speeds across conditions, while PowerPoint’s practicality and widespread professional use strengthened ecological validity. Figure 1 illustrates the standardized interface design with automated text scrolling at pre-set rates.

Figure 1. Dynamic text presentation interface showing standardized text format.
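The rate calculation described for the PowerPoint setup (words in a segment divided by display duration) can be sketched as follows. The helper names and the per-slide word counts are hypothetical, and the sketch assumes each slide is given its own automatic advance time; the source does not report how many words appeared on each slide.

```python
def presentation_rate_wpm(word_count, display_seconds):
    """Presentation rate in words per minute for one segment."""
    return word_count / (display_seconds / 60)


def slide_duration_seconds(words_on_slide, target_wpm):
    """Seconds a slide must stay on screen to hit the target rate."""
    return words_on_slide * 60 / target_wpm


# Hypothetical word counts for three consecutive slides (not from the study).
HYPOTHETICAL_SLIDE_WORDS = [30, 45, 24]

for wpm in (90, 120, 150):
    durations = [slide_duration_seconds(n, wpm) for n in HYPOTHETICAL_SLIDE_WORDS]
    print(wpm, durations)
# 90 [20.0, 30.0, 16.0]
# 120 [15.0, 22.5, 12.0]
# 150 [12.0, 18.0, 9.6]
```

Setting the advance time per slide from its word count, rather than using one fixed interval, keeps the effective rate constant even when slides carry different amounts of text.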
BB Flashback Pro software served as the primary recording tool, simultaneously capturing PC screen activity, webcam footage, and audio output. The software’s waveform charts with 0.33-s interval rulers facilitated quantitative assessment of speech flow by tabulating pause frequency and duration. This comprehensive recording system enabled integrated analysis of the source text display and speech output, while archived videos supported stimulated recall interviews for collecting qualitative data on cognitive processes.
This comprehensive set of research instruments—from carefully calibrated source texts to precise presentation controls and multi-modal recording capabilities—provides the necessary tools to systematically investigate ST performance across different presentation rates while maintaining professional working conditions.
Data Processing and Analysis
The collected data was processed and analyzed through both quantitative and qualitative methods to systematically investigate the three research questions concerning accuracy, fluency, and strategic adaptations in ST under different presentation rates:
1. Impact of dynamic text presentation rates on ST accuracy:
Performance accuracy was evaluated using a standardized scoring rubric (0–100%) by two certified raters. Inter-rater reliability was established through correlation analysis of the scores.
2. Fluency variations in ST output:
Fluency analysis was conducted using acoustic analysis to measure pause patterns. Pauses were categorized as short (0.3–1 s), medium (1–3 s), or long (>3 s). The frequency and distribution of these pauses were then statistically analyzed.
3. Strategic adaptations of student interpreters:
Strategic behavior analysis was conducted through systematic examination of recorded outputs and post-task interviews to identify the coping strategies employed by student interpreters when dealing with dynamic text presentation.
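The pause categories used in the fluency analysis (short 0.3–1 s, medium 1–3 s, long >3 s) can be expressed as a small classifier. This is a sketch under stated assumptions: the function names are hypothetical, the sample durations are invented, silences under 0.3 s are treated as not being pauses, and a pause of exactly 1 s is assigned to the short category because the source does not say which side of the boundary it belongs to.

```python
def classify_pause(duration_s):
    """Return 'short', 'medium', 'long', or None for sub-threshold silences."""
    if duration_s < 0.3:
        return None          # below the 0.3 s threshold: not counted as a pause
    if duration_s <= 1.0:
        return "short"       # assumption: exactly 1 s counts as short
    if duration_s <= 3.0:
        return "medium"      # the source defines long as strictly > 3 s
    return "long"


def pause_profile(durations):
    """Count pauses per category for one recording."""
    counts = {"short": 0, "medium": 0, "long": 0}
    for d in durations:
        label = classify_pause(d)
        if label is not None:
            counts[label] += 1
    return counts


# Invented silence durations in seconds, for illustration only.
print(pause_profile([0.2, 0.5, 1.2, 4.0, 0.9]))
# {'short': 2, 'medium': 1, 'long': 1}
```

Running the same function over each participant's recording at 90, 120, and 150 wpm would give the per-rate frequency tables that the statistical comparison needs.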
Experiment
The experimental phase implemented the research design and utilized the instruments described above through a systematic process of pilot testing and main study execution, as detailed below.
Prior to the main experiment, a pilot study with eight participants (five females, three males, age range 24–29) was conducted to validate the experimental design and confirm appropriate presentation rates based on Gerver’s (1969/2002) findings on optimal speech rates. Participant feedback confirmed that 90 wpm allowed comfortable processing with minimal cognitive strain, 120 wpm represented a challenging but manageable pace aligned with professional conditions, and 150 wpm induced significant cognitive pressure while remaining within feasible limits. The experimental environment was controlled, with participants completing tasks in a laboratory setting. The pilot study validated both the selected presentation rates and experimental procedures.
The study recruited 18 Chinese-speaking students (12 females, 6 males; age range: 23–28 years,
Multiple screening criteria ensured sample homogeneity and competence. Participants demonstrated advanced English proficiency through the Oxford Quick Placement Test (score range: 55–58/60,
The experiment proceeded in three phases: preparation, familiarization, and main task. Initially, participants completed a comprehensive survey on their interpreting experience, focusing on both static and dynamic ST practices, including task frequency, speed familiarity, and self-assessed comfort levels with dynamic ST.
The familiarization phase (10 min) comprised three components: task explanation (2 min), interface demonstration (3 min), and practice with sample texts at various presentation rates (5 min). This structured approach ensured participants understood the task requirements and became comfortable with the dynamic text presentation format.
For the main task, the experimental text was presented sequentially at three speeds: 90 wpm (4 min 40 s), 120 wpm (4 min 10 s), and 150 wpm (3 min 10 s), with 40-s intervals between segments to minimize fatigue effects. Following the ST task, participants engaged in stimulated recall sessions (approximately 10 min) and semi-structured post-task interviews (approximately 5 min) exploring their experiences with dynamic versus static text presentation and coping strategies. Data collection included audio recordings, video captures, and interview responses, with all materials processed through BB Flashback Pro for subsequent quantitative and qualitative analyses.
Results and Discussion
Impact of Dynamic Text Presentation Rates on Sight Translation Accuracy
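This section addresses the first research question (RQ1): How does the dynamic text presentation rate affect the accuracy of student interpreters’ sight translation output? What factors contribute to changes in accuracy across different presentation rates?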
The analysis of accuracy scores across different presentation rates revealed clear patterns of performance variation. At 90 wpm, interpreters demonstrated relatively lower accuracy (
Individual performance patterns revealed variations in how interpreters adapted to different presentation rates. Analyzing the complete trajectory across all three speeds, three distinct response patterns emerged: continuous improvement (
The line graph (Figure 2) visually represents these performance patterns, highlighting both the general trend of improvement from 90 wpm to higher speeds and the individual variations in optimal presentation rates. The graph plots accuracy scores (

Figure 2. Accuracy of student interpreters in ST.
Building upon the performance patterns observed in the accuracy scores, this section examines the underlying mechanisms and theoretical implications of these findings. The analysis revealed two key factors that influence accuracy: continuous input, which reduces the cognitive load of reading and memorization, and faster text presentation, which accelerates coordination and output.
First, dynamic input helps student interpreters to minimize the cognitive load from the static source text during reading, thereby improving accuracy (Chmiel & Mazur, 2013; Läubli et al., 2020; Shemy, 2022; Yan & Song, 2022). The statements from the interviews indicate that interpreters tend to skim more content at a slower pace, which is not conducive to information processing. A faster text presentation allows interpreters to logically segment, subdivide and advance the translation as they scroll through the text, which fulfills the requirements of immediacy and synchrony in ST. These findings suggest that both text presentation mode and speed are critical factors in determining interpreters’ cognitive load and task performance.
When it’s slow, it’s pretty much like regular sight translation. I can read ahead but can’t really change much ’cause it keeps moving up. When it’s fast, it feels more like interpreting. I just keep translating as I see stuff, without waiting too long. It’s kinda stressful, but I actually find it easier on my brain. Weird, right? (Interpreter 5)
As speed increased, my energy allocation changed. At slower speeds, I could read, memorize, translate, and self-monitor. But at higher speeds, especially 150 wpm, it became mechanical—just reading and translating. I focused less on memorizing and self-monitoring. It turned into a quick transformation process, prioritizing quantity over quality. (Interpreter 11)
You know, faster text is actually better for me. It helps me break down the sentences as I go. With static text, I tend to read a bunch first, then translate. That’s not really how you’d do it in real simultaneous interpreting with a text. (Interpreter 13)
In addition to the general variations in accuracy, individual differences were also observed. A repeated measures ANOVA revealed significant individual variation in response to speed increases (
I did best at the second speed, balancing speech rate and self-monitoring. The third speed pushed me to think quickly and speak more. By the last speed, my eye-voice coordination improved while maintaining accuracy. (Interpreter 9)
At faster speeds, I couldn’t keep up with reading, understanding, and translating. I found myself guessing meanings and producing output blindly, especially for difficult parts. While I might have said more, it was less accurate. I even left sentences unfinished without realizing it. (Interpreter 8)
The scrolling screen created urgency. As speed increased, I spoke faster and produced more. It stimulated my thinking and improved coherence. This speed variation helps us adapt to real simultaneous interpreting scenarios. (Interpreter 15)
This individual variation is noteworthy and can be explained by Dynamic Systems Theory (DST; Golenia et al., 2017; Rose & Fischer, 2008; Thelen & Smith, 1994). DST emphasizes that each individual’s system has its own unique developmental trajectory. The observation that some students achieved optimal accuracy at 120 wpm, while others continued to improve at 150 wpm, aligns with this principle.
Firstly, DST views development as a non-linear process, with periods of stability and instability. The varied responses to increased speeds demonstrate this non-linearity. This variability can be explained by the concept of “attractor states” in DST (Hiver, 2015; Thelen & Smith, 1994), where each interpreter’s cognitive system settles into its own optimal state under different conditions. Interpreter 9’s statement illustrates this: “I did best at the second speed, balancing speech rate and self-monitoring.” This suggests they found an attractor state at 120 wpm, but then adapted further: “By the last speed, my eye-voice coordination improved while maintaining accuracy.”
Secondly, DST emphasizes the interconnectedness of subsystems, while Gile’s model (2009) describes interpreting as a balance between different efforts. Interpreter 8’s experience illustrates how pressure on one subsystem affects others: “At faster speeds, I couldn’t keep up with reading, understanding, and translating. I found myself guessing meanings and producing output blindly.” The strain placed on student interpreters by rapid presentation aligns with cognitive load theory (Sweller, 1988) and the Capacity Theory of Language Comprehension (Just & Carpenter, 1992). These theories suggest that there are limits to cognitive processing capacity, which can be exceeded under high-pressure conditions.
In conclusion, these findings demonstrate the complex and adaptive nature of interpreting skills development through the lens of Dynamic Systems Theory. The varied responses to increased presentation speeds, with some interpreters performing optimally at moderate speeds while others continued to improve at higher speeds, exemplify both the non-linear nature of skill development and individual variation in developmental trajectories. These observations have important pedagogical implications: first, interpreter training programs should adopt flexible approaches that accommodate individual differences in optimal learning conditions; second, students should be systematically exposed to varied speeds and conditions to develop the adaptability required for real-world interpreting scenarios. This research thus contributes to our understanding of how interpreting skills evolve and how training can be optimized to support this dynamic developmental process.
Fluency Variations in Sight Translation Output
This section addresses the second research question:
The analysis of data revealed systematic patterns in both overall fluency and specific pause characteristics. As shown in Figure 3, the majority of interpreters exhibited enhanced fluency through decreased pauses as rates increased from 90 to 150 wpm. The line graph displays the total number of pauses (

Total number of pauses across presentation rates.
Further examination of pause characteristics revealed systematic patterns in interpreters’ performance, as illustrated in Figure 4. The graph shows distinct trends across different pause durations. Short pauses (0.3–1 s) increased linearly, peaking at 150 wpm, suggesting increased cognitive processing rather than performance degradation. Medium pauses (1–3 s) peaked at 120 wpm before declining, while long pauses (>3 s) showed consistent reduction across increasing rates. This shift toward shorter pauses while reducing longer ones indicates improved cognitive processing efficiency. To better understand these patterns, qualitative data from post-task interviews was analyzed.

Frequency distribution of different pause types across presentation rates.
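The three pause categories reported above can be made concrete with a small sketch. It is illustrative only and not the study’s actual measurement procedure: the function names, the sample durations, and the handling of values falling exactly on 1 s or 3 s are assumptions, while the duration bands themselves (0.3–1 s, 1–3 s, >3 s) come from the text.

```python
from collections import Counter

# Duration bands taken from the text: short 0.3-1 s, medium 1-3 s, long > 3 s.
# Assumption: a silence of exactly 1 s counts as short and exactly 3 s as medium.
MIN_PAUSE_S = 0.3


def classify_pause(duration_s):
    """Return the band for one silent interval, or None if it is too short to count."""
    if duration_s < MIN_PAUSE_S:
        return None
    if duration_s <= 1.0:
        return "short"
    if duration_s <= 3.0:
        return "medium"
    return "long"


def pause_profile(durations_s):
    """Count the pauses in each band for one recording."""
    counts = Counter()
    for duration in durations_s:
        band = classify_pause(duration)
        if band is not None:
            counts[band] += 1
    # Report every band, including those with zero pauses.
    return {band: counts[band] for band in ("short", "medium", "long")}


# Hypothetical silent-interval durations in seconds, for illustration only.
sample_durations = [0.2, 0.45, 0.9, 1.4, 2.8, 3.6]
profile = pause_profile(sample_durations)
print(profile)
```

Computing one such profile per interpreter and per presentation rate would yield the kind of frequency distribution summarized in Figure 4.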
Post-task interviews revealed two primary factors driving these fluency improvements. First, the dynamic text presentation exerts psychological pressure on the student interpreters, creating a sense of temporal urgency that accelerates coordination and output rate. As the following interview statements illustrate, this pressure heightens students’ awareness of the immediacy of the source text, thereby reducing pauses and improving fluency.
The faster text rate forced me to synchronize my speech, reducing hesitation and filler words. It significantly shortened my reaction time, improving my usually slow pace. (Interpreter 5)
At the highest speed, I spoke faster and more concisely, focusing on delivering the message quickly without pauses or revisions due to the sense of urgency. (Interpreter 7)
Second, higher presentation rates facilitated information processing, allowing interpreters to prioritize information more effectively, as the following interview statements illustrate.
Dynamic text presentation helped me overcome my habit of hesitation and constant revision. It forced me to start translating immediately, preventing information overload and memory issues with long sentences. (Interpreter 15)
Increased speed improved my translation pace, shifting my focus from detailed word-for-word translation to identifying and expressing key information efficiently. (Interpreter 16)
The empirical findings corroborate previous research (Song, 2010; Yan & Song, 2022; Zou et al., 2023) demonstrating that moderate temporal constraints can optimize processing efficiency in ST tasks. Analysis of pause patterns revealed that as presentation rates increased from 90 to 150 wpm, the majority of participants exhibited enhanced delivery patterns, characterized by a significant reduction in overall pause frequency and a systematic shift from extended pauses (>1 s) to brief pauses (0.3–1 s). This enhancement in temporal efficiency appears to be mediated by two primary mechanisms: (1) the cognitive arousal induced by increased temporal pressure, facilitating accelerated speech-text synchronization (Assaneo et al., 2019; Zou et al., 2023), and (2) the development of more sophisticated information triage strategies, enabling more efficient content processing and delivery (Zhou et al., 2021). However, it is crucial to note that this performance trajectory was not uniformly linear across participants, with some exhibiting plateaus or periodic decrements in fluency at specific rates. These findings yield substantial pedagogical implications for ST instruction, suggesting that systematic exposure to incrementally increased presentation rates could facilitate the development of students’ temporal management skills and processing efficiency.
Strategic Adaptations of Student Interpreters
This section addresses the third research question:
Based on the principle of linear processing in ST (Chen et al., 2015; Qin & He, 2009), the four strategies examined here (amplification, conversion, repetition, and generalization) represent different approaches to maintaining meaning while simultaneously processing written input and oral output (Agrifoglio, 2004; Li, 2014). Amplification involves adding contextual information to clarify meaning, conversion focuses on transforming syntactic structure while preserving meaning, repetition refers to strategic reiteration of key elements, and generalization involves summarizing specific details into broader concepts. Specific examples of these strategies are provided in the appendix.
Figure 5 illustrates the distribution of these strategies across the three presentation rates. Several patterns stand out. As presentation rates increased, repetition and generalization showed a clear linear upward trend, while conversion and amplification declined overall, although amplification peaked at 120 wpm before falling. At the lowest rate of 90 wpm, conversion dominated while generalization was rarely used. At 120 wpm, the distribution of strategies became more balanced. Across all rates, conversion remained the most frequently used strategy, while generalization was consistently the least employed.

Frequency of coping tactics across presentation rates.
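As a rough illustration of how strategy frequencies of this kind could be tabulated from coded transcripts, the sketch below tallies strategy labels by presentation rate. The annotation format, the function name, and the sample data are hypothetical and are not drawn from the study; only the four strategy names and the three rates come from the text.

```python
from collections import Counter, defaultdict

# The four coping strategies named in the text.
STRATEGIES = ("amplification", "conversion", "repetition", "generalization")


def tally_strategies(annotations):
    """Turn (rate_wpm, strategy) pairs into {rate: {strategy: count}}."""
    table = defaultdict(Counter)
    for rate_wpm, strategy in annotations:
        if strategy not in STRATEGIES:
            raise ValueError(f"unknown strategy: {strategy}")
        table[rate_wpm][strategy] += 1
    # Fixed strategy order and explicit zero counts keep the rates comparable.
    return {
        rate: {name: table[rate][name] for name in STRATEGIES}
        for rate in sorted(table)
    }


# Hypothetical coded instances, for illustration only.
coded_instances = [
    (90, "conversion"), (90, "conversion"), (90, "amplification"),
    (120, "conversion"), (120, "repetition"),
    (150, "repetition"), (150, "generalization"),
]
tallies = tally_strategies(coded_instances)
for rate, counts in tallies.items():
    print(rate, counts)
```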
As shown in Figure 6, the distribution of accuracy scores demonstrated distinct patterns across different presentation rates. At 90 wpm, a majority of interpreters (65%) scored below 70%, with 26% scoring between 70% and 80% and only 9% achieving scores above 80%. The distribution shifted markedly at 120 wpm, where only 7% scored below 70%, while 60% scored between 70% and 80% and 33% achieved above 80%. At 150 wpm, the trend toward higher scores continued, with merely 5% scoring below 70%, 43% between 70% and 80%, and notably, 52% achieving scores above 80%.

Correlation between strategy usage and accuracy scores.
This progressive improvement in accuracy scores, particularly the substantial increase in high-performing interpreters (>80%) from 9% at 90 wpm to 52% at 150 wpm, appears counterintuitive. The data suggest that slower input rates do not necessarily facilitate better performance. This phenomenon might be explained by examining the interpreters’ coping strategies (Figure 5) and fluency patterns (Figure 3). At higher rates, interpreters demonstrated fewer pauses and adopted different coping strategies, potentially leading to more efficient processing and delivery. However, further investigation would be needed to establish the precise mechanisms behind this performance pattern.
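The three accuracy bands used in this comparison (below 70%, 70% to 80%, above 80%) can be expressed as a simple binning step. The sketch below is a minimal illustration under stated assumptions: the scores are invented, the function name is hypothetical, and assigning a score of exactly 80 to the middle band is an assumption, since the text does not specify how boundary values were treated.

```python
def band_shares(scores):
    """Return the percentage of scores falling in each accuracy band."""
    if not scores:
        raise ValueError("scores must not be empty")
    counts = {"below_70": 0, "70_to_80": 0, "above_80": 0}
    for score in scores:
        if score < 70:
            counts["below_70"] += 1
        elif score <= 80:  # assumption: exactly 80 belongs to the middle band
            counts["70_to_80"] += 1
        else:
            counts["above_80"] += 1
    total = len(scores)
    return {band: round(100 * n / total, 1) for band, n in counts.items()}


# Invented accuracy scores (0-100) for one presentation rate, for illustration only.
hypothetical_scores = [62, 68, 71, 75, 79, 80, 83, 88, 91, 66]
shares = band_shares(hypothetical_scores)
print(shares)
```

Applying the same function to the scores obtained at each presentation rate would allow the band shares to be compared across 90, 120, and 150 wpm.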
The relationship between strategy use and fluency showed systematic patterns across presentation rates, as illustrated in Figure 7. High-fluency interpreters demonstrated dynamic strategy adaptation, with their strategy use frequency increasing proportionally with presentation rates. In contrast, low-fluency interpreters showed an inverse pattern, with declining strategy use as rates increased. This divergence was particularly pronounced at 150 wpm, where high-fluency interpreters employed significantly more strategies than their low-fluency counterparts, suggesting that effective strategy deployment may be crucial for maintaining fluent delivery under temporal pressure.

Correlation between strategy usage and fluency measures.
Post-task interviews conducted following the retrospective protocol analysis method (Ericsson & Simon, 1993; Tang, 2018) provided valuable insights into these patterns. The interview data were analyzed using thematic content analysis (Braun & Clarke, 2006), revealing that interpreters with higher accuracy reported more effective discourse segmentation and strategic implementation, particularly in using amplification for complex terminology. For example, when interpreting “professional jobs,” high-performing interpreters would amplify it as “专业性较高的工作” (jobs with higher professionalism) to better convey the meaning (see Example 1 in Appendix). In contrast, those with lower accuracy reported significant challenges with increased cognitive load at higher rates, which impaired their ability to implement strategies effectively, especially in cases requiring syntactic conversion. They often struggled with complex sentence structures, such as failing to properly convert “technology is finally beginning to encroach on that fundamental human capability” into a more natural Chinese expression “技术最终对人类的基本能力来说，会成为一种侵蚀” (such technology would ultimately become an erosion of basic human capacities; see Example 2 in Appendix).
The findings demonstrate that successful adaptation of strategy use to varying presentation rates is crucial for maintaining both accuracy and fluency in ST. This aligns with Cognitive Load Theory in interpreting studies (Seeber, 2015), suggesting that effective strategy implementation helps manage cognitive resources under increasing temporal pressure. The observed patterns of strategy adaptation reveal several important implications. First, the analysis of pause patterns and delivery fluency shows that high performers demonstrated more sophisticated strategy deployment, particularly in their ability to adjust processing approaches as presentation rates increased. This manifested in their systematic reduction of extended pauses and more efficient text segmentation, suggesting that strategy flexibility is a key differentiator of performance levels, consistent with previous findings on expert-novice differences (He & Wang, 2021; Yan & Song, 2022). Second, the relationship between presentation rates and strategy effectiveness indicates that interpreters need systematic exposure to varying speeds during training. Our findings revealed that participants who maintained performance quality at higher rates (120–150 wpm) typically exhibited more advanced processing strategies, such as efficient parallel processing and anticipatory behavior (Zhou et al., 2021), compared to those who struggled with speed increases. This supports earlier observations about the role of temporal management in ST performance (Song, 2010). Finally, these findings challenge the traditional view that slower presentation rates necessarily lead to better performance. The data showed that moderate time pressure (around 120 wpm) actually facilitated more effective strategy use in many participants, aligning with recent research on cognitive arousal in translation tasks (Assaneo et al., 2019). 
This suggests that controlled temporal constraints might serve as a catalyst for developing more efficient processing patterns, with important implications for training approaches (Chmiel et al., 2020).
Practical Implications
Our findings suggest several implications for interpreter training and professional development. First, training programs could benefit from implementing a progressive rate approach in ST instruction. The results indicate that controlled exposure to increasing presentation rates (from 90 to 150 wpm) may enhance cognitive processing efficiency. This suggests that training modules should consider gradually increasing text presentation rates while monitoring student performance.
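To give a sense of what a progressive rate approach implies in practice, the following sketch converts a text length and a presentation rate into the time the text stays available on screen. The function names and the 300-word example are hypothetical; the only quantities taken from the study are the three rates (90, 120, and 150 wpm), and the calculation is plain arithmetic (words divided by words per minute, times 60).

```python
def display_seconds(word_count, rate_wpm):
    """Seconds needed to present word_count words at rate_wpm words per minute."""
    if word_count < 0 or rate_wpm <= 0:
        raise ValueError("word_count must be >= 0 and rate_wpm must be > 0")
    return word_count * 60.0 / rate_wpm


def progressive_schedule(word_count, rates_wpm=(90, 120, 150)):
    """List (rate, presentation time in seconds) for each step of a rate progression."""
    return [(rate, round(display_seconds(word_count, rate), 1)) for rate in rates_wpm]


# Hypothetical 300-word practice text, for illustration only.
schedule = progressive_schedule(300)
for rate, seconds in schedule:
    print(f"{rate} wpm: {seconds} s")
```

For a 300-word text, moving from 90 to 150 wpm shortens the presentation window from 200 s to 120 s, which indicates how much reading time a trainee gives up at each step of the progression.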
The observed variation in strategy use across different presentation rates points to the value of strategy-focused training. Programs might focus on developing students’ ability to use the four primary coping strategies identified: amplification, conversion, repetition, and generalization. This could be achieved through targeted exercises that allow students to practice these strategies at different speeds.
For practicing interpreters, our findings indicate the potential benefits of regular practice with varying presentation rates. Professionals may want to incorporate practice sessions using tools that can simulate different text presentation conditions. These sessions could focus on maintaining performance quality across speeds and developing cognitive adaptation skills.
Implementation of these suggestions would require careful integration into existing training frameworks. Programs might develop assessment criteria based on rate-specific performance metrics and provide feedback on strategy use. Regular assessment could help track improvement and adjust training approaches as needed.
Limitations and Future Research Directions
Several limitations of the current study need to be acknowledged. The sample size of 18 participants from a single institution limits the generalizability of our findings. While providing controlled conditions, our homogeneous participant pool may not fully represent the diversity of interpreter populations. Our focus on English-Chinese ST also raises questions about the applicability of findings to other language pairs.
The study faced methodological constraints. The text genres used may not fully capture the complexity of professional ST scenarios. Without long-term follow-up data, we cannot make definitive claims about the lasting effects of exposure to different presentation rates.
Future research could explore several directions. Cross-linguistic studies extending this methodology to other language pairs would help establish the generalizability of our findings. Longitudinal studies examining the long-term impacts of rate training could provide insights into skill retention. Research on the interaction between text complexity and presentation rates might further inform training approaches.
Additional areas for investigation include the development of rate-adaptive training tools and the application of findings to diverse interpreter populations. The role of technology in training delivery also warrants further examination.
Conclusion
Our analysis of dynamic text presentation rates in English-to-Chinese sight translation has yielded several significant findings. The study revealed that increasing the presentation rate (from 90 to 120 and 150 wpm) often led to enhanced performance through more efficient cognitive processing and improved information selection. Most notably, we identified four primary coping strategies (amplification, conversion, repetition, and generalization) whose deployment patterns varied systematically with presentation rates. The tendency toward increased use of repetition and generalization at higher rates, coupled with decreased reliance on conversion and amplification, suggests adaptive cognitive responses to temporal pressure.
From a theoretical perspective, this research makes several substantial contributions to the field of interpreting studies. Most notably, it challenges the traditional assumption that increased time pressure necessarily leads to performance degradation. The application of Dynamic Systems Theory has proved particularly valuable in explaining individual variations in optimal performance rates, while our findings extend existing theoretical frameworks by illuminating the relationship between dynamic presentation conditions and effort allocation patterns in sight translation. The empirical evidence supporting the beneficial effects of temporal constraints on cognitive processing efficiency represents a significant advancement in our understanding of interpreter cognition.
The practical implications of our findings are particularly relevant for interpreter training and professional development. Our research supports the integration of dynamic text presentation exercises into sight translation training programs, with emphasis on progressive rate adjustment techniques and individualized approaches based on optimal performance rates. The documented strategy patterns underscore the importance of incorporating explicit training in rate-specific coping tactics, which could substantially reshape current training methodologies.
We acknowledge the study’s limitations, including the relatively small sample size, the focus on a single language pair, and the absence of long-term follow-up data. These constraints point toward promising directions for future research rather than diminishing the significance of our findings. The study provides a solid foundation for developing evidence-based training methods that better reflect contemporary professional demands. The integration of our theoretical insights and practical recommendations has the potential to significantly enhance interpreter training and prepare more resilient professionals for the dynamic challenges of modern interpreting contexts.
Appendix
Acknowledgements
Thanks to the anonymous reviewers and academic editors for their helpful comments.
Ethical Considerations
The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of Dalian University of Foreign Languages (protocol code SATI-2023-0016 on September 20, 2023). Informed consent was obtained from all participants. Participants were adequately informed about the purpose and importance of the study before their involvement, and they provided signed consent at the location where the study was carried out. Permission was also obtained from the corresponding institution to conduct the study. Participants were made aware that the results of this study would be published, and measures were taken to maintain confidentiality and anonymity.
Author Contributions
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by Liaoning Provincial Social Science Fund Project, China.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Permissions to Reproduce Material
Permissions for the reproduction of figures and images sourced from external materials have been duly obtained from the respective copyright holders.
