Abstract
Aims and objectives:
This study investigates whether pitch level and pitch range vary across different speech modes in Turkish L2 users of English. Specifically, it examines how monolingual and mixed language speech modes influence pitch characteristics, addressing cross-linguistic effects of relative L1/L2 activation in bilingual speech production.
Methodology:
The study adopts a quantitative, within-subject pre-experimental design. Participants produced speech in four controlled reading conditions representing different speech modes: fully in English (L2-only), mostly in English with some Turkish (L2–L1 mix), mostly in Turkish with some English (L1–L2 mix), and fully in Turkish (L1-only).
Data and analysis:
Data were collected from 76 Turkish university students. Participants read four scripted texts, and their speech was analysed acoustically using Praat to extract fundamental frequency (F0) measures. Pitch level and pitch range were calculated for each speech mode. Repeated measures analyses of variance (ANOVAs) were conducted separately for female and male speakers, followed by post hoc tests and paired samples t-tests.
Findings:
Results show that the mean pitch level remains stable across all speech modes for both female and male participants. In contrast, pitch range varies significantly depending on speech mode, with wider pitch ranges observed in L1-dominant contexts and narrower ranges in L2-dominant speech. These findings indicate that pitch range is especially sensitive to cross-linguistic speech mode variation.
Originality:
This study provides a systematic investigation of pitch range and level across monolingual and mixed language speech modes in Turkish-English sequential bilinguals, extending research on bilingual prosody to an under-explored language pairing.
Implications:
The findings highlight the role of relative L1/L2 activation in shaping suprasegmental features and suggest that pitch range represents a language-sensitive dimension of L2 speech that may require more explicit engagement.
Limitations:
The study relies on scripted reading tasks and a single participant group, limiting generalisability to spontaneous speech and other bilingual populations.
Introduction and Background
Pitch is an essential component of suprasegmental features of pronunciation and plays a crucial role in communication, fulfilling different functions across languages (Dilley & Brown, 2007; Stanlaw & Khor, 2020) and affecting speakers’ comprehensibility, interactional effectiveness, and even perceived accentedness in formal assessments (Kang et al., 2010; Kang & Johnson, 2018; Lima, 2016). Being the perceptual correlate of the fundamental frequency (F0), pitch is often measured in terms of pitch level (mean F0 or pitch height) and pitch range (the span between the minimum and maximum F0 values) (Ladd, 2008). These two dimensions of pitch are critical for both native and non-native speakers (Dobrego et al., 2023), as they can influence interactional proficiency and perceived liveliness (Hincks, 2005; Hincks & Edlund, 2009). Particularly for L2 learners, using appropriate pitch patterns helps them meet cross-linguistic prosodic expectations (Chen et al., 2004), with features such as pitch variability often being associated with enhanced content delivery and listener engagement (Choi & Kang, 2023; Lascotte & Tarone, 2022). However, appropriate uses of pitch require L2 users to acquire specific phonetic settings suited for language-specific features (Mennen et al., 2010). Many language learners struggle with this, demonstrating a slow and gradual acquisition process (Saito, 2018).
From a typological perspective, English and Turkish differ in how pitch is employed at the suprasegmental level. English is commonly described as an intonation language in which pitch variation plays a central role in marking prominence and forming the discourse structure, often involving wider pitch excursions associated with stress-accent (Ladd, 2008; Stanlaw & Khor, 2020). Turkish, by contrast, tends to be more conservative in terms of pitch variation, where pitch accent is present but less systematically tied to lexical stress (Levi, 2005) and where intonational patterns are shaped mainly by pragmatic and sentence-level factors (Göksel & Kerslake, 2004).
Against this background, an important complexity in pitch use arises from L2 users’ access to multiple linguistic repertoires. Different speech modes, defined by Grosjean (1989) in terms of monolingual (L1-only or L2-only) and bilingual language contexts (L1–L2 or L2–L1), can influence these suprasegmental features, as speakers code-switch between their languages or produce speech in mixed language contexts (Olson, 2016). Although the present study draws on Grosjean’s (1989) distinction between monolingual and bilingual modes, we use the term mixed language speech modes to emphasise the observable linguistic composition of bilingual speech output. These mixed language modes, characterised by the alternation or co-occurrence of L1 and L2 elements within a single speech event, can function as intermediate conditions between monolingual and bilingual speech patterns in educational and real-life communication settings. Considering that mixed language use is common in many language teaching environments (e.g., translanguaging) and serves multiple functional purposes, such as managing interactional dynamics and enhancing content delivery (Jing & Kitis, 2024), it is important to understand how L2 speakers’ speech is affected by cross-linguistic suprasegmental features such as pitch level and pitch range. Despite the prevalence of such modes in bi/multilingual contexts, little research has systematically examined how cross-linguistic pitch level and range manifest across different speech modes in sequential bilinguals’ mixed language speech production.
Previous Studies on Cross-Linguistic Pitch Level and Range
An important point of interest in the literature on cross-linguistic pitch is that speakers have language-specific pitch levels, which may differ from those of other languages (Braun, 1995). However, there is surprisingly little research comparing cross-linguistic pitch levels across languages, particularly regarding differences between L1 and L2 (Mennen et al., 2012). Research on cross-linguistic pitch patterns reveals that L1 speakers typically exhibit a broader pitch range compared to their L2 counterparts (Busà & Urbani, 2011; Schüppert & Heisterkamp, 2021; Zimmerer et al., 2014). For instance, Schüppert and Heisterkamp (2021) found that Dutch instructors displayed greater pitch range and liveliness when teaching in their L1 Dutch than in their L2 English, a finding that largely resonates with studies of other bilingual speakers (Busà & Urbani, 2011; Passoni et al., 2022). Similarly, Busà and Urbani (2011) conducted a preliminary study on pitch range and variation, examining Italian L2 speakers of English in comparison to native American English speakers. The researchers found that Italian speakers exhibited significantly narrower pitch ranges and less pitch variation in their L2 English compared with the L1 English group.
Another important study highlighting how L1-specific phonetic settings may influence L2 prosody was carried out by Passoni et al. (2022). The researchers explored pitch range differences among 41 Japanese-English sequential bilinguals, divided between Tokyo and London to examine the effects of immersion and sociocultural context. The study revealed that participants consistently used narrower pitch ranges in L2 English compared to L1 Japanese, regardless of their location, and that this pattern held for both female and male speakers. In this regard, Zimmerer et al. (2014) also presented corroborating findings by investigating pitch range differences among French and German L2 users of English. The study identified a consistent narrowing of pitch range in L2 speech, which could be attributed to reduced confidence and the cognitive load of L2 speech production. Participants seemingly prioritised lexical and segmental accuracy, limiting their ability to employ native-like pitch patterns in L2 English.
In addition to pitch range, a growing body of voice production research has shown that bilingualism can yield distinct F0 modes. Studies on Spanish-English bilingual and monolingual speakers indicated that bilingual language experience is associated with subtle but systematic differences in acoustic parameters such as F0 distributions and voice quality (Cantor-Cutiva et al., 2021). It was demonstrated that bilingual speakers adjust their vocal effort on F0 in a language-sensitive manner, whereby L2 use is often associated with a brighter voice (Cantor-Cutiva et al., 2026). Importantly, it was further shown that being a bilingual speaker could affect L1 voice production as well, as comparisons between bilingual and monolingual speakers of the same languages revealed systematic differences in F0-related measures during L1 speech, indicating cross-linguistic influence beyond the immediate language of use (Cantor-Cutiva et al., 2023). Collectively, these studies suggest that knowing more than one language may result in distinct F0-related acoustic characteristics, highlighting the need for examining pitch in bilingual speech.
Some studies also identified cross-linguistic differences in L1 pitch use across languages. For instance, Keating and Kuo (2012) found distinct F0 profiles in Mandarin and English L1 speakers, with the former showing higher F0 ranges and means in single-word utterances. However, the researchers found no significant difference in terms of F0 range or pitch level variability when the speakers of these two languages read a prose passage. Another example is provided by Ordin and Mennen (2017), who investigated the pitch use of 30 Welsh-English simultaneous bilinguals and found gender-specific differences in pitch range. Although male speakers displayed some random variation, female speakers exhibited distinct F0 ranges in each language, using a wider pitch range and higher F0 maxima in Welsh than in English. In a cross-language comparison of pitch range between speakers of L1 English (n = 30) and L1 German (n = 30), Mennen et al. (2012) also found that female English speakers exhibited greater pitch range than female German speakers, despite comparable mean F0 levels.
Despite the aforementioned research, the existing literature is not exempt from controversial findings and inconclusive results. Morris (2022), analysing the speech of 32 Welsh-English bilinguals, found no significant difference in terms of cross-linguistic pitch range despite certain effects of speakers’ home language on their F0 patterns. In another cross-language comparison of F0 range and level variation, Ng et al. (2010) reported significantly lower F0 range in Cantonese, but negligible variation in F0 means, having analysed the spontaneous speech produced by 86 Cantonese-English sequential bilinguals. Moreover, Nguyễn (2020) reported that L1 English speakers in their sample demonstrated wider F0 range and pitch level variation than L1 Vietnamese speakers, attributing the observed difference to distinct characteristics of tonal and non-tonal languages.
Previous research shows that speech conditions and modes can influence speakers’ pitch characteristics. For instance, Olson (2016) revealed that bilingual speech modes containing code-switching can affect pitch range, with greater pitch range variability observed in monolingual speech modes. Another study reports that bilingual speakers employ different phonatory patterns even in monolingual speech modes, as Duarte-Borquez et al. (2024) observed that Spanish-English bilinguals systematically adjusted their voice qualities depending on the language they were speaking. Regarding speech materials, phonetic properties of intonation were also shown to vary between scripted and unscripted language contexts, affecting speakers’ control over suprasegmental features, including pitch variability and range (Liu & Tseng, 2019). This complexity was further explored by Ulbrich and Canzi (2023), who found that L2 users tend to modify their intonational patterns and pitch ranges to accommodate their interlocutor’s speech in a context-dependent speech mode that blends features of L1 and L2. In this regard, Graham and Post (2018) argued that the extent to which an L2 user can adapt to a particular pitch pattern is under the influence of L1, as they revealed that L1 Spanish speakers aligned more closely with L2 English pitch patterns than L1 Japanese speakers in their sample.
In other studies, the role of prosody was emphasised in terms of communicative effectiveness and intelligibility. For example, Choi and Kang (2023) highlighted that a wider pitch range correlates with improved ratings in oral assessments, often associated with more effective content delivery in L2 English speech. It was demonstrated that suprasegmental features such as pitch significantly influence comprehensibility in L2-accented speech (Kang et al., 2010). Likewise, English L2 users’ increased pitch range was linked with higher proficiency levels and more successful speaking performance in formal assessments (Yan & Kang, 2018), even as evidenced by the algorithms of automated speaking assessment systems (Kang & Johnson, 2018). Accordingly, it was contended that focused instruction can enhance L2 users’ pitch range and is likely to yield more intelligible and dynamic speech (Lascotte & Tarone, 2022), resulting in improved comprehensibility in L2 English regardless of perceived accentedness (Lima, 2016).
Problem Statement and Research Questions
As Levis (2018) discusses, L1 speakers of any language have implicit assumptions about the phonetic and phonological properties of pitch and how it should function in meaning-making. However, this poses a formidable difficulty for many L2 users (Lee et al., 2020; Saito, 2018), who need to acquire language-specific factors to produce expected levels of pitch range and variability (Chen et al., 2004). In this respect, despite some research on pitch range and level across languages, findings remain rather inconclusive. As exemplified in the previous section, while some studies indicate narrower pitch ranges in L2 compared to L1 (Busà & Urbani, 2011; Passoni et al., 2022; Zimmerer et al., 2014), this is not corroborated in other studies (Keating & Kuo, 2012; Ng et al., 2010). Another important problem in the existing literature is the limited variety of languages examined. To the best of our knowledge, there is a lack of research focusing on Turkish-English sequential bilinguals, particularly across different speech modes such as mixed language contexts.
Considering that the modulation of pitch is a suprasegmental feature that contributes to producing authentic speech (Dobrego et al., 2023), affects the perceived accentedness of L2 speech (Pinget et al., 2014; Ulbrich & Mennen, 2016), and is crucial for intelligibility, liveliness, and communicative effectiveness (Choi & Kang, 2023; Hincks, 2005; Kang et al., 2010; Yan & Kang, 2018; Zellers & Ogden, 2014), it is important to understand L2 users’ pitch characteristics across different speech modes and language contexts. In particular, the role of mixed language modes (e.g., L1-L2 or L2-L1 combinations) in modulating pitch features remains under-explored, despite its growing prominence in multilingual educational settings.
Understanding the prosodic nature of L2 speech in terms of pitch range and level is essential in the context of foreign language learning and teaching, in which L1, L2, or mixed language speech modes are often used flexibly. However, limited research has examined how Turkish L2 users of English modulate their pitch patterns across these speech modes. Accordingly, this study aims to fill these gaps by examining cross-linguistic pitch level and range in Turkish L1, English L2, and mixed language speech modes, contributing to a deeper understanding of bilingual prosody and how sequential bilingual speakers adjust their prosodic features when navigating between their L1 and L2. The main research questions guiding the study are as follows:
Method
Research Design
This study employs a quantitative pre-experimental design to explore the effects of cross-linguistic speech modes on pitch level and range among Turkish L2 users of English. A pre-experimental design is one in which a single group is subjected to a number of different conditions without a control group, making it suitable for the exploratory examination of under-researched phenomena (Phakiti, 2014). In particular, the research procedures follow a one-group, one-shot comparative approach, in which the participants read four different types of texts, each representing a different degree of language mixing between L1 Turkish and L2 English. Accordingly, within-subject comparisons are made across these four speech modes, which denote different conditions of language use.
Setting and Participants
The study was conducted within the English Language Teaching department at a Turkish state university. In Türkiye, English is typically spoken as a foreign language (EFL), and most Turkish students begin to learn English in primary education as an additional language, acquired after their mother tongue (Turkish). Nonetheless, the medium of instruction in this department is English, whereby courses, materials, and assignments are delivered in English, although students can communicate with instructors using Turkish outside class hours. This setting provided access to Turkish EFL students who are familiar with both Turkish (L1) and English (L2), making it possible to examine cross-linguistic interactions across different speech modes.
The participants consisted of 76 first-year students (female = 48 and male = 28) recruited through convenience sampling based on accessibility to the researchers, combined with purposive criteria targeting those whose L1 is Turkish and L2 is English. None reported any history of language- or speech-related disorders, and their ages ranged from 19 to 24 years, which reflects a typical age range for first-year university students in Türkiye. All the students had completed a compulsory preparatory year prior to their undergraduate studies, during which they were required to pass a general English proficiency exam. This institutional proficiency exam was administered before their enrolment on the programme and ensured that the students met the required B2 level of English according to the Common European Framework of Reference for Languages. Therefore, the participants represent a relatively homogeneous group of Turkish-English sequential bilinguals at the B2 level with similar linguistic backgrounds.
Data Collection Tools and Procedure
The data were collected using four prompt texts based on the story titled ‘the North Wind and the Sun’ (Poyrazla Güneş in Turkish) from the Handbook of the International Phonetic Association (1999), a methodological choice comparable to previous studies (e.g., Morris, 2022) (see Appendix 1). The first text was fully in English, representing an L2-only speech mode. The second text was mostly in English but contained some Turkish words and sentences, prompting the participants to produce speech in an L2-dominant mixed language mode. In contrast, the third text was mostly in Turkish but contained certain English words and sentences, again prompting the participants to produce speech in an L1-dominant mixed language mode. In the creation of these mixed language texts, linguistic integration was counterbalanced. Specifically, each of these texts included two intra-sentential and two inter-sentential instances of language mixing. This balance was intended to ensure the dominance of one language while maintaining comparable linguistic complexity. Finally, the fourth text was fully in Turkish, which was the participants’ native language, representing an L1-only speech mode. The texts were designed to be comparable in complexity despite differences in length due to the differing morphological and syntactic characteristics of Turkish and English. To ensure the linguistic integrity of these texts, an expert consultation was sought from a colleague with a PhD in English linguistics and a background in both Turkish and English language systems, who reviewed all four texts for linguistic appropriateness, fluency, and structural balance. These four speech modes and their configurations are summarised in Figure 1.

Figure 1. Matrix of investigated speech modes.
All 76 participants were audio-recorded reading aloud these four texts representing different speech modes. Recordings took place in a quiet, controlled environment to establish consistency in the collected data, ensuring that external factors such as noise did not influence the quality of audio files. To mitigate reading fatigue and increased familiarity with the texts, the reading order was counterbalanced and randomised across the participants. That is, the participants did not read the texts in a fixed sequence but received them in a randomised order to reduce any order effects that could have influenced their pitch. A digital audio recording device (Zoom H4n) was used to save the files in lossless WAV format at a 96 kHz sampling rate with 24-bit quantisation. The files were directly saved to the first researcher’s computer and organised according to participant number and gender (e.g., P1_female.wav). These audio files were later imported into the Praat software (Boersma, 2001) for further phonetic analysis.
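The text does not specify how the counterbalanced, randomised reading order was generated. As an illustration only, the following minimal Python sketch shows one way such an assignment could be produced: all 24 possible orders of the four texts are shuffled once and then cycled through, so that each order is used about equally often. The function name, the seed, and the cycling scheme are assumptions made for this sketch; only the text labels (T1_ENG, T2_ENGtr, T3_TReng, T4_TR) come from the study.

```python
import itertools
import random
from collections import Counter

# Text labels as used in the study's tables.
TEXTS = ["T1_ENG", "T2_ENGtr", "T3_TReng", "T4_TR"]

def assign_reading_orders(n_participants, seed=0):
    """Return one reading order per participant (hypothetical scheme).

    All 24 permutations of the four texts are shuffled once and then
    cycled, so each order is used about equally often.
    """
    rng = random.Random(seed)  # fixed seed only for reproducibility of the sketch
    all_orders = list(itertools.permutations(TEXTS))
    rng.shuffle(all_orders)
    return [all_orders[i % len(all_orders)] for i in range(n_participants)]

orders = assign_reading_orders(76)
order_counts = Counter(orders)  # how often each order was assigned
print(orders[0])
```

With 76 participants and 24 possible orders, this scheme assigns each order either three or four times.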
Data Analysis
Data analysis comprised two consecutive steps. First, the audio recordings were analysed in terms of their pitch properties using Praat software, which allows for the digitisation and measurement of complex acoustic features such as speakers’ F0 levels. This procedure included a series of precise measurements of the following metrics for each participant: (a) minimum (i.e., lowest F0 value) and maximum pitch levels (i.e., highest F0 value), (b) mean pitch level (i.e., average F0 value across a speech stream), and (c) pitch range (i.e., difference between the highest and lowest F0 values in a speech stream). The operationalisation of pitch level and pitch range based on F0 values follows established practice in bilingual prosody research (e.g., Busà & Urbani, 2011; Mennen et al., 2010; Ordin & Mennen, 2017; Passoni et al., 2022; Zimmerer et al., 2014). Values for each of these metrics were calculated and compiled individually for each of the four speech modes in Hz units. In particular, these measurements were conducted at the level of intonational phrases, identified manually in each recording to allow for a more linguistically robust analysis. The pitch floor and ceiling were set to 60–250 Hz for male speakers and 120–440 Hz for female speakers. These values were slightly different from Praat’s default settings and were based on a preliminary analysis of the dataset to provide an accurate representation of F0 variation (Boersma, 2001). Furthermore, the researchers visually inspected each pitch track to eliminate artefacts that might distort F0 values. This procedure involved pitch level adjustments and the detection of tracking errors and creaky voice segments, which were excluded from the analysis. This visual inspection through waveform and spectrogram helped validate the integrity of pitch measurements.
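The metrics defined above can be illustrated with a minimal Python sketch. It does not reproduce the Praat workflow used in the study; it assumes that F0 samples (in Hz) for one stretch of speech have already been exported as a plain list, and that unvoiced or mistracked frames appear as values outside the pitch floor and ceiling (e.g., coded as 0). The function name and the example values are hypothetical; only the floor and ceiling settings are taken from the text.

```python
from statistics import mean

# Pitch floor and ceiling (Hz) as reported in the text.
PITCH_SETTINGS = {"male": (60.0, 250.0), "female": (120.0, 440.0)}

def pitch_metrics(f0_values, gender):
    """Compute minimum, maximum, mean F0 and pitch range (all in Hz).

    Samples outside the gender-specific floor/ceiling are discarded,
    which also removes frames coded as 0 (an assumption of this sketch).
    """
    floor, ceiling = PITCH_SETTINGS[gender]
    voiced = [f for f in f0_values if floor <= f <= ceiling]
    if not voiced:
        raise ValueError("no F0 samples inside the pitch floor/ceiling")
    f0_min, f0_max = min(voiced), max(voiced)
    return {
        "min": f0_min,
        "max": f0_max,
        "mean": mean(voiced),
        "range": f0_max - f0_min,  # pitch range = max F0 - min F0
    }

# Hypothetical F0 samples for a female speaker; 0.0 and 520.0 are discarded.
example = pitch_metrics([0.0, 180.0, 210.0, 250.0, 300.0, 520.0], "female")
print(example)
```

In this hypothetical example, the retained samples are 180, 210, 250, and 300 Hz, giving a mean of 235 Hz and a pitch range of 120 Hz.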
Second, the dataset was imported into JASP (version 0.19), an open-source statistical software package widely used for statistical analysis, to conduct within-subject comparisons of pitch patterns across four speech modes. Repeated measures analyses of variance (ANOVAs) were first conducted as omnibus tests to determine whether speech mode had a statistically significant overall effect on pitch level and pitch range. Subsequently, post hoc tests were performed with Bonferroni and Holm corrections to identify specific pairwise differences between the speech modes, including comparisons across both monolingual and mixed language conditions. In addition, a set of a priori paired samples t-tests was conducted as planned comparisons targeting theoretically motivated contrasts within the same speech mode categories. Specifically, one planned comparison examined differences between the two monolingual speech modes (L2-only: fully in English vs. L1-only: fully in Turkish), and a second planned comparison examined differences between the two mixed language speech modes (L2-L1 mix: mostly in English with some Turkish vs. L1-L2 mix: mostly in Turkish with some English). These within-category comparisons were intentionally included to test central theoretical assumptions concerning cross-linguistic pitch modulation under comparable speech mode conditions, serving as a robustness check alongside previously noted conservative post hoc procedures. Due to well-established physiological differences in pitch patterns between female and male speakers (Roach, 2009), the groups were divided according to gender for all the related statistical tests.
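To make the two post hoc corrections concrete, the following minimal Python sketch implements the standard Bonferroni and Holm adjustments for a family of p-values. It is not the JASP implementation, and the six p-values in the example are hypothetical (six being the number of pairwise comparisons among four speech modes); they are not results from this study.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the family size."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def holm(p_values):
    """Holm step-down adjustment.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical unadjusted p-values for six pairwise comparisons.
raw_p = [0.01, 0.04, 0.03, 0.005, 0.2, 0.5]
p_bonf = bonferroni(raw_p)
p_holm = holm(raw_p)
print(p_bonf)
print(p_holm)
```

Holm-adjusted values are never larger than the corresponding Bonferroni-adjusted values, which is why Holm is generally regarded as the less conservative of the two procedures.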
Results
Descriptive Statistics for Measured Fundamental Frequency (F0)
Tables 1 and 2 present the descriptive statistics for the F0 values of female and male participants across four speech modes. In this section, minimum, maximum, and average F0 values are displayed in Hz, along with their standard deviation, skewness, and kurtosis, providing general insights into how pitch levels vary across different language contexts.
Table 1. Descriptive statistics for female participants (n = 48).
Note. Standard error is 0.343 for skewness and 0.674 for kurtosis.
Table 2. Descriptive statistics for male participants (n = 28).
Note. Standard error is 0.441 for skewness and 0.858 for kurtosis.
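The standard errors given in the table notes depend only on sample size. The source does not state which formulas were used; the sketch below assumes the conventional formulas for the standard error of sample skewness and excess kurtosis, which reproduce the reported values for n = 48 and n = 28 when rounded to three decimals. The function names are chosen for this illustration.

```python
import math

def se_skewness(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Standard error of sample excess kurtosis for a sample of size n."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

# Sample sizes reported in the study.
for label, n in (("female", 48), ("male", 28)):
    print(label, f"{se_skewness(n):.3f}", f"{se_kurtosis(n):.3f}")
```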
For the female participants, minimum F0 values were highest in L2-only mode (T1_ENG, M = 173.19 Hz, SD = 17.88) and lowest in L1-L2 mixed mode (T3_TReng, M = 159.32 Hz, SD = 16.73), suggesting a tendency for lower pitch floors as L1 Turkish becomes more prominent in speech production. Maximum F0 was highest in L1-only mode (T4_TR, M = 337.56 Hz, SD = 42.19) and lowest in L2-only mode (T1_ENG, M = 321.29 Hz, SD = 32.86), indicating a higher pitch ceiling when speaking in the native language. Average F0 values remained relatively stable across all four speech modes, with slightly higher values observed in L2-L1 mixed mode (T2_ENGtr, M = 238.24 Hz, SD = 20.74) and L1-L2 mixed mode (T3_TReng, M = 237.94 Hz, SD = 20.69) compared to L2-only mode (T1_ENG, M = 236.62 Hz, SD = 20.34) and L1-only mode (T4_TR, M = 237.06 Hz, SD = 21.61). Skewness and kurtosis values for all F0 measures indicate relatively normal distribution patterns across speech modes, with no substantial deviations from normality.
Minimum F0 values for male participants were relatively consistent across the four speech modes, with slightly higher values observed in L2-only mode (T1_ENG, M = 82.40 Hz, SD = 16.76) and L2-L1 mixed mode (T2_ENGtr, M = 82.64 Hz, SD = 15.91). The lowest minimum F0 was recorded in L1-only mode (T4_TR, M = 80.40 Hz, SD = 16.89), indicating a subtle decrease in pitch floor when producing speech in L1 Turkish. Maximum F0 values followed a similar pattern of consistency, with the highest mean observed in L1-L2 mixed mode (T3_TReng, M = 179.30 Hz, SD = 32.68), followed closely by L1-only mode (T4_TR, M = 178.27 Hz, SD = 39.98); while the lowest maximum was in L2-L1 mixed mode (T2_ENGtr, M = 174.32 Hz, SD = 34.92). In terms of average F0 values, male participants demonstrated a stable distribution across all four speech modes, with only marginal differences, suggesting that mean pitch levels remained largely unaffected by language context. Skewness and kurtosis values across all F0 measures indicate relatively normal distribution, with no substantial deviations observed that would suggest irregular patterns in pitch production.
Variation of Mean Pitch Level Across Four Speech Modes
Mean pitch level is a key metric for suprasegmental features, referring to the average F0 produced by a speaker throughout their speech. It reflects the overall vocal register used during spoken discourse and provides insight into prosodic variation across different language contexts. A repeated measures ANOVA was conducted to examine the variation of mean pitch levels across the four speech modes.
Table 3 reports the results of the repeated measures ANOVA that examines variation in mean pitch level across the speech modes for female and male participants. For the female participants, the analysis revealed no statistically significant difference in mean pitch levels across the four speech modes, F(1.651, 77.592) = 0.598, p = .521. Similarly, for the male participants, the analysis also indicated non-significant variation across speech modes, F(2.102, 56.741) = 0.217, p = .816. As both p-values exceeded the alpha level of .05, it can be stated that mean pitch levels remained stable regardless of speech mode for both female and male participants. This finding suggests that whether participants produced speech in L2 English, L1 Turkish, or in mixed language contexts (L1-L2 mix and L2-L1 mix), their average pitch levels did not show meaningful fluctuations. Consequently, the results imply that the mean pitch level is a relatively stable prosodic feature, unaffected by cross-linguistic effects in speech production.
Table 3. Repeated measures ANOVA for mean pitch levels across four speech modes (within subjects).
Note. Type III Sum of Squares. Since Mauchly’s test indicates that the assumption of sphericity is violated for both females and males (p < .05), the Greenhouse–Geisser correction is applied.
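The fractional degrees of freedom reported for mean pitch level follow from the Greenhouse–Geisser correction, which multiplies both degrees of freedom by an estimated sphericity parameter ε. The ε values themselves are not reported in the text; the minimal Python sketch below recovers an implied ε from the reported effect degrees of freedom and recomputes the error degrees of freedom. The function names are illustrative, and small discrepancies from the reported values are expected because the reported degrees of freedom are rounded.

```python
def implied_epsilon(reported_df_effect, n_conditions):
    """Recover epsilon from a corrected effect df: epsilon = df1 / (k - 1)."""
    return reported_df_effect / (n_conditions - 1)

def greenhouse_geisser_df(epsilon, n_conditions, n_participants):
    """Corrected df for a one-way repeated measures ANOVA.

    df1 = epsilon * (k - 1); df2 = epsilon * (k - 1) * (n - 1).
    """
    df_effect = epsilon * (n_conditions - 1)
    df_error = df_effect * (n_participants - 1)
    return df_effect, df_error

# Reported effect df for mean pitch level: 1.651 (females), 2.102 (males).
eps_female = implied_epsilon(1.651, 4)
eps_male = implied_epsilon(2.102, 4)
df_female = greenhouse_geisser_df(eps_female, 4, 48)
df_male = greenhouse_geisser_df(eps_male, 4, 28)
print(f"{eps_female:.3f}", f"{df_female[1]:.3f}")
print(f"{eps_male:.3f}", f"{df_male[1]:.3f}")
```

The recomputed error degrees of freedom agree with the reported values (77.592 and 56.741) to within rounding of the reported effect degrees of freedom.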
Variation of Pitch Range Across Four Speech Modes
Another important metric for suprasegmental features is pitch range, which demonstrates the extent of a speaker’s variation between minimum and maximum pitch levels throughout their speech production. This measure reflects the numerical difference (in Hz) between the lowest and highest F0 values produced by speakers and provides an index of prosodic variability in pitch modulation across different language contexts. A repeated measures ANOVA was conducted to examine the variation of pitch ranges across the four speech modes.
Table 4 reports the results of the repeated measures ANOVA that examines variation in pitch range across the speech modes for female and male participants. For the female participants, the analysis indicated a statistically significant difference in pitch range across the four speech modes, F(2.526, 118.719) = 19.237, p < .001. It was found that the female speakers modulated their pitch range depending on whether they were speaking in L1 Turkish, L2 English, or in mixed language contexts (L1-L2 or L2-L1 mix). The associated large effect size (ηp² = .290) also indicated that a considerable proportion of the variance in pitch range can be attributed to the speech mode variable. Similarly, for the male participants, a significant variation in pitch range was observed, F(3, 81) = 4.254, p = .008, with a moderate effect size (ηp² = .136). These results indicate that speech mode has a significant influence on pitch range, showing that suprasegmental features such as pitch range are sensitive to cross-linguistic effects of the language context in which speech is produced.
Table 4. Repeated measures ANOVA for pitch ranges across four speech modes (within subjects).
Note. Type III Sum of Squares. Since Mauchly’s test indicates that the assumption of sphericity is violated for females (p < .05), the Greenhouse–Geisser correction is applied for this group.
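The partial eta squared values reported for pitch range can be checked directly from the F statistics and their degrees of freedom, using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies this identity to the values reported in the text; the function name is chosen for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F statistics and degrees of freedom as reported for pitch range.
eta_female = partial_eta_squared(19.237, 2.526, 118.719)
eta_male = partial_eta_squared(4.254, 3, 81)
print(f"{eta_female:.3f}", f"{eta_male:.3f}")
```

Both results match the reported effect sizes (.290 for females and .136 for males) when rounded to three decimals.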
Table 5 presents the results of the Bonferroni- and Holm-adjusted post hoc comparisons for pitch range for the female participants. Post hoc comparisons confirmed that pitch range significantly increased in speech modes with greater L1 activation. Compared to the L2-only mode (T1_ENG), pitch range was significantly higher in both L1-L2 mixed mode (T3_TReng) and L1-only mode (T4_TR) (p < .001 for both; d = −0.741 and −0.675, respectively). Similarly, significant differences were found between the L2-L1 mixed mode (T2_ENGtr) and both L1-L2 mixed (T3_TReng) and L1-only modes (T4_TR) with medium-to-large effect sizes (p < .001 for both, d = −0.592 and −0.527, respectively). No significant difference was observed between the two English-dominant modes (T1_ENG vs. T2_ENGtr, p = .497) or the two Turkish-dominant modes (T3_TReng vs. T4_TR, p = 1.000), suggesting consistent pitch range within each respective language context. Overall, these results indicate that L1 involvement facilitates broader pitch variation, while L2 English-dominant modes result in narrower pitch ranges, which can reflect more restricted prosodic expression.
Post hoc results for the variation of pitch range across four speech modes for females (n = 48).
Table 6 presents the corresponding post hoc comparisons for the male participants. These comparisons revealed a comparatively modest pattern of pitch range variation across speech modes, with fewer significant differences than for the female speakers. No significant difference was found between the L2-only (T1_ENG) and L2-L1 mixed (T2_ENGtr) modes (p = 1.000), suggesting a similar pitch range in L2 English-dominant contexts. Although comparisons between the L2-only mode (T1_ENG) and the L1 Turkish-dominant modes (T3_TReng, T4_TR) showed a tendency towards increased pitch range, these did not reach significance either (p = .160 and p = .097, respectively). A significant difference emerged between the L2-L1 mixed mode (T2_ENGtr) and the L1-L2 mixed mode (T3_TReng) (p = .036, d = −0.235), indicating greater pitch range in the mixed speech mode where L1 Turkish is more active. A similar but non-significant trend was observed between the L2-L1 mixed mode (T2_ENGtr) and the L1-only mode (T4_TR) (p = .079, d = −0.221). Finally, no significant difference was found between the L1-L2 mixed mode (T3_TReng) and the L1-only mode (T4_TR) (p = 1.000). Overall, pitch range tended to increase with greater L1 activation, though the effects were smaller and less consistent than for the female participants.
Post hoc results for the variation of pitch range across four speech modes for males (n = 28).
To further verify the pitch range variation across speech modes and to address potential issues with post hoc corrections, paired samples t-tests were conducted between conceptually matched speech contexts: L2-only versus L1-only and L2-L1 mixed versus L1-L2 mixed modes. As reported in Table 7, for the female participants, pitch range was significantly wider in the L1-only mode (T4_TR) than in the L2-only mode (T1_ENG) (p < .001, d = −0.766, 95% confidence interval (CI) [−1.085, −0.441]), and also significantly wider in the L1-L2 mixed mode (T3_TReng) than in the L2-L1 mixed mode (T2_ENGtr) (p < .001, d = −0.636, 95% CI [−0.943, −0.322]), indicating medium-to-large effect sizes. For the male participants, although most post hoc comparisons did not reach statistical significance after the Bonferroni and Holm corrections, the paired samples t-tests revealed significant differences in both comparisons. Pitch range was significantly wider in the L1-only mode (T4_TR) than in the L2-only mode (T1_ENG) (p = .016, d = −0.485, 95% CI [−0.873, −0.089]), as well as in the L1-L2 mixed mode (T3_TReng) than in the L2-L1 mixed mode (T2_ENGtr) (p = .006, d = −0.564, 95% CI [−0.959, −0.160]), suggesting moderate effect sizes.
Paired samples t-test for cross-linguistic pitch range across speech mode pairs.
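The effect sizes for the paired comparisons can be illustrated with a short sketch. It assumes that d is the mean of the paired differences divided by their standard deviation (often written d_z), in which case the paired t statistic equals d × √n. The text does not state which variant of Cohen's d was used, so this definition is an assumption. The function names and the five-speaker data set are hypothetical and are not taken from the study.

```python
import math
import statistics


def paired_cohens_d(first, second):
    """Cohen's d for paired samples: mean difference / SD of the differences."""
    diffs = [a - b for a, b in zip(first, second)]
    return statistics.mean(diffs) / statistics.stdev(diffs)


def t_from_paired_d(d, n):
    """Paired t statistic implied by d and the number of pairs: t = d * sqrt(n)."""
    return d * math.sqrt(n)


# Hypothetical pitch ranges (Hz) for five speakers; NOT data from the study.
l2_only = [100.0, 110.0, 95.0, 120.0, 105.0]
l1_only = [112.0, 118.0, 101.0, 135.0, 109.0]

# Subtracting L1-only from L2-only gives a negative d when the L1 range is wider.
d = paired_cohens_d(l2_only, l1_only)
t = t_from_paired_d(d, len(l2_only))

print(round(d, 3), round(t, 3))
```

With this ordering of the arguments, a negative d corresponds to a wider pitch range in the second (L1-dominant) condition, which matches the sign of the values reported above.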
This divergence between the post hoc and paired samples t-test results for the male participants may stem from the conservative nature of Bonferroni-based corrections, which are designed to reduce the risk of Type I errors but may also obscure meaningful patterns in relatively small samples. In contrast, the paired samples t-tests focus on a small number of targeted comparisons and therefore retain greater statistical power, providing stronger support for the influence of L1 activation on pitch range. These findings reinforce the earlier conclusion that pitch range tends to increase with greater L1 involvement and highlight the importance of complementary statistical approaches when interpreting within-subject variation. Accordingly, Figures 2 and 3 illustrate the differences in pitch range for female and male participants, complementing the statistical results from the paired samples t-tests.
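The conservative effect of these corrections can be sketched as follows. This is a minimal illustration of the standard Bonferroni and Holm adjustments, not a reproduction of the software output used in the study. The six unadjusted p-values are hypothetical and stand in for the six pairwise comparisons among four speech modes.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment; returns adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted p-values for six pairwise comparisons (not study data).
raw = [0.004, 0.012, 0.020, 0.030, 0.400, 0.900]

print(bonferroni(raw))
print(holm(raw))
```

In this hypothetical set, four unadjusted p-values fall below .05, but only the smallest remains below .05 after either adjustment. The Holm values are never larger than the Bonferroni values, which is why Holm is regarded as the less conservative of the two.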

Comparison of female speakers’ (n = 48) pitch range (Hz) in monolingual (T1_ENG vs. T4_TR) and mixed language speech modes (T2_ENGtr vs. T3_TReng).

Comparison of male speakers’ (n = 28) pitch range (Hz) in monolingual (T1_ENG vs. T4_TR) and mixed language speech modes (T2_ENGtr vs. T3_TReng).
Discussion
This study contributes to the expanding literature on the cross-linguistic variation of pitch, providing a point of reference for Turkish sequential bilinguals of English in particular. To this end, speech data from 76 Turkish L2 users of English were analysed in terms of mean pitch level and pitch range across monolingual (L2-only and L1-only) and mixed language (L1-L2 mix and L2-L1 mix) speech modes. The overall results both complemented and contradicted previous research, highlighting that there is more to explore about the nature of cross-linguistic pitch characteristics.
In terms of the participants’ mean pitch level, within-subject comparisons using repeated measures ANOVA showed a stable vocal pitch pattern across the four speech modes (L2-only, L2-L1 mix, L1-L2 mix, L1-only). This suggests that the tested speech modes involving varying degrees of L1 Turkish and L2 English did not cause noticeable shifts in mean pitch levels, supporting the notion of a somewhat uniform pitch level across different language contexts. This stability held for both female and male speakers and corroborated findings from some previous studies comparing language-specific pitch levels (Cantor-Cutiva et al., 2021; Keating & Kuo, 2012; Mennen et al., 2012; Ng et al., 2010) but contradicted others (Passoni et al., 2022). The non-significant variation in pitch level observed across these four speech modes supports the idea that mean F0 levels can be considered a relatively universal feature of pitch (Dobrego et al., 2023). As a caveat, the current research design had a single study group consisting of Turkish sequential bilinguals of English; monolingual control groups for both languages would be useful to verify this finding.
An important cross-linguistic effect was observed in the participants’ pitch range, as shown by the significant variation across the four speech modes in the repeated measures ANOVA. This finding provided evidence that bilingual speakers employ different pitch ranges according to the given linguistic code (i.e., English or Turkish) and speech mode (i.e., monolingual or mixed modes), as summarised in Figure 1. Furthermore, post hoc results for the female participants revealed significantly wider pitch ranges in the L1-only and L1-L2 mix modes than in the L2-only and L2-L1 mix modes; that is, pitch ranges were narrower when L2 English was the dominant language in speech production. This pattern supports earlier studies, such as Busà and Urbani (2011), Passoni et al. (2022), Schüppert and Heisterkamp (2021), and Zimmerer et al. (2014), all of which reported a narrower pitch range in L2 speech. The present findings also align with bilingual voice studies indicating systematic F0-related adjustments during L2 use (Cantor-Cutiva et al., 2026). However, this pattern stands in contrast to some previous research (Ng et al., 2010), which focused on a different linguistic configuration (L1 Cantonese) in which pitch fulfils lexical functions, unlike in Turkish. Importantly, effect sizes were larger for the female participants, suggesting that the cross-linguistic effect on pitch range could be more pronounced among female speakers.
As for the male participants, the post hoc results suggested a somewhat inconsistent pattern, in which most comparisons did not reach statistical significance. This may be attributed to the modest sample size and the conservative nature of the Bonferroni and Holm post hoc corrections. To further verify the observed trends, paired samples t-tests were conducted between conceptually matched speech modes (L2-only vs. L1-only, and L2-L1 mix vs. L1-L2 mix). These pairwise comparisons revealed significant differences for both female and male participants, supporting the observation of greater pitch range in speech contexts where L1 Turkish is active and dominant. As in the post hoc comparisons, the effect sizes for the paired samples t-tests were larger for the female speakers than for the male speakers. A larger effect size may be expected of female speakers because of their naturally lower F0 floors and higher F0 ceilings (Roach, 2009), which is a pattern also observed in earlier studies (Morris, 2022; Ordin & Mennen, 2017).
Some of the aforementioned findings were surprising considering the phonetic and phonological properties of Turkish and English. English is typically regarded as a stress-accent language in which stressed syllables are marked by increased duration, pitch range, and amplitude, so greater variation in pitch range is expected. Turkish, however, is often considered a syllable-timed language in which pitch accent can be present but does not fulfil important lexico-grammatical functions. English would therefore generally be expected to exhibit greater variation in pitch range than Turkish; however, the opposite was found in the current study, contrary to these phonetic and phonological expectations. One possible explanation for this divergence is that language-specific phonetic settings associated with Turkish may exert a stronger influence on pitch range. Various models of language-specific phonetic settings indicate that sequential bilinguals may retain L1-based prosodic calibration (Chen et al., 2004; Mennen et al., 2010), leading to reduced pitch range variability when using L2 English despite its stress-accent properties. A feasible explanation for this finding therefore lies in the L1 and L2 status of the respective languages. It is, hence, argued that the effects of language-specific prosodic structure may sometimes be superseded by the order of acquisition and type of bi/multilingualism. Because the participants spoke English as an L2, they might have been more cautious in their speech production given their reported proficiency level (B2). From a physiological perspective, narrower pitch range in L2 English may reflect increased cognitive and articulatory control, resulting in constrained pitch excursions as sequential bilinguals prioritise segmental accuracy and vocal stability (Cantor-Cutiva et al., 2023, 2026).
This study is subject to several limitations that should be considered when interpreting the findings. First, the sample was limited to 76 Turkish L2 users of English enrolled in an English Language Teaching department at a Turkish state university. While this sample is larger than those used in several earlier studies (e.g., Keating & Kuo, 2012; Ordin & Mennen, 2017; Passoni et al., 2022), caution is still warranted when generalising the findings to broader bi/multilingual populations, particularly those with different linguistic configurations (e.g., simultaneous bilinguals or speakers of other L1s), educational backgrounds, and proficiency levels. Although all the participants had attained at least B2 level proficiency in English based on departmental placement requirements, there may have been unaccounted-for variation between individuals in their spoken language competence, possibly influencing their engagement with different speech modes and affecting pitch level and range. Moreover, the study included only one participant group without any monolingual control groups for comparison, which may limit the extent to which language-specific variation in pitch range can be isolated. In addition, the data were collected through scripted reading tasks in a controlled environment, which may not fully reflect other speech contexts such as spontaneous speech or interpersonal interactions. Although efforts were made to reduce recording bias, such as allowing participants to re-read the texts if needed, factors like speaking anxiety or unfamiliarity with the researcher might still have influenced performance.
Conclusion and Suggestions
This study has examined the cross-linguistic effects of four distinct speech modes on the pitch level and pitch range of Turkish L2 users of English. The key findings indicate that pitch level remains stable across all speech modes, whereas pitch range is wider in L1 contexts and narrower in L2 contexts, in both monolingual and mixed language speech modes. Pitch level might therefore be a relatively universal feature of prosody for bi/multilinguals, remaining largely unaffected by language-based factors or the degree of L1 or L2 activation. Conversely, cross-linguistic factors have more pronounced effects on pitch range, making it relatively sensitive to the specific language and speech mode. It is, therefore, suggested that speech mode and the extent of L1-L2 activation play a critical role in shaping suprasegmental variation, particularly with respect to cross-linguistic pitch range among L2 users.
While the present findings are based on production data alone, they raise important questions for future research on pitch in bilingual speech production. The results suggest that sequential bilinguals may need to engage more explicitly with language-sensitive suprasegmental features such as pitch range. Moreover, this study proposes mixed language speech modes as distinct from purely monolingual contexts, with their own prosodic characteristics depending on relative L1 and L2 activation, rather than simply defining them as a general bilingual speech mode. Future research could further explore the prosodic dynamics of such mixed language contexts by systematically manipulating the degree of L1-L2 activation or incorporating additional languages (e.g., L3). It would also be valuable to extend this line of inquiry beyond scripted reading tasks to include more spontaneous or interactional speech, which may better capture natural prosodic variation. Finally, cross-linguistic comparisons involving speakers with different L1-L2 pairings, diverse bilingual profiles (e.g., early vs. late bilinguals, simultaneous vs. sequential), and proficiency levels could help determine whether the observed effects are specific to Turkish sequential bilinguals of English or reflect broader patterns in bi/multilingual prosody.
Appendix 1
The texts used for eliciting four different speech modes are listed below. Words in italics represent Turkish, whereas others are in English.
Acknowledgements
This article draws on the first author’s (unpublished) doctoral dissertation, ‘An Investigation of Pronunciation-Based Translanguaging Practices in English Language Teacher Education’, conducted under the supervision of the second author at Hacettepe University, Graduate School of Educational Sciences. We also thank the two anonymous reviewers for their constructive feedback.
Ethical Considerations
Ethical approval was obtained from the Hacettepe University Social Sciences and Humanities Research Ethics Board on 29 December 2023. All participants were fully informed about the purpose and procedures of the study and provided their informed consent prior to participation.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research has received no specific funding. However, the first author was supported by the TUBITAK 2211-A National PhD Scholarship Programme during the course of this study.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data collected and analysed during this study are not publicly available due to confidentiality and privacy concerns of the participants. For further inquiries, researchers may contact the corresponding author.
