
Abstract

This study investigates the production of English and Tagalog voiceless stops by 14 Tagalog–English bilinguals in California, focusing on the effects of birth country and language dominance. Specifically, the study addresses three questions: (1) How do US-born heritage Tagalog speakers differ from Philippines-born heritage speakers in pronunciation? (2) How do their language-specific acoustic realizations in Tagalog and English vary as a function of birthplace? (3) How does individual language dominance influence the production of voiceless stops in both languages? Data from a reading-aloud task reveal that both US-born (Generation 2) and Philippines-born (Generations 1 and 1.5) bilinguals maintain distinct voice onset time (VOT) patterns for /p/, /t/, and /k/ in each language. Compared to US-born speakers, Philippines-born participants produce more ‘Tagalog-like’ VOTs in English, indicating phonetic convergence. Individual analyses further show that English-dominant bilinguals generally produce longer English VOTs than balanced or Tagalog-dominant speakers, whereas this trend is not statistically significant in their Tagalog VOTs. These findings illuminate the sources of cross-linguistic phonological influence and provide novel insight into the phonetic behavior of a heterogeneous and understudied diasporic community in the US, highlighting the interplay of birthplace and language dominance in heritage bilingual speech.

Keywords

cross-linguistic influence, Filipino diaspora, heritage speakers, speech production, voiceless stops, VOT

I Introduction

Heritage speakers are early bilinguals who are exposed to both a minority (heritage) language and the majority societal language from a young age. This bilingual exposure may occur either simultaneously, when both languages are acquired from birth, or sequentially, when a child initially acquires the heritage language at home and later learns the majority language upon entering school, typically around the age of 5 or 6 years. As a subgroup of early bilinguals, heritage speakers are distinguished by the fact that their primary language is not the dominant language of their surrounding community. They acquire their heritage language naturally in the home environment, either as the sole language of communication or alongside the majority language in bilingual or multilingual contexts. Heritage languages encompass a wide range of linguistic situations, including diasporic languages spoken by the children of immigrants, indigenous or aboriginal languages threatened by colonization, and historical minority languages that coexist with dominant national languages (Kim, 2024; Montrul and Polinsky, 2021). Over the past several decades, research has demonstrated that heritage speakers occupy a unique position along the bilingual continuum: they differ systematically from both monolingual and second language (L2) speakers yet exhibit overlapping linguistic characteristics with each group (Benmamoun et al., 2013; Polinsky, 2018).

Studies of heritage language speech production have consistently shown that early bilinguals demonstrate advantages over late L2 learners in approximating target-like phonetic categories, largely due to early and sustained exposure to the heritage language (Au et al., 2002; Knightly et al., 2003). However, more recent findings indicate that while heritage speakers can form distinct phonetic categories for each language, these categories are not entirely independent; rather, they exhibit measurable interaction across the bilingual’s sound systems (Chang, 2016; Mayr et al., 2019). Such interlingual phonetic relationships, widely documented in bilingualism and L2 speech research, are attributed to the shared representational networks underlying both languages (Escudero, 2005; Escudero and Yazawa, 2024; Flege, 1995; Flege and Bohn, 2021). These interactions give rise to cross-linguistic influence (CLI), defined as ‘the ways in which a person’s knowledge of the sound system of one language can affect that person’s perception and production of speech sounds in another language’ (Jarvis and Pavlenko, 2008: 62).

CLI has thus become a central focus in bilingual phonetics and phonology, offering insight into how dual language systems interact (Amengual, 2023). To understand CLI more fully, researchers must consider a range of factors, such as proficiency, dominance, and language use, alongside community-specific sociolinguistic variables (Amengual, 2024; Flege et al., 1997, 2003; Guion, 2003). One such variable is the generation of immigration, which can influence both the degree of language exposure and the relative stability of phonetic categories among bilingual speakers whose home language differs from the dominant societal language.

Although research on language shift has long documented intergenerational heritage language attrition, most notably through Fishman’s (1964, 1991) three-generation model, few studies have analysed the acoustic properties of bilingual speech across groups differing by birth country. Building on this framework, the present study primarily investigates birth country as a key determinant of bilingual phonetic outcomes. Specifically, it compares speakers born in the Philippines to those born in the US within Tagalog–English bilingual communities in California. This approach allows for a more precise examination of how place of birth and timing of migration jointly influence phonetic production patterns. Such analyses are essential for understanding the fine-grained phonetic and phonological mechanisms underlying language shift, which reflects broader sociolinguistic dynamics. These processes are particularly salient in diasporic communities in the US, where English increasingly displaces the heritage language as the dominant language. By analysing the acoustic realization of Tagalog and English sounds across these groups, this study examines how birthplace and migration timing interact to shape cross-linguistic influence, providing new insight into the mechanisms of phonetic variation, language maintenance, and shift within a diasporic community that remains markedly underrepresented in the heritage bilingualism literature.

II Background

1 Tagalog and the Filipino diaspora in the US and California

Tagalog, a Malayo-Polynesian language native to the Philippines, boasts over 90 million speakers worldwide (Malabonga, 2009). The Philippines itself is home to a linguistically diverse population, with between 120 and 187 languages spoken (Gordon, 2005; McFarland, 1994). Tagalog, standardized as Filipino, is one of the country’s two official languages, alongside English (Gonzalez, 1998), and serves as the lingua franca for Filipinos across various ethnolinguistic backgrounds. As a result, Tagalog is the most widely spoken language among the Filipino diaspora.

Outside the Philippines, the United States (US) is home to the largest number of Tagalog speakers, including both first-generation Filipino immigrants and US-born Filipino Americans. According to the 2019 American Community Survey by the US Census Bureau (Dietrich and Hernandez, 2022), Tagalog ranks as the third most spoken non-English language in the US, with approximately 1.8 million speakers, following Spanish and Chinese. Nearly half of these speakers live in California (Fonacier, 2010), where Tagalog was the third most spoken language in 2005, with 668,073 speakers. It surpassed Mandarin and Cantonese, ranking only behind English and Spanish (Axel, 2011). Tagalog’s prominence is particularly evident in areas with large Filipino populations, such as Los Angeles, San Diego, and the San Francisco Bay Area, where bilingualism is widespread within the Filipino community.

The use of Tagalog in California extends beyond homes and community spaces; it is also reflected in the state’s media landscape, with Tagalog-language radio stations, television programs, and newspapers catering to the Filipino population. Furthermore, public institutions and educational settings are increasingly offering Tagalog courses and resources in response to the growing demand for bilingual education. Despite Tagalog’s significant presence in the US and California, there remains a surprising lack of research on the language’s linguistic features, particularly in relation to contact-induced changes in the speech of bilingual speakers. This study seeks to fill that gap by analysing acoustic data from Tagalog-speaking residents in California.

2 Intergenerational differences in pronunciation and language shift

In the US, Tagalog, like many other languages, is preserved through both immigration and intergenerational transmission, the passing of language knowledge from one generation to the next within families. Demographers have observed that immigrant minority languages typically undergo an intergenerational language shift toward English in US immigrant families (Fishman, 1965, 1966; Veltman, 1983a, 1983b). This language shift refers to the gradual transition from the immigrant (minority) language to the societal (majority) language, which in the US is English, across successive generations. Fishman’s (1964) three-generation model of language shift conceptualizes the intergenerational trajectory of linguistic assimilation within immigrant communities. In this framework, the first generation (G1) consists of foreign-born immigrants who predominantly retain their native language as the primary medium of intra-group communication. The second generation (G2), comprising the native-born children of immigrants, typically develops bilingual competence, maintaining the heritage language within the familial domain while acquiring and using the dominant societal language in educational and public spheres. By the third generation (G3), representing the native-born grandchildren of immigrants, the process of linguistic shift tends to reach its culmination, as this cohort exhibits near-complete assimilation and monolingualism in the dominant language. While scholars in recent decades have investigated the relationship between life course, age at migration, and generational status among first-generation immigrants, there remains considerable inconsistency in the classification of individuals born abroad who immigrated as adults or children, as well as their offspring, particularly with respect to the varying age thresholds employed to delineate these categories (Waters, 2014).

The absence of scholarly consensus regarding the demarcation of generational boundaries based on age at migration is exemplified in the evolving classificatory frameworks advanced by Portes and Rumbaut. In their earlier formulation, Portes and Rumbaut (2001) situated foreign-born children who migrated in early childhood within the second generation. In a subsequent refinement, Rumbaut (2004) and Portes and Rumbaut (2006) reconceptualized these distinctions through the introduction of the ‘1.5 generation’, denoting those who migrated between the ages of 6 and 12 years, and further delineated fractional categories, 1.75 and 1.25 generations, to designate, respectively, those arriving prior to formal schooling (ages 0–5 years) and those immigrating during adolescence (ages 13–17 years). In this study, we follow the categorization of Silva-Corvalán (1994), who defines G1 as foreign-born individuals who immigrated to the US during or after puberty (at age 12 years or later), G1.5 as those who arrived in the US between the ages of 6 and 12 years (Portes and Rumbaut, 2006), and G2 as the US-born children of G1.

There are still relatively few studies that explore intergenerational differences in pronunciation. In a cross-linguistic investigation of contact-induced change in Toronto, Nagy and Kochetov (2013) examined the acoustic features of voiceless stops in the speech of three generations of heritage speakers (Italian, Russian, and Ukrainian) who also speak English. Their findings revealed a shift toward English voice onset time (VOT) values in the Russian and Ukrainian groups across generations, but not in the Italian speakers. This difference may be linked to the long-established Italian community in Toronto and the city’s educational resources that help preserve the language. Additionally, the study found that reduced heritage language use and weaker ethnic identity were associated with longer, more English-like VOT values.

In a study on the impact of intergenerational transmission of the heritage language, Mayr and Siddika (2018) analysed the production of stop consonants in both Sylheti and English among bilingual children and adults from two sets of Bangladeshi heritage families: G1 migrants from the Sylhet region of Bangladesh who moved to the UK as adults, and their UK-born (G2) children, and G2 UK-born adults and their (G3) children. The results showed significant generational differences in both Sylheti and English, as well as between the children and adult participants. Specifically, these bilinguals demonstrated gradual shifts toward the English VOT range across generations, with G3 children showing the strongest influence from English due to their linguistic environment. The authors suggest that G3 speakers may face both identity-related and input-related challenges that hinder their ability to achieve a native-like accent in their heritage language.

In a related study with the same language pair (Sylheti–English), Mayr et al. (2021) explored speech sound development across different generations of children raised in heritage language environments, focusing specifically on intergenerational differences in the phonological development of G2 and G3 Bengali heritage children in Wales. The results of a picture-naming task in both Sylheti and English showed high levels of accuracy in consonant and vowel production for children from both immigrant generations, particularly in English. Regarding Sylheti consonants, G2 children outperformed G3 children, but only on sounds specific to Sylheti. However, immigration generation did not significantly predict accuracy for English consonants. Additionally, G3 children showed more error types in Sylheti than G2 children, including a more frequent replacement of Sylheti dental stops with alveolar stops. These findings suggest that generational status may be an important factor to consider when assessing bilingual children’s phonological development in their heritage language.

Focusing on Spanish heritage speakers in California, Amengual (2018) examined the pronunciation patterns of four groups of Spanish–English bilinguals, incorporating the variable of immigrant generation. The study analysed the acoustic realization of voiced lateral approximants in the Spanish and English of Spanish heritage speakers (G1.5, G2, and G3) and L2 Spanish learners in California, who varied in their degree of language dominance. The results showed a language-specific phonetic distribution for each language, with Spanish and English laterals differing in their degree of velarization. English laterals exhibited a darker, more velarized articulation, while Spanish laterals were clearer and less velarized across all speaker groups. Intergenerational differences were also found within the Spanish heritage speaker groups: G1.5 speakers produced lighter (more Spanish-like) laterals, while G3 speakers produced darker (more English-like) laterals. These findings align with predictions related to language shift from Spanish to English across immigrant generations in the US.

Returning to the variable of immigrant generation in the Filipino communities in the US, several sociolinguistic studies have classified participants based on the individual’s or family’s connection to the immigration experience, showing a rapid shift towards English. For instance, Axel (2011) conducted interviews with eleven Filipinos (five G1 participants and six G2 participants). The findings reveal a clear shift towards English. For G2 participants, English is the primary language spoken both at home and outside the home, along with Spanish, which is widely spoken in California. The responses to the questionnaire also indicate that G2 children rarely, if ever, acquire Tagalog, as their parents worry that speaking these languages would give their children an ‘accent’ in English (Axel, 2011: 126), which is seen as a significant obstacle to social integration and securing higher-paying jobs (Moro and Russo, 2024).

In this study, birth country serves as the primary factor distinguishing groups of Tagalog–English bilinguals. Participants are first classified based on whether they were born in the Philippines or the US. Within the Philippines-born group, age of arrival is used to operationalize ‘generation’: early-arriving individuals (between the ages of 6 and 11 years) are classified as Generation 1.5 (G1.5), while immigrants who arrived in the US during or after puberty (age 12 years or after) are classified as first generation (G1). The second generation (G2) includes US-born individuals with parents who immigrated from the Philippines. By structuring the sample in this way, the study primarily examines the effects of birthplace on bilingual phonetic outcomes, while also considering age of arrival among Philippine-born participants as a secondary factor influencing language experience. This approach enables a nuanced analysis of how birthplace and migration timing shape the intergenerational transmission of the heritage language within Tagalog–English bilingual communities in California.
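The grouping scheme described above amounts to a small decision rule, which can be sketched as follows. This is an illustration only: the function name, the argument names, and the handling of cases the scheme does not define (for example, Philippines-born participants who arrived before age 6) are assumptions and not part of the study's procedure.

```python
from typing import Optional


def classify_generation(birth_country: str,
                        age_of_arrival: Optional[int] = None) -> Optional[str]:
    """Return 'G1', 'G1.5', or 'G2'; None when the scheme does not cover the case.

    Hypothetical helper: thresholds follow the text (G1.5 = arrival at ages 6-11,
    G1 = arrival at age 12 or later, G2 = US-born).
    """
    if birth_country == "US":
        # US-born participants; Philippines-born parentage is assumed, not checked.
        return "G2"
    if birth_country == "Philippines" and age_of_arrival is not None:
        if age_of_arrival >= 12:
            return "G1"
        if 6 <= age_of_arrival <= 11:
            return "G1.5"
    # Arrival before age 6, or an unknown arrival age, is left unclassified here.
    return None
```

Returning `None` for undefined cases, rather than forcing them into a neighbouring category, keeps the sketch from implying a classification the study does not state.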

3 Heritage Tagalog speech

With respect to the existing research on the pronunciation of heritage speakers, only a handful of studies have focused on the sound system of Tagalog in the Filipino diasporic communities of Toronto, Canada (Kang et al., 2016; Umbal, 2023; Umbal and Nagy, 2021). Kang et al. (2016) compared the production of nine heritage Tagalog speakers’ voiced and voiceless Tagalog stops with ten monolingual Tagalog speakers, and their voiced and voiceless English stops with twelve monolingual English speakers. Their results show that the heritage Tagalog speakers produce target-like voiceless stops in both English and Tagalog, establishing separate phonetic categories in each language, but also reveal that these same heritage speakers exhibit considerable cross-linguistic influence in the form of merged phonetic categories in their acoustic realization of English and Tagalog voiced stops. In the case of voiceless stops, Kang et al. (2016) found that Tagalog heritage speakers can form and maintain separate representations for each language.

In a study examining the speech of heritage Tagalog speakers in Toronto, Umbal and Nagy (2021) use a variationist sociolinguistic framework to analyse Tagalog rhotics in the spontaneous speech of fifteen Generation 1 (G1) and eight Generation 2 (G2) Tagalog speakers, as well as nine homeland speakers from Manila (Philippines). They note that Tagalog has one rhotic phoneme, which typically appears as a tap or trill (Schachter and Otanes, 1972), although an approximant variant has also been observed, likely due to contact with English (Chen et al., 2016; Lesho, 2018). Comparisons between generations and groups show that G2 speakers use the approximant variant more than G1 speakers. Additionally, heritage speakers who report using or preferring English are more likely to use the approximant variant than those who prefer Tagalog. However, the study did not find a significant effect of ethnic identity on the use of the approximant variant: being oriented towards Filipino identity did not appear to influence its use.

In his dissertation, Umbal (2023) examines the production of word-initial voiceless stops /p, t, k/ by sixteen heritage speakers, categorized by immigrant generation (G1, G2), and twelve homeland speakers, categorized by age group (older, younger). In his analysis of naturalistic speech data, Umbal finds no significant differences between G1 speakers and homeland speakers. Specifically, the degree of English contact among G1 speakers does not appear to influence their Tagalog VOT compared to the homeland speakers. However, a generational difference in VOT is observed between G1 and G2 speakers, with G2 speakers exhibiting a shift towards longer-lag VOT, a pattern more characteristic of English. Umbal interprets this shift as a result of contact-induced change and cross-generational drift.

4 The phonetic variable: Voice onset time (VOT) in Tagalog and English voiceless stops

Voice Onset Time (VOT) refers to the relative timing of the release of the air for a stop consonant and the onset of vocal fold vibration (voicing) of a following vowel. This acoustic measure is widely used as the primary correlate of the voicing contrast in many languages. Since VOT is language-specific (Abramson and Whalen, 2017; Cho and Ladefoged, 1999) and can vary along a continuum from voiceless aspiration to voiced stops, it provides a valuable lens for examining interlingual influence in the pronunciation of Tagalog and English.

Tagalog is considered a true voicing language, where voiced stops are produced with a negative VOT and voiceless stops have a short-lag VOT (Kang et al., 2016). Specifically, word-initial /p, t, k/ in Tagalog have a short VOT and are unaspirated, as [p, t, k]. Previous studies have reported that the VOT of voiceless stops in Tagalog ranges from 0 to 30 ms (Kang et al., 2016; Umbal, 2023). In contrast, English is an aspirating language, where voiceless stops exhibit a significant delay between the release of air and the onset of laryngeal vibration, resulting in a long-lag VOT that ranges from 30 ms to 120 ms (Cho and Ladefoged, 1999; Lisker and Abramson, 1964). Although previous work on Philippine English phonology describes voiceless stops as unaspirated (Tayao, 2008), Lesho (2018) more recently finds that speakers tend to produce mostly aspirated realizations, with VOTs ranging between 56 and 87 ms. Measuring the glottal-supraglottal timing (in milliseconds) of voiceless stops in both English and Tagalog, as produced by Californian Tagalog–English bilinguals, serves as a proxy to assess the degree of cross-linguistic influence between their two languages.
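As a rough illustration of the VOT ranges cited above, a single measurement can be mapped onto the three descriptive categories (voicing lead, short lag, long lag). The function name is hypothetical, and the 30 ms cut-off is only the shared endpoint of the two reported ranges; it is used here as a convenience, not as a categorical perceptual or statistical boundary.

```python
def vot_category(vot_ms: float, short_lag_max_ms: float = 30.0) -> str:
    """Label a VOT value using the descriptive ranges reported in the literature.

    Hypothetical helper. Negative values are treated as voicing lead (prevoicing),
    values from 0 up to `short_lag_max_ms` as short lag, and larger values as long lag.
    """
    if vot_ms < 0:
        return "voicing lead"
    if vot_ms <= short_lag_max_ms:
        return "short-lag"
    return "long-lag"
```

Under this sketch, a value in the reported Tagalog range is labelled short-lag, while a value in the range Lesho (2018) reports for Philippine English is labelled long-lag.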

III The present study

This study investigates the acoustic realization of voiceless stops /p, t, k/ in the English and Tagalog speech of two groups of Tagalog–English bilinguals, categorized based on their country of birth: one group born in the Philippines, and another born in the US. Drawing from previous research on the speech production of early bilinguals, the study uses VOT to assess the phonetic influence of one language on the other. However, it extends past studies in three key ways: (1) it includes both Philippines-born and US-born bilinguals, representing generations G1, G1.5, and G2, allowing for an exploration of how language shift from Tagalog to English unfolds across generations; (2) it examines heritage speakers from the Filipino diaspora in California, a sizable yet underexplored community in the US; and (3) it looks at individual language trajectories to shed light on variation in language contact contexts. The primary research questions guiding this production experiment are the following:

Research question 1: Do Philippines-born and US-born bilinguals produce similar VOT values in Tagalog and English, indicating phonetic convergence, or do they maintain distinct realizations in each language, indicating phonetic divergence?

Research question 2: How does place of birth influence the acoustic realization of these voiceless stops?

Research question 3: How do language dominance profiles of individual bilinguals affect their production of VOT in both Tagalog and English voiceless stops?

IV Method

1 Participants

Fourteen participants (8 males and 6 females) took part in this production experiment. All participants reported growing up in bilingual households, where both Tagalog and English were spoken, and none were native speakers of any other language. The participants were aged between 18 and 22 years (M = 19.7, SD = 1.2) and were undergraduate students at a public research university in California at the time of testing. They were recruited through the Filipino Student Association on campus. All participants reported normal speech and hearing, and normal or corrected-to-normal vision. In exchange for their participation, each received a stipend. For further details on the participants’ age, gender, place of birth, age of arrival to the US (if applicable), place raised, and immigration generation, please refer to Appendix A.

The Tagalog–English bilingual participants were divided into two groups based on their place of birth: Philippines-born and US-born. The Philippines-born group (n = 6) consisted of native Tagalog speakers who had immigrated to the US with their families and included Generation 1 (G1) participants, who moved to the US at or after the age of 12 years, and Generation 1.5 (G1.5) participants, who arrived between the ages of 6 and 11 years (Silva-Corvalán, 1994). Although these participants had been raised and educated primarily in English in the US, many, particularly the G1 group, had received significant education in Tagalog, with early schooling occurring in the Philippines. The US-born group (n = 8) consisted of second-generation (G2) heritage Tagalog speakers, born in the US to parents who were both born in the Philippines. These participants were raised speaking Tagalog to varying degrees at home but primarily used English in their everyday lives. Table 1 provides additional details on the age, age of exposure to each language, self-rated accents, and typical daily use of both Tagalog and English for each speaker group.

Table 1.

Age, age of exposure, accent self-ratings, and typical daily use of each language.

	US-born, M (SD)	Philippines-born, M (SD)
Age (years)	19.7 (0.7)	19.6 (1.8)
Age of exposure (years)	English = 0.1 (0.3); Tagalog = since birth	English = 4.6 (3); Tagalog = since birth
Self-reported accent (1 = strongly accented; 9 = native-like)	English = 8.8 (0.3); Tagalog = 3.3 (1.9)	English = 7.3 (1.1); Tagalog = 7.5 (1.7)
Typical daily use (1 = only English; 9 = only Tagalog)	1.3 (0.5)	2.8 (1.9)

Each participant completed the bilingual language profile (BLP) questionnaire (Birdsong et al., 2012), which is designed to assess language dominance through self-reported data. The BLP generates a continuous dominance score and a general bilingual profile based on responses to questions across four modules: language history, language use, language proficiency, and language attitudes. The questionnaire was administered in English before the production experiment began. Based on participants’ responses, the BLP produced a global score for each language (English and Tagalog), a language-specific score for each module, and an overall global dominance score (see Appendix B). The dominance score was obtained by subtracting the Tagalog global score from the English global score, so that positive values indicate English dominance and negative values indicate Tagalog dominance. The range of possible scores spans from −218 to 218. As shown in Figure 1, language dominance scores for participants ranged from −43.9 to 175.2. Participants with negative scores were classified as Tagalog-dominant (n = 2), while those with positive scores were classified as English-dominant (n = 12). Figure 1 illustrates the distribution of language dominance scores for both Philippines-born (G1 and G1.5) and US-born (G2) participants based on their BLP values.

Figure 1.

Language dominance scores as a function of group (US-born, Philippines-born) and generation of immigration (G1, G1.5, G2) according to the bilingual language profile (BLP).
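The dominance index can be illustrated with a short sketch. It assumes the sign convention implied by the classification reported above (positive = English-dominant), that is, the English global score minus the Tagalog global score, and a per-language maximum of 218 inferred from the stated ±218 range. The function names, the example input values, and the 'balanced' label for a score of exactly zero are assumptions for illustration.

```python
BLP_MAX_GLOBAL = 218.0  # per-language maximum, inferred from the stated -218..218 range


def dominance_score(english_global: float, tagalog_global: float) -> float:
    """English minus Tagalog global score (assumed convention: positive = English-dominant)."""
    for score in (english_global, tagalog_global):
        if not 0.0 <= score <= BLP_MAX_GLOBAL:
            raise ValueError(f"global score out of range: {score}")
    return english_global - tagalog_global


def dominance_label(score: float) -> str:
    """Categorical label for a dominance score; the zero case is an assumption."""
    if score > 0:
        return "English-dominant"
    if score < 0:
        return "Tagalog-dominant"
    return "balanced"


# Hypothetical global scores, chosen only to show the arithmetic.
example = dominance_score(english_global=200.0, tagalog_global=24.8)
print(round(example, 1), dominance_label(example))
```

Keeping the continuous score alongside the label matters because the individual analyses treat dominance as a gradient rather than a binary property.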

2 Materials and procedure

The voiceless stop production was elicited through a reading-aloud task. The materials consisted of 30 experimental items in Tagalog and 30 in English, with 10 items for each voiceless stop /p/, /t/, and /k/ in both languages. These items were controlled for factors such as syllable position, vowel context, and stress. The target voiceless stop, followed by a low vowel, appeared in a stressed syllable in both English and Tagalog¹ (e.g. Tagalog tabo ‘bucket’ and English tap). This factor was held consistent across all experimental items. Each target item was embedded in a carrier phrase: I can say TARGETWORD today (English), Kaya kong sabihin ang TARGETWORD ngayon ‘I can say the TARGETWORD today’ (Tagalog). The target experimental items appeared among 200 distractors (e.g. sigaw ‘scream’, sukat ‘measure’ in Tagalog). The full list of materials is provided in Table 2.

Table 2.

Experimental items.

Tagalog			English
/p/	/t/	/k/	/p/	/t/	/k/
pasok (‘to enter’)	tapos (‘done’)	kanin (‘rice’)	pack	tank	can
para (‘for’)	tapa (‘cured meat’)	kalat (‘mess’)	pass	task	cast
paso (‘burn’)	tao (‘person’)	kamot (‘to scratch’)	pants	tack	cab
pata (‘leg meat’)	tali (‘rope’)	kahit (‘even’)	pan	tab	cap
palo (‘to hit’)	tapon (‘throw’)	kahoy (‘wood’)	past	tad	catch
pasa (‘to pass’)	tama (‘correct’)	kaso (‘lawsuit’)	patch	tact	cash
pare (‘bro’ slang)	takas (‘escape’)	kanto (‘corner’)	pat	tap	camp
papa (‘dad’)	tabo (‘bucket’)	kapit (‘grip’)	pal	tag	calf
pako (‘nail’)	tae (‘excrement’)	kalma (‘calm’)	pad	tax	cat
pari (‘priest’)	tago (‘to hide’)	kain (‘eat’)	path	tan	cask

The production task was carried out individually in a soundproof booth, with participants seated comfortably in front of a computer display. Each sentence was shown for 5 seconds on the screen, and participants were instructed by a Filipina native Tagalog-speaking researcher in English to read the sentences clearly and at a natural pace. The 60 experimental items were presented four times in a randomized order. Speech samples were recorded using a head-mounted microphone (Shure SM10A) and audio interface (MOTU Ultralite mk3), digitized at 44 kHz with 16-bit quantization, and edited on a computer for subsequent acoustic analysis. Each session produced a total of 240 target productions per participant (i.e. 10 /p/, 10 /t/, and 10 /k/ items in each of the two languages, with 4 repetitions), resulting in a dataset of 3,360 VOT measurements across the 14 participants.
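The token counts reported above follow directly from the design, as the short calculation below shows. The variable names are illustrative, and the totals assume that no productions were excluded.

```python
ITEMS_PER_STOP = 10               # items per voiceless stop in each language
STOPS = ("p", "t", "k")
LANGUAGES = ("English", "Tagalog")
REPETITIONS = 4                   # each item was presented four times
PARTICIPANTS = 14

items_per_language = ITEMS_PER_STOP * len(STOPS)          # experimental items per language
items_total = items_per_language * len(LANGUAGES)         # experimental items overall
tokens_per_participant = items_total * REPETITIONS        # target productions per session
tokens_total = tokens_per_participant * PARTICIPANTS      # VOT measurements in the dataset

print(items_per_language, items_total, tokens_per_participant, tokens_total)
```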

3 Data analysis

a Acoustic analysis

The English and Tagalog voiceless stops /p, t, k/ were segmented using Praat (Boersma and Weenink, 2023), with synchronized waveforms and spectrographic displays. Praat scripts were employed to split each recording into individual files for each experimental item, and text grids were created by manually marking the onset and offset of VOT in each target segment. VOT values were measured by determining the time interval between the stop release and the onset of voicing, as identified by the periodic (repeating) cycles on the waveform. The measurement, rounded to the nearest decimal, was taken from the start of the burst (indicated by a sharp spike where the waveform transitions from quiescent to transient) to the beginning of the first regularly repeating voicing cycle. The onset of voicing was identified as the initial zero crossing in the waveform, as illustrated in Figure 2.

Figure 2.

Voice onset time (VOT) measurement obtained from the waveform.
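The measurement itself reduces to a difference between two hand-marked time points. The sketch below assumes the boundaries are available in seconds (as in Praat TextGrids) and that values are rounded to one decimal place in milliseconds; the function name, its parameters, and the example time points are hypothetical.

```python
def vot_ms(burst_onset_s: float, voicing_onset_s: float, decimals: int = 1) -> float:
    """VOT in milliseconds: voicing onset minus burst onset.

    Hypothetical helper. Positive values indicate voicing lag (voicing starts
    after the release); negative values indicate voicing lead (prevoicing).
    """
    return round((voicing_onset_s - burst_onset_s) * 1000.0, decimals)


# Hypothetical boundary times (in seconds) for one token.
print(vot_ms(burst_onset_s=1.2500, voicing_onset_s=1.3371))
```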

b Statistical analysis

The VOT values were analysed using a generalized linear mixed-effects model in R (R Core Team, 2023) with the lme4 (Bates et al., 2015) and lmerTest (Kuznetsova et al., 2017) packages, which calculated p-values via the Satterthwaite degrees of freedom method. Post-hoc comparisons were conducted using the emmeans package (Lenth, 2020), applying Kenward–Roger degrees of freedom approximation and Bonferroni-adjusted p-values when comparing three levels. The model included fixed effects for Speaker Group (Philippines-born, US-born), Language (English, Tagalog), Place of Articulation (/p/, /t/, and /k/), and their interactions, with a random intercept for Speaker. Marginal and conditional R²_GLMM values were computed to estimate effect sizes (Johnson, 2014), with the marginal R²_GLMM reflecting the variance explained by the fixed effects alone and the conditional R²_GLMM reflecting the variance explained by both fixed and random factors. Figures were generated using ggplot2 (Wickham, 2016). The alpha level was set at .05.
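For a Gaussian model with a single random intercept, the two R²_GLMM statistics can be written in terms of three variance components: the variance of the fixed-effect predictions, the random-intercept variance, and the residual variance. The sketch below uses the standard definitions (marginal: fixed effects only; conditional: fixed plus random effects). It is not the authors' R code, and the function name and the variance values in the example are hypothetical.

```python
from typing import Tuple


def r2_glmm(var_fixed: float, var_random: float, var_residual: float) -> Tuple[float, float]:
    """Return (marginal, conditional) R-squared for a random-intercept Gaussian model.

    marginal    = var_fixed / total
    conditional = (var_fixed + var_random) / total
    where total = var_fixed + var_random + var_residual.
    """
    if min(var_fixed, var_random, var_residual) < 0:
        raise ValueError("variance components must be non-negative")
    total = var_fixed + var_random + var_residual
    if total == 0:
        raise ValueError("total variance must be positive")
    marginal = var_fixed / total
    conditional = (var_fixed + var_random) / total
    return marginal, conditional


# Hypothetical variance components (in ms squared), for illustration only.
marginal, conditional = r2_glmm(var_fixed=600.0, var_random=100.0, var_residual=300.0)
print(round(marginal, 2), round(conditional, 2))
```

The gap between the two values indicates how much of the explained variance is attributable to between-speaker differences captured by the random intercept.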

V Results

Research question 1 examined whether Philippines-born and US-born Tagalog–English bilinguals produce acoustically similar voiceless stops in English and Tagalog (i.e. cross-language convergence) or whether they maintain distinct acoustic realizations in each language (i.e. cross-language divergence). As shown in Figure 3, both speaker groups on average produced longer VOTs in English than in Tagalog (Philippines-born: 65 ms in English vs. 22.8 ms in Tagalog; US-born: 87.1 ms in English vs. 24.1 ms in Tagalog). The data further reveal that the US-born group’s VOT values fell within the expected range for Tagalog and within the longer, ‘target-like’ range for English. In contrast, the Philippines-born group exhibited substantially shorter VOTs in English, while their VOT values in Tagalog were also slightly shorter than those of the US-born group.

Figure 3.

Voice onset time (VOT) values as a function of group (US-born, Philippines-born) and language (Tagalog, English).

To investigate the VOT patterns of each speaker group based on the place of articulation for each segment in English and Tagalog (/p/, /t/, and /k/), the means and standard deviations (in ms) were calculated. As shown in Figure 4, the US-born group produced longer mean VOTs across all English segments (/p/: M = 80.4, SD = 25; /t/: M = 90.2, SD = 22.2; /k/: M = 90.6, SD = 23.2) compared to their Tagalog productions (/p/: M = 17.2, SD = 7.5; /t/: M = 20, SD = 9.7; /k/: M = 35, SD = 11). Similarly, the Philippines-born speakers exhibited distinct VOT ranges in their English (/p/: M = 53.4, SD = 23; /t/: M = 73.7, SD = 18.4; /k/: M = 68.2, SD = 19) and Tagalog voiceless stops (/p/: M = 15.9, SD = 6.2; /t/: M = 18.7, SD = 9.1; /k/: M = 33.7, SD = 13.1). These findings broadly align with previous research (Cho and Ladefoged, 1999; Theodore et al., 2009), which indicates that voiceless velar stops generally have longer VOT durations than voiceless alveolar/dental stops and bilabial stops; the one exception is the Philippines-born group’s English /t/, whose mean VOT exceeded that of /k/.

Figure 4.

Tagalog and English voice onset time (VOT) values plotted separately for /p, t, k/ as a function of group (US-born and Philippines-born Tagalog–English bilinguals).

The dataset was analysed using a linear mixed-effects model to examine the effects of Speaker Group (Philippines-born, US-born), Language (English, Tagalog), and Place of Articulation (/p/, /t/, and /k/) as fixed effects, along with the interactions between these variables and a random intercept for participant. The model was specified as VOT ~ Group * Language * Segment + (1 | Participant). To prevent collinearity, the fixed effects were centered, and sum-coding was applied to facilitate the interpretation of main effects and interactions (Singmann and Kellen, 2019). The results revealed a significant effect of Language on VOT values (β = 27.22, SE = 1.06, t = 27.05, p < .001), confirming that VOTs in English are significantly longer than in Tagalog for both speaker groups. A significant effect of Place of Articulation was found (β = 6.16, SE = 1.06, t = 6.12, p < .001), as well as a significant effect of Speaker Group (β = 5.8, SE = 2.1, t = 2.75, p < .05). The interaction between Speaker Group and Language was also significant (β = 5.18, SE = 1.01, t = 5.11, p < .001), with no other interactions reaching significance. The model’s marginal and conditional R²_GLMM values were 0.86 and 0.91, respectively.

To further investigate the Speaker Group by Language interaction, post-hoc comparisons using simple contrasts were conducted within each speaker group and within each language. The Bonferroni-corrected post-hoc pairwise comparisons showed a significant difference in VOT between English and Tagalog voiceless stops for both the Philippines-born bilingual group (β = −42.3, SE = 2.81, t = −15.07, p < .001) and the US-born bilingual group (β = −63, SE = 2.43, t = −25.91, p < .001). Additionally, pairwise comparisons revealed significant differences in the acoustic realization of English VOT between the two groups, confirming that the Philippines-born group produces shorter (more Tagalog-like) English voiceless stops compared to the US-born group (β = −22, SE = 4.62, t = −4.76, p < .001), as shown in Figure 3. However, no significant difference was found between the speaker groups in the VOT values of their Tagalog voiceless stops (β = −1.2, SE = 4.62, t = −0.28, n.s.).
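The simple contrasts can be related back to the group means reported for Figure 3. The sketch below takes differences of the rounded cell means and shows a plain Bonferroni adjustment. The direction of subtraction is an assumption inferred from the signs of the reported estimates, and the p-values passed to the adjustment are hypothetical. Raw differences of rounded means approximate the model-based estimates but need not equal them.

```python
# Rounded cell means (ms) as reported for Figure 3; keys are illustrative.
CELL_MEANS = {
    ("Philippines-born", "English"): 65.0,
    ("Philippines-born", "Tagalog"): 22.8,
    ("US-born", "English"): 87.1,
    ("US-born", "Tagalog"): 24.1,
}


def language_contrast(group):
    """Tagalog minus English mean VOT (ms) within one speaker group."""
    return CELL_MEANS[(group, "Tagalog")] - CELL_MEANS[(group, "English")]


def group_contrast(language):
    """Philippines-born minus US-born mean VOT (ms) within one language."""
    return (CELL_MEANS[("Philippines-born", language)]
            - CELL_MEANS[("US-born", language)])


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


for group in ("Philippines-born", "US-born"):
    print(group, round(language_contrast(group), 1))
for language in ("English", "Tagalog"):
    print(language, round(group_contrast(language), 1))

# Hypothetical unadjusted p-values, for illustration only.
print(bonferroni([0.01, 0.04, 0.5]))
```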

Because the presentation of group averages may obscure distinct patterns of between-speaker variation, the next step was to analyse the acoustic realization of English and Tagalog VOTs for each participant. To investigate whether individual bilingual language dominance profiles influence the VOTs in both languages, the average VOT for /p/, /t/, and /k/ in English and Tagalog was calculated for each participant, yielding two values per participant: one in English and one in Tagalog. A Pearson correlation test was run to measure the linear correlation between the VOT means across all segments and the BLP scores, separately for English and Tagalog. The correlation on the English data revealed a strong positive relationship between English VOT and language dominance, as indicated by the BLP (r = 0.75, t(12) = 3.87, p < .01). While a moderately positive correlation was also observed between the BLP values and Tagalog VOT, it was not statistically significant (r = 0.48, t(12) = 1.87, p = .086). As shown in Figure 5, VOT values in both languages were generally higher for English-dominant bilinguals (those with larger positive BLP scores) than for those with more balanced or Tagalog-dominant profiles (smaller positive or negative BLP scores). This pattern, however, was not statistically significant in the Tagalog VOT data.

Figure 5.

Individual mean voice onset time (VOT) values for all stops in English (left) and Tagalog (right) plotted as a function of a speaker’s bilingual language profile (BLP) score.
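The correlation test reported above pairs Pearson's r with a t statistic on n − 2 degrees of freedom. A minimal sketch of both computations follows, using only the standard library; the function names are illustrative. Per-participant VOT means are not listed in this article, so the example only converts the reported r values into t statistics.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Reported correlations with n = 14 participants (df = 12).
print(round(t_from_r(0.75, 14), 2))
print(round(t_from_r(0.48, 14), 2))
```

With the two-decimal r values, the computed statistics come out at about 3.93 and 1.90, slightly above the reported 3.87 and 1.87; this gap is what one would expect if the published r values were rounded.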

VI Discussion

1 Summary of results

This study examined the acoustic realization of English and Tagalog voiceless stops /p/, /t/, and /k/ by measuring VOT in the speech of G1, G1.5, and G2 Tagalog–English bilinguals, categorized into two groups based on their place of birth: Philippines-born and US-born. While both groups are early bilinguals, exposed to both languages from birth or an early age and raised speaking Tagalog at home, the Philippines-born group consists of foreign-born G1 and G1.5 individuals, who are either Tagalog-dominant or moderately English-dominant. The US-born group includes G2 heritage Tagalog speakers who were raised speaking Tagalog at home but reported using English predominantly in daily life, leading to a more English-dominant bilingual profile. This production experiment not only explores the impact of intergenerational language shift from G1 and G1.5 to G2 on the acoustic realization of voiceless stops in both languages, but also examines the effects of place of articulation and individual language dominance profiles on the production of /p/, /t/, and /k/ among these bilinguals, all raised and educated within the Filipino diaspora in California.

The results of the production task show that both US-born and Philippines-born Tagalog–English bilinguals have effectively acquired the timing properties of voiceless stops in each of their languages. With respect to their VOT values, these bilinguals maintain language-specific phonetic categories in the production of /p/, /t/, and /k/ in both Tagalog and English, as illustrated in Figure 3. In other words, these early Tagalog–English bilinguals produce voiceless stops with distinct VOT values for each language: a short-lag VOT in Tagalog and a long-lag VOT in English (Kang et al., 2016; Umbal, 2023). Additionally, the acoustic data for both languages reveal that VOT consistently varies according to the place of articulation, which aligns with previous findings on VOT in other languages (Cho and Ladefoged, 1999; Theodore et al., 2009). The ability of both US-born and Philippines-born speakers to establish phonetic categories for voiceless stops in both their dominant and non-dominant languages suggests new category formation (Flege, 1995, 2007).

However, it is important to note that the results also reveal group differences based on language dominance, which can be interpreted as evidence of ‘compromise’ values, indicating cross-linguistic interactions at the phonetic level. The data show that voiceless stops, particularly in English, have longer VOTs when individuals are more dominant in English (i.e. the US-born group). This longer (more ‘normative’) VOT aligns with the predicted intergenerational language shift from Tagalog to English, with values approximating what we expect among English monolinguals. In contrast, only a modest increase in VOT was observed in the Tagalog voiceless stops between the Philippines-born and US-born groups. In short, the analysis of English VOT supports predictions based on the well-documented process of language shift from a minority language to English across generations of immigrant families in California and the US. However, this pattern does not hold for Tagalog, and therefore, does not reflect the cross-generational drift observed in heritage Tagalog speakers in Toronto (Umbal, 2023).

Finally, the analysis of individual data revealed that the degree of dominance in English influenced VOT values in English, with the US-born, English-dominant G2 speaker group showing generally higher VOTs than the Philippines-born G1 and G1.5 groups, who were either more dominant in Tagalog or slightly dominant in English. Specifically, English VOT varied according to language dominance and was significantly correlated with the degree of English dominance. In contrast, for Tagalog, participants were found to produce voiceless stops within the target range, and there was no significant correlation between VOT and language dominance, as measured by the BLP. In line with the results in Kang et al. (2016), Tagalog voiceless stops appear to remain stable within the sound systems of heritage Tagalog speakers.

2 Cross-linguistic influence in heritage language speech

It is commonly assumed that heritage speakers have an advantage in acquiring their heritage language’s sound system due to early exposure. However, as Polinsky (2018) notes, ‘phonetics and phonology remain among the least understood properties of heritage languages’ (p. 162). Considering this, researchers have called for instrumental studies to test the assumption that heritage speakers maintain ‘good phonology’ in their minority language, to better understand the so-called ‘heritage accent’ (Polinsky and Kagan, 2007). This study contributes to this effort by examining the speech production of early bilinguals from different countries of birth. It adds to the broader discussion on intergenerational differences in pronunciation and cross-generational shifts toward the majority language in diasporic communities (Amengual, 2018; Nagy and Kochetov, 2013; Mayr and Siddika, 2018; Mayr et al., 2021; Umbal, 2023).

Previous research has shown that early bilinguals exposed to both their minority (heritage) language and the majority language early in life exhibit persistent effects from their early sound exposure, which continue into adulthood (Amengual, 2019; Sebastián-Gallés et al., 2005). This study explores phonetic variation in the production patterns of Tagalog heritage speakers in California, focusing on how bilingual speech is influenced by the age of onset and the amount of exposure to both languages during early development. For the Tagalog–English bilinguals in this study, both US-born and Philippines-born speakers displayed distinct VOT patterns for /p/, /t/, and /k/ in each of their languages. While these bilinguals maintain language-specific voiceless stops in Tagalog and English, the question remains: does their language experience influence their production patterns? Is language dominance a key factor in understanding the acoustic realization of voiceless stops in bilingual speech?

Heritage speakers, like other bilinguals, tend to have one dominant or stronger language (Cutler et al., 1989; Flege et al., 2002). In immigrant communities in the US, language shift is a well-documented process, with heritage speakers typically becoming more dominant in the majority language across generations (G1 > G1.5 > G2 > G3). Within this Tagalog-speaking immigrant community, bilinguals, especially those who are foreign-born (G1, G1.5), often maintain a high frequency of heritage language use, leading to a different dominance profile compared to those who shift more towards English over time (G2, G3). The effects of language dominance are evident in the analysis of individual data, which shows that dominance operates along a continuum, capturing variations towards more English-like or Tagalog-like VOT values in both the US-born, English-dominant G2 group and the Philippines-born G1 and G1.5 groups.

3 Future directions

This phonetic production experiment examined the phonetic behavior of heritage speakers by considering factors such as language dominance and place of birth. The acoustic feature analysed is VOT, a reliable measure of consonantal voicing distinctions across many languages (Abramson and Whalen, 2017) and one that is particularly sensitive to change in language contact situations (Chang, 2012; Flege and Eefting, 1987). The ease with which VOT can be obtained, measured accurately, and replicated likely explains why voiceless stops are one of the most studied phonetic variables in bilingual speech research. While this study focuses on VOT, future research on bilingual cross-linguistic influence could complement VOT analysis with measurements of other language-specific acoustic properties related to laryngeal timing and stop articulation in voiceless stops. These could include spectral characteristics of stop bursts and aspiration intensity (Repp, 1979; Sundara, 2005), F0 onset frequency and movement patterns (Dmitrieva et al., 2015; Hombert et al., 1979), F1 onset frequency and movement patterns (Hillenbrand, 1984), or spectral tilt and H1–H2 ratios (Kong et al., 2012).

Recent research has increasingly examined short-term, dynamic phonetic interactions, particularly through bilingual studies on language mode induced in laboratory settings (Amengual, 2018, 2021; Simonet, 2014; Simonet and Amengual, 2020). In these experiments, language mode is manipulated by having participants complete separate monolingual and bilingual sessions, spaced at least 72 hours apart. In monolingual sessions, participants read words in only one target language; in bilingual sessions, carrier phrases from both languages are presented in random order. Although monolingual sessions may not fully suppress the non-target language for bilingual speakers, thus still engaging them in a partial bilingual mode, it is assumed that bilingual sessions induce a higher degree of bilingual activation. When both languages are active, competition between their phonetic representations can occur, increasing interference during speech processing (Grosjean, 2001). Future research can expand on the results of the present study by exploring the potential ‘cost’ of this dual activation, investigating how language mode manipulations may measurably affect the phonetics of both languages in heritage speakers.

It is important to note that participant gender is not balanced across generation or country of birth in this study. Research on VOT in English has produced mixed results regarding gender effects: some studies report that females produce longer VOTs than males for voiceless stops in both American and British English (Koenig, 2000; Robb et al., 2005; Whiteside and Irving, 1998; Whiteside and Marshall, 2001; Whiteside et al., 2004), while others find no systematic gender differences (Morris et al., 2008; Smith, 1978). Cross-linguistic research in languages such as Korean and Mandarin further suggests that VOT variation cannot be explained solely by physiological factors (Li, 2013; Oh, 2011). In the present study, it is unlikely that gender confounds account for the observed patterns, as the Philippines-born group includes more females and the US-born group more males, a distribution that would, if anything, reduce group differences. Nonetheless, future research should consider gender in combination with place of birth, generational status, and language dominance to more fully understand potential gender effects in bilingual speech.

Given the heterogeneity of heritage speakers and the variability observed in acoustic data, it is essential to incorporate larger sample sizes to enable a more fine-grained analysis of generational differences in the speech of Tagalog–English bilinguals. Beyond sample size, research can benefit from focusing on the linguistic and social factors that contribute to this variation, thereby refining the definition of the ‘heritage’ speaker group. One important factor in heritage language acquisition is the potential influence of ethnic identity on speech production. For Tagalog heritage speakers, Umbal (2023: 58) suggests that variability in speech patterns may reflect alignment, or lack thereof, with Filipino identity, predicting that speakers with stronger ties to Canada may produce more English-like patterns (e.g. longer VOTs), whereas those more strongly aligned with Filipino identity may retain homeland-like patterns (e.g. shorter VOTs). Although the present study did not measure ethnic orientation directly (Hoffman and Walker, 2010; Noels, 2014), prior research indicates that ethnic identity can shape linguistic variables in heritage languages, albeit with mixed effects: some studies report measurable influences (Nagy et al., 2014; Umbal, 2023), while others find weak or non-significant effects (Nagy and Kochetov, 2013; Nagy et al., 2011). These findings highlight the complex interplay between social identity and phonetic variation in heritage bilinguals and underscore the importance of considering social factors alongside linguistic variables in future studies.

In addition to broadening the scope of heritage language pairings and generational groupings, expanding research on segmental and suprasegmental features, and examining the relationship between perception and production in heritage language sound systems, greater attention should be directed toward the role of ethnic identity in shaping phonetic variation among heritage speakers. Specifically, it remains unclear which dimensions of ethnic orientation are most strongly associated with factors such as place of birth, generation of immigration, language dominance, patterns of use and exposure, and other biographical or non-linguistic variables that influence the linguistic behavior of heritage speakers. Addressing these questions is critical for developing a more nuanced understanding of the social and cognitive mechanisms underlying variation in heritage language phonetics.

VII Conclusions

This study investigated the acoustic realization of English and Tagalog voiceless stops produced by fourteen Tagalog–English bilinguals in California, categorized into a Philippines-born group and a US-born group. Analyses of a reading-aloud task in both languages revealed three key findings. First, both US-born and Philippines-born bilinguals maintained distinct VOT patterns for /p/, /t/, and /k/ in each language. Second, evidence of cross-linguistic phonetic interaction was observed: English voiceless stops exhibited longer VOTs, particularly among the more English-dominant speakers (i.e. the US-born group), whereas Tagalog voiceless stops showed no comparable intergenerational shift. Third, individual-level analyses indicated that English VOTs were higher for the most English-dominant bilinguals relative to those with more balanced or Tagalog-dominant language profiles, a trend not mirrored in Tagalog VOTs for either group. Collectively, these results provide new insight into cross-linguistic phonological influence within a diverse and underexplored diasporic community, highlighting the differential impact of language dominance and birth country on bilingual phonetic production.

Appendix

Appendix B.

Bilingual language profile (BLP) scores for each module per participant.

Participant	Language history		Language use		Language proficiency		Language attitudes		Global score		Dominance score
	ENG	TAG	ENG	TAG	ENG	TAG	ENG	TAG	ENG	TAG	ENG − TAG
P05	46	92	29	21	20	24	14	24	129.67	173.61	−43.94
P06	54	86	32	12	10	18	14	18	113.87	133.84	−19.96
P10	86	73	37	13	24	24	17	21	172.44	149.46	22.98
P01	116	81	44	6	23	18	23	17	205.04	122.76	82.28
P07	103	63	39	11	24	12	23	16	195.96	104.15	91.81
P11	86	67	43	7	24	13	18	16	181.25	103.87	77.37
P02	116	20	50	0	24	1	24	13	216.12	40.86	175.26
P03	117	60	46	4	23	14	21	19	203.13	106.51	96.62
P04	112	42	49	1	24	2	22	6	208.67	38.31	170.36
P08	116	26	44	6	24	9	24	16	209.58	75.09	134.49
P09	116	42	43	7	24	8	23	17	206.22	83.44	122.77
P12	92	43	39	9	24	7	24	21	193.23	92.89	100.34
P13	96	51	44	6	24	13	23	19	198.23	102.33	95.9
P14	120	57	41	9	24	12	24	22	208.13	112.86	95.26
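The global and dominance scores in Appendix B can be recomputed from the module scores. The sketch below assumes the module weights of the standard BLP scoring scheme (0.454 for history, 1.09 for use, 2.27 for proficiency and for attitudes). These weights are not stated in this article and are an assumption here, although they reproduce the tabled global scores to within rounding. The function and variable names are illustrative.

```python
# Assumed BLP module weights (not stated in this article).
BLP_WEIGHTS = {
    "history": 0.454,
    "use": 1.09,
    "proficiency": 2.27,
    "attitudes": 2.27,
}


def global_score(modules):
    """Weighted sum of the four module scores for one language."""
    return sum(BLP_WEIGHTS[name] * modules[name] for name in BLP_WEIGHTS)


def dominance_score(english, tagalog):
    """English global score minus Tagalog global score.

    Positive values indicate English dominance, negative values
    Tagalog dominance.
    """
    return global_score(english) - global_score(tagalog)


# Module scores for participant P05, copied from Appendix B.
p05_eng = {"history": 46, "use": 29, "proficiency": 20, "attitudes": 14}
p05_tag = {"history": 92, "use": 21, "proficiency": 24, "attitudes": 24}

print(round(global_score(p05_eng), 2))
print(round(global_score(p05_tag), 2))
print(round(dominance_score(p05_eng, p05_tag), 2))
```

For P05 the recomputed values agree with the tabled 129.67, 173.61, and −43.94 to within 0.01, the residual difference being attributable to rounding.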

Acknowledgements

We would like to thank our participants for their contribution to this study. We would also like to thank the anonymous reviewers and Associate Editor Jeff Holliday for the very helpful comments and suggestions we received during the peer review process. Finally, we wish to express our appreciation to the audience of the 11th International Symposium on the Acquisition of Second Language Speech (New Sounds 2025) at the University of Toronto, for the feedback on our project.

Author contributions statement

Conception and design: MA and MuA; collection of data: MA and MuA; data analysis and interpretation: MA and MuA; drafting of the paper: MA; revisions: MA and MuA; final approval of the manuscript: MA and MuA; agreement to be held accountable for all aspects of the work: MA and MuA.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Ethics approval and informed consent statements

The University of California, Santa Cruz Institutional Review Board at the Office of Research Compliance Administration approved this research (HS-FY2023-122). Prior to data collection, adult participants signed consent forms. This study was conducted according to the guidelines of the Declaration of Helsinki and approved by UCSC IRB on 16 December 2022.

ORCID iDs

Mark Amengual

Maxine Uy Altura

Open Badges Statement

The experiment in this article earned Open Materials and Open Data badges for transparent practices. The materials, analysis scripts, and anonymized data are available at

Data availability statement

The materials, anonymized data and analysis scripts are available on the Open Science Framework at

References

Abramson

Whalen

(2017) Voice Onset Time (VOT) at 50: Theoretical and practical issues in measuring voicing distinctions. Journal of Phonetics 63: 75–86. https://doi.org/10.1016/j.wocn.2017.05.002

Amengual

(2018) Asymmetrical interlingual influence in the production of Spanish and English laterals as a result of competing activation in bilingual language processing. Journal of Phonetics 69: 12–28. https://doi.org/10.1016/j.wocn.2018.04.002

Amengual

(2019) Type of early bilingualism and its effect on the acoustic realization of allophonic variants: Early sequential and simultaneous bilinguals. International Journal of Bilingualism 23: 954–70. https://doi.org/10.1177/1367006917741364

Amengual

(2021) The acoustic realization of language-specific phonological categories despite dynamic cross-linguistic influence in bilingual and trilingual speech. Journal of the Acoustical Society of America 149: 1271–84. https://doi.org/10.1121/10.0003559

Amengual

(2023) Cross-language influences in the acquisition of L2 and L3 Phonology. In: Elgort

Siyanova-Chanturia

Brysbaert

(eds) Cross-language influences in bilingual processing and second language acquisition. Amsterdam: John Benjamins, pp. 74–99. https://doi.org/10.1075/bpa.16.04ame

Amengual

(2024) Phonetics of early bilingualism. Annual Review of Linguistics 10: 191–210. https://doi.org/10.1146/annurev-linguistics-031522-102542

Knightly

Jun

(2002) Overhearing a language during childhood. Psychological Science 13: 238–43. https://doi.org/10.1111/1467-9280.00444

Axel

(2011) Language in Filipino America. Unpublished PhD thesis, Arizona State University, Tempe, AZ, USA.

Bates

Maechler

Bolker

Walker

(2015) Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67: 1–48. https://doi.org/10.18637/jss.v067.i01

10.

Benmamoun

Montrul

Polinsky

(2013) Heritage languages and their speakers: Opportunities and challenges for linguistics. Theoretical Linguistics 39: 129–81. https://doi.org/10.1515/tl-2013-0009

11.

Birdsong

Gertken

Amengual

(2012) Bilingual language profile: An easy-to-use instrument to assess bilingualism. Austin, TX: COERLL, University of Texas at Austin. Available at: https://sites.la.utexas.edu/bilingual/ (accessed January 2026).

12.

Boersma

Weenink

(2023) Praat: Doing phonetics by computer: Version 6.4.01 [computer program]. Available at: http://www.praat.org (accessed January 2026).

13.

Chang

(2012) Rapid and multifaceted effects of second-language learning on first-language speech production. Journal of Phonetics 40: 249–68. https://doi.org/10.1016/j.wocn.2011.10.007

14.

Chang

(2016) Bilingual perceptual benefits of experience with a heritage language. Bilingualism: Language and Cognition 19(4): 791–809. https://doi.org/10.1017/S1366728914000261

15.

Chen

Bernhardt

Stemberger

(2016) Phonological assessment and analysis tools for Tagalog: Preliminary development. Clinical Linguistics and Phonetics 30: 599–627. https://doi.org/10.3109/02699206.2016.1157208

16.

Cho

Ladefoged

(1999) Variation and universals in VOT: Evidence from 18 languages. Journal of Phonetics 27: 207–29. https://doi.org/10.1006/jpho.1999.0094

17.

Cutler

Mehler

Norris

Segui

(1989) Limits on bilingualism. Nature 340: 229–30. https://doi.org/10.1038/340229a0

18.

Dietrich

Hernandez

(2022) Language use in the United States: 2019. American Community Survey Reports (ACS‑50). US Census Bureau. Available at: https://www.census.gov/content/dam/Census/library/publications/2022/acs/acs-50.pdf (accessed January 2026).

19.

Dmitrieva

Llanos

Shultz

Francis

(2015) Phonological status, not voice onset time, determines the acoustic realization of onset f0 as a secondary voicing cue in Spanish and English. Journal of Phonetics 49: 77–95. https://doi.org/10.1016/j.wocn.2014.12.005

20.

Escudero

(2005) Linguistic perception and second language acquisition: Explaining the attainment of optimal phonological categorization. Utrecht: LOT.

21.

Escudero

Yazawa

(2024) The Second Language Linguistic Perception Model. In: Amengual

(ed.), The Cambridge handbook of bilingual phonetics and phonology. Cambridge: Cambridge University Press, pp. 173–96. https://doi.org/10.1017/9781009105767.009

22.

Fishman

(1966) Language loyalty in the United States: The maintenance and perpetuation of non-English mother tongues by American ethnic and religious groups. The Hague: Mouton.

23.

Fishman

(1964) Language maintenance and language shift as a field of inquiry. A definition of the field and suggestions for its further development. Linguistics – An Interdisciplinary Journal of the Language Sciences 9: 32–70. https://doi.org/10.1515/ling.1964.2.9.32

24.

Fishman

(1965) Language maintenance and language shift: The American immigrant case within a general theoretical perspective. Sociologus 16: 19–39.

25.

Fishman

(1991) Reversing language shift: Theoretical and empirical foundations of assistance to threatened languages: Volume 76. Bristol: Multilingual Matters. https://doi.org/10.2307/jj.33169466

26.

Flege

(1995) Second-language speech learning: Theory, findings and problems. In: Strange

(ed.) Speech perception and linguistic experience: Theoretical and methodological issues. Timonium, MD: York Press, pp. 229–73.

27.

Flege

(2007) Language contact in bilingualism: Phonetic system interactions. Laboratory Phonology 9: 353–81.

28.

Flege

Bohn

(2021) The revised speech learning model (SLM-r). In: Wayland

(ed.) Second language speech learning: Theoretical and empirical progress. Cambridge: Cambridge University Press, pp. 3–83. https://doi.org/10.1017/9781108886901.002

29.

Flege

Eefting

(1987) Production and perception of English stops by native Spanish speakers. Journal of Phonetics 15: 67–83. https://doi.org/10.1016/s0095-4470(19)30538-8

30.

Flege

Frieda

Nozawa

(1997) Amount of native-language (L1) use affects the pronunciation of an L2. Journal of Phonetics 25: 169–86. https://doi.org/10.1006/jpho.1996.0040

31.

Flege

MacKay

Piske

(2002) Assessing bilingual dominance. Applied Psycholinguistics 23: 567–98. https://doi.org/10.1017/s0142716402004046

32.

Flege

Schirru

MacKay

(2003) Interaction between the native and second language phonetic subsystems. Speech Communication 40: 467–91. https://doi.org/10.1016/s0167-6393(02)00128-0

33.

Fonacier

(2010) Tagalog in the USA. In: Potowski

(ed.) Language diversity in the USA. Cambridge: Cambridge University Press, pp. 96–109. https://doi.org/10.1017/cbo9780511779855.007

34.

Gonzalez

(1998) The language planning situation in the Philippines. Journal of Multilingual and Multicultural Development 19: 487–525. https://doi.org/10.1080/01434639808666365

35.

Gordon

Jr. (2005) Ethnologue: Languages of the world. Dallas, TX: SIL International.

36.

Grosjean

(2001) The bilingual’s language modes. In: Nicol

(ed.) One mind, two languages: Bilingual language processing. Oxford: Blackwell, pp. 1–22.

37.

Guion

(2003) The vowel systems of Quichua–Spanish bilinguals: Age of acquisition effects on the mutual influence of the first and second languages. Phonetica 60: 98–128. https://doi.org/10.1159/000071449

38.

Hillenbrand

(1984) Perception of sine-wave analogs of voice onset time stimuli. Journal of the Acoustical Society of America 75: 231–40.

39.

Hoffman

Walker

(2010) Ethnolects and the city: Ethnic orientation and linguistic variation in Toronto English. Language Variation and Change 22: 37–67. https://doi.org/10.1017/s0954394509990238

40.

Hombert

Ohala

Ewan

(1979) Phonetic explanations for the development of tones. Language 55: 37–58. https://doi.org/10.2307/412518

41.

Jarvis

Pavlenko

(2008) Crosslinguistic influence in language and cognition. New York: Routledge. https://doi.org/10.4324/9780203935927

42.

Johnson

PCD

(2014) Extension of Nakagawa and Schielzeth’s R²_GLMM to random slopes models. Methods in Ecology and Evolution 5: 944–46. https://doi.org/10.1111/2041-210x.12225

43.

Kang

George

Soo

(2016) Cross-language influence in the stop voicing contrast in heritage Tagalog. Heritage Language Journal 13: 184–218. https://doi.org/10.46538/hlj.13.2.6

44.

Kim

(2024) The phonetics and phonology of heritage language speakers. In: Amengual

(ed.) The Cambridge handbook of bilingual phonetics and phonology. Cambridge: Cambridge University Press, pp. 560–83. https://doi.org/10.1017/9781009105767.026

45.

Knightly

Jun

(2003) Production benefits of childhood overhearing. Journal of the Acoustical Society of America 114: 465–74. https://doi.org/10.1121/1.1577560

46.

Koenig

(2000) Laryngeal factors in voiceless consonant production in men, women, and 5-year-olds. Journal of Speech, Language, and Hearing Research 43: 1211–28. https://doi.org/10.1044/jslhr.4305.1211

47.

Kong

Beckman

Edwards

(2012) Voice onset time is necessary but not always sufficient to describe acquisition of voiced stops: The cases of Greek and Japanese. Journal of Phonetics 40: 725–44. https://doi.org/10.1016/j.wocn.2012.07.002

48. Kuznetsova, Brockhoff, Christensen RHB (2017) lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software 82: 1–26. https://doi.org/10.18637/jss.v082.i13

49. Lenth (2020) emmeans: Estimated marginal means, aka least-squares means: R package version 146. Available at: https://CRAN.R-project.org/package=emmeans (accessed January 2026). https://doi.org/10.32614/cran.package.emmeans

50. Lesho (2018) Philippine English (Metro Manila acrolect). Journal of the International Phonetic Association 48: 357–70. https://doi.org/10.1017/s0025100317000548

51. (2013) The effect of speakers’ sex on voice onset time in Mandarin stops. Journal of the Acoustical Society of America 133: EL142–47. https://doi.org/10.1121/1.4778281

52. Lisker, Abramson (1964) A cross-language study of voicing in initial stops: Acoustical measurements. Word 20: 384–422. https://doi.org/10.1080/00437956.1964.11659830

53. Malabonga (2009) Heritage voices: Language: Tagalog. Washington, DC: Center for Applied Linguistics (Heritage Languages).

54. Mayr, López-Bueno, Fernández, Tomé Lourido (2019) The role of early experience and continued language use in bilingual speech production: A study of Galician and Spanish mid vowels by Galician–Spanish bilinguals. Journal of Phonetics 72: 1–16. https://doi.org/10.1016/j.wocn.2018.10.007

55. Mayr, Siddika (2018) Inter-generational transmission in a minority language setting: Stop consonant production by Bangladeshi heritage children and adults. International Journal of Bilingualism 22: 255–84. https://doi.org/10.1177/1367006916672590

56. Mayr, Siddika, Morris, Montanari (2021) Bilingual phonological development across generations: Segmental accuracy and error patterns in second- and third-generation British Bengali children. Journal of Communication Disorders 93: Article 106140. https://doi.org/10.1016/j.jcomdis.2021.106140

57. McFarland (1994) Subgrouping and number of the Philippine languages or how many Philippine languages are there? Philippine Journal of Linguistics 25: 75–84.

58. Montrul, Polinsky (eds) (2021) The Cambridge handbook of heritage languages and linguistics. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108766340

59. Moro, Russo (2025) Family language policy in multilingual Filipino families in Italy. Journal of Multilingual and Multicultural Development 46: 2965–79. https://doi.org/10.1080/01434632.2024.2321389

60. Morris, McCrea, Herring (2008) Voice onset time differences between adult males and females: Isolated syllables. Journal of Phonetics 36: 308–17. https://doi.org/10.1016/j.wocn.2007.06.003

61. Nagy, Aghdasi, Denis, Motut (2011) Null subjects in heritage languages: Contact effects in a cross-linguistic context. University of Pennsylvania Working Papers in Linguistics 17: 135–44.

62. Nagy, Chociej, Hoffman (2014) Analyzing Ethnic Orientation in the quantitative sociolinguistic paradigm. Language and Communication 35: 9–26. https://doi.org/10.1016/j.langcom.2013.11.002

63. Nagy, Kochetov (2013) VOT across the generations: A cross-linguistic study of contact-induced change. In: Siemund, Gogolin, Schulz, Davydova (eds) Multilingualism and language diversity in urban areas: Acquisition, identities, space, education. Amsterdam: John Benjamins, pp. 19–38. https://doi.org/10.1075/hsld.1.02nag

64. Noels (2014) Language variation and ethnic identity: A social psychological perspective. Language and Communication 35: 88–96. https://doi.org/10.1016/j.langcom.2013.12.001

65. (2011) Effects of speaker gender on voice onset time in Korean stops. Journal of Phonetics 39: 59–67. https://doi.org/10.1016/j.wocn.2010.11.002

66. Polinsky (2018) Heritage languages and their speakers. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781107252349

67. Polinsky, Kagan (2007) Heritage languages: In the ‘wild’ and in the classroom. Language and Linguistics Compass 1: 368–95. https://doi.org/10.1111/j.1749-818x.2007.00022.x

68. Portes, Rumbaut (2001) Legacies: The story of the immigrant second generation. Berkeley, CA: University of California Press.

69. Portes, Rumbaut (2006) Immigrant America: A portrait. Berkeley, CA: University of California Press.

70. R Core Team (2023) R: A language and environment for statistical computing [software]. Vienna: R Foundation for Statistical Computing. Available at: http://www.R-project.org/ (accessed January 2026).

71. Repp (1979) Relative amplitude of aspiration noise as a voicing cue for syllable-initial stop consonants. Language and Speech 22: 173–89. https://doi.org/10.1177/002383097902200207

72. Robb, Gilbert, Lerman (2005) Influence of gender and environmental setting on voice onset time. Folia Phoniatrica et Logopaedica 57: 125–33. https://doi.org/10.1159/000084133

73. Rumbaut (2004) Ages, life stages, and generational cohorts: Decomposing the immigrant first and second generations in the United States. International Migration Review 38: 1160–205. https://doi.org/10.1111/j.1747-7379.2004.tb00232.x

74. Schachter, Otanes (1972) Tagalog reference grammar. Berkeley, CA: University of California Press.

75. Sebastián-Gallés, Echeverría, Bosch (2005) The influence of initial exposure on lexical representation: Comparing early and simultaneous bilinguals. Journal of Memory and Language 52: 240–55. https://doi.org/10.1016/j.jml.2004.11.001

76. Simonet (2014) Phonetic consequences of dynamic cross-linguistic interference in proficient bilinguals. Journal of Phonetics 43: 26–37. https://doi.org/10.1016/j.wocn.2014.01.004

77. Simonet, Amengual (2020) Increased language co-activation leads to enhanced cross-linguistic phonetic convergence. International Journal of Bilingualism 24: 26–37. https://doi.org/10.1177/1367006919826388

78. Silva-Corvalán (1994) Language contact and change: Spanish in Los Angeles. Oxford: Oxford University Press. https://doi.org/10.1093/oso/9780198242871.001.0001

79. Singmann, Kellen (2019) An introduction to mixed models for experimental psychology. In: Spieler, Schumacher (eds) New methods in cognitive psychology. New York: Routledge, pp. 4–31. https://doi.org/10.4324/9780429318405-2

80. Smith (1978) Temporal aspects of English speech production: A developmental perspective. Journal of Phonetics 6: 37–67. https://doi.org/10.1016/s0095-4470(19)31084-8

81. Sundara (2005) Acoustic-phonetics of coronal stops: A cross-language study of Canadian English and Canadian French. Journal of the Acoustical Society of America 118: 1026–37. https://doi.org/10.1121/1.1953270

82. Tayao MLG (2008) A lectal description of the phonological features of Philippine English. In: Bautista, Bolton (eds) Philippine English: Linguistic and literary perspectives. Hong Kong: Hong Kong University Press, pp. 157–74. https://doi.org/10.1515/9789888052639-013

83. Theodore, Miller, DeSteno (2009) Individual talker differences in voice-onset-time: Contextual influences. Journal of the Acoustical Society of America 125: 3974–82. https://doi.org/10.1121/1.3106131

84. Umbal (2023) A comparative variationist analysis of phonetic variation and change in Toronto Heritage Tagalog. Unpublished PhD thesis, University of Toronto, Toronto, ON, Canada.

85. Umbal, Nagy (2021) Heritage Tagalog phonology and a variationist framework of language contact. Languages 6: Article 201. https://doi.org/10.3390/languages6040201

86. Veltman (1983a) Anglicization in the United States: Language environment and language practice of American adolescents. International Journal of the Sociology of Language 44: 99–114. https://doi.org/10.1515/ijsl.1983.44.99

87. Veltman (1983b) Language shift in the United States. Berlin: Mouton de Gruyter. https://doi.org/10.1515/9783110824001

88. Waters (2014) Defining difference: The role of immigrant generation and race in American and British immigration studies. Ethnic and Racial Studies 37: 10–26. https://doi.org/10.1080/01419870.2013.808753

89. Whiteside, Henry, Dobbin (2004) Sex differences in voice onset time: A developmental study of phonetic context effects in British English. Journal of the Acoustical Society of America 116: 1179–83. https://doi.org/10.1121/1.1768256

90. Whiteside, Irving (1998) Speakers’ sex differences in voice onset time: A study of isolated word production. Perceptual and Motor Skills 86: 651–54. https://doi.org/10.2466/pms.1998.86.2.651

91. Whiteside, Marshall (2001) Developmental trends in voice onset time: Some evidence for sex differences. Phonetica 58: 196–210. https://doi.org/10.1159/000056199

92. Wickham (2016) ggplot2: Elegant graphics for data analysis. 2nd ed. Springer. https://doi.org/10.1007/978-0-387-98141-3

Cross-linguistic influence in the speech of Philippines-born and US-born Tagalog–English bilinguals in California
