
Abstract

Objectives:

Bringing linguistic experience into code-switching (CS) constraints, a new hypothesis considers cross-language variable equivalence, which arises from within-language variability. Bilingual choices are assessed for Spanish-English CS between clauses, where subordinating conjunctions may not be consistently equivalent.

Methodology:

Equivalence exists at the main-and-adverbial clause junction, inasmuch as the conjunctions are consistently present and placed the same way in the two languages. Equivalence is variable with main-and-complement clauses, because English complementizer that is mostly absent. Tokens of clause combining were extracted from the prosodically transcribed speech of members of a long-standing community in northern New Mexico who use both languages in their everyday interactions. Bilingual clause combinations were compared with their unilingual counterparts produced by the same speakers, as benchmarks.

Data and Analysis:

Over 2,000 tokens of clause combining were coded for conjunction, subordinate clause type, prosodic connection, and CS direction for bilingual instances (n = 189).

Findings:

Bilinguals treat CS with complement and adverbial clauses differently. With complement clauses, the rate of CS is lower, prosodic separation is greater and, most notably, conjunction language choice is more asymmetrical. Spanish complementizer que is overwhelmingly selected over English that. In contrast, choice between causal conjunctions porque and (be)cause is affected by CS direction.

Originality:

The Variable Equivalence hypothesis states that bilinguals favor CS with the equivalent option from one of the languages that is more frequent and predictable in their combined linguistic experience, considering both languages.

Significance:

CS constraints are probabilistic (preferred CS sites) rather than categorical (permissible CS sites). The Variable Equivalence hypothesis accommodates variation in actual language use. Methodologically, comparing spontaneous CS with the same speakers’ unilingual production allows discovery of CS asymmetries. These asymmetries reveal quantitative bilingual preferences to switch at particular sites.

Keywords

Code-switching, Variable Equivalence hypothesis, within-language variability, clause combining, complement clauses, causal clauses, Spanish, English

Code-switching and Variable Equivalence

Code-switching (CS) is widely seen to be constrained by equivalence or congruence of some kind between the two languages (e.g., Deuchar, 2005, p. 255; Lipski, 1978, p. 258; Muysken, 2000, p. 27; Poplack, 1980, p. 581). For example, a difference between pre- and postnominal adjective position would restrict CS between adjective and noun. But how do bilinguals construe equivalence in practice? In this paper, we consider CS at language junctions at which equivalence may not be consistent, or sites of variable equivalence. These are sites of cross-language differences that lie in quantitative preferences rather than grammaticality (Bresnan et al., 2001, p. 29; Givón, 1979, pp. 22–43; Torres Cacoullos & Travis, 2019). One such junction occurs in Spanish-English bilingual clause combining. Although there is equivalence in the order of main and subordinate clause, subordinating conjunctions can be variably equivalent. It will be shown that bilinguals opt for the conjunction that is more frequent and predictable, considering their linguistic experience of both languages.

We begin with the Equivalence constraint (Poplack, 1980, p. 581). Of the many theories of CS, it has been tested in a corpus of bilingual speech, which is sorely needed for “establishing the ecological validity of experimental data” (Green, 2018, p. 1). Avoiding abstract features and theory-internal notions, this account does not assume that constraints on switching between two languages need be derivable from either formal syntactic principles of monolingual grammars (e.g., López et al., 2017, p. 1) or the syntactic principles of only one, the Matrix language (e.g., Myers-Scotton & Jake, 2009). The Equivalence constraint is concerned with boundaries between the languages. It states that CS is avoided at points of word placement incompatibility between the two languages, or conversely, that CS occurs at syntactic boundaries where analogous elements are placed the same way in both languages (Poplack, 1980, p. 586; 2015, p. 919; Sankoff, 1998, pp. 46–47). Evidence for the role of Equivalence in bilingual speech is syntactic priming across languages, or repeating parallel structures, which is favored when the word orders are homologous (e.g., Kootstra et al., 2010, p. 227).

Here, we view CS constraints as probabilistic (preferred CS sites) rather than categorical (permissible CS sites), and ask: of the places where bilinguals can switch in theory, where do they switch in usage? To answer this, we consider sites of variable equivalence, syntactic boundaries at which equivalence between languages is not consistent, due to internal variability within the languages. In a usage-based approach to bilingualism, explanations are to be found in analysis of both languages, since our experience with language shapes the cognitive representations for language and, thus, use shapes structure (e.g., Bybee, 2010). In a variationist approach, individual linguistic knowledge is mastery of the general variable patterns of the wider community (e.g., Labov, 2012). Blending the two, in a usage-based variationist approach, explanations are to be found in quantitative analysis of speakers’ choices in both languages used in the speech community (cf. Backus, 2021; Poplack, 2021). Thus, accounting for bilinguals’ CS preferences, the Variable Equivalence hypothesis brings linguistic variation and experience into CS constraints.

CS between clauses, where bilinguals combine a main clause in one language with a subordinate clause in the other, is illustrated in (1), with an adverbial clause, and (2), with a complement clause. We will refer to such instances as Spanish-English CS, though note that the switching is bidirectional, going from a Spanish main verb to an English subordinate verb (1), or the reverse, from an English main to a Spanish subordinate (2). In the translation on the right, italic and roman type represent stretches of speech originally produced in English and Spanish, respectively.¹

(1)	CS between clauses: Main-and-adverbial clause, Spanish to English
	y estaba uno que le decían ~Miguel Alto,	“and there was one they called ~Miguel Alto,
	because he must’ve been six nine,	because he must’ve been six nine,”
		[17, 12:15-12:19]
(2)	CS between clauses: Main-and-complement clause, English to Spanish
	so I told them,	“so I told them,
	que iban a salir en el Sun.	that they were going to come out in the Sun.”
		[22, 17:05-17:08]

At the junction of Spanish-English main-and-adverbial clauses (1) there is equivalence, in that the analogous conjunctions, porque and (be)cause, are present and placed the same way in both languages, as depicted in (3a) (e.g., Diessel, 2005, p. 466; Gras, 2023, p. 488). For the same language pair, however, with main-and-complement clause combining (2) there is variable equivalence. Spanish complementizer que is always present, while English complementizer that is most often absent, its use being conditioned by lexical and grammatical factors (e.g., Shank et al., 2016, pp. 202–213). These differing probabilities of the analogous options que and that within the languages make the main and complement clause junction a point of variable equivalence for CS between the languages. Compare (3b) with (3a).

(3a)	Equivalence: Main-and-adverbial clause
	VERB_{SPAN or ENG} + CONJ_{porque or (be)cause} + VERB_{SPAN or ENG}

(3b)	Variable Equivalence: Main-and-complement clause
	VERB_{SPAN or ENG} + CONJ_{que or that presence ~ that absence} + VERB_{SPAN or ENG}

The language of the complementizer has long been debated. In appeals to formal syntactic constructs, some have argued that the complementizer is governed by the main clause verb and will therefore be in the same language, while others have asserted the opposite, that CS is impossible between a complementizer and its complement (e.g., López et al., 2017, p. 5). In asymmetric models of CS, complementizers have been said to come from whichever of the two languages is designated as the Matrix Language, which provides the grammatical frame in CS and is usually identified as the language of the verbal morphology (Myers-Scotton & Jake, 2009, p. 352). Here, we reframe the question of complementizer language as one of CS sites. We reason that, if what matters for CS are language junctions, bilinguals should treat variable equivalence sites differently from equivalence sites. Comparing main-and-complement clauses with main-and-adverbial clauses, we thus ask, for Spanish-English clause combining:

Do bilinguals have a lower rate of CS between main and complement clauses?

Do bilinguals prefer the conjunction from one of the languages?

How do bilinguals choose the language of the conjunction?

Comparisons show that complement clauses have a lower CS rate, greater prosodic separation, and notable asymmetry in the language of the conjunction. Whereas for adverbial clauses conjunction language tends to follow CS direction—an English conjunction is more likely when CS goes from a Spanish main to an English subordinate clause—for complement clauses Spanish que is chosen regardless of CS direction. We offer the Variable Equivalence hypothesis for CS, which states that:

Where cross-language equivalence is not consistent due to within-language variability, bilinguals prefer CS with the equivalent option from one of the languages that is more frequent and predictable in their cumulative linguistic experience of both languages.

Speech community and corpus

Participants

Spontaneous bilingual speech, from members of a long-term, non-immigrant community, is recorded in the New Mexico Spanish-English Bilingual (NMSEB) corpus (Torres Cacoullos & Travis, 2018, Chapters 2 and 3). In northern New Mexico, Spanish has been spoken for over 400 years. English-speaking settlers arrived in the second half of the 19th century together with the railroad, following annexation of the territory to the United States. Today, New Mexican Spanish is endangered by shift to English and denigration in comparison with both immigrant varieties and textbook Spanish, which is taught as a foreign language. The sociolinguistic interviews (Labov, 1984) constituting the NMSEB corpus add up to 29 hours of speech, or about 300,000 words. The speakers (n = 40) have birth years from 1922 to 1993 and a range of occupations including mineworkers, ranchers, teachers, and service employees. The speaker sample is made up of bilinguals who use both languages regularly in their everyday interactions.

In minoritized communities, bilinguals’ self-reports may be influenced by the stigmatization of local speech varieties (see Torres Cacoullos & Travis, 2018, pp. 57–73 on measures of bilingualism). As an illustration, one participant responded to a questionnaire item that he learned English in school but remarked during the conversation that when he first attended school he didn’t “know” Spanish either (4). In addition, where use of both languages is commonplace, the L1-L2 distinction may be blurry and even the notion of language dominance may be moot, as indicated by another participant (5). Here, the notion of heritage, as opposed to native, speakers does not apply (cf. Otheguy, 2015, pp. 309–311). The strongest evidence of the bilingualism of the participants is the very usage of each of their languages. While the language of thinking may be unknown as implied in (5), we can reliably assess the language of actual speech production. Quantitative analysis shows no transfer from one language to the other (e.g., Barking et al., 2022). These bilinguals keep both languages intact, aligning with their respective monolingual benchmarks, for subject expression, tense-aspect-mood, and word order, among other linguistic variables (e.g., Benevento & Dietrich, 2015; Dumont & Wilson, 2016; LaCasse & Torres Cacoullos, 2023; Torres Cacoullos & Vélez-Avilés, 2023; Travis & Torres Cacoullos, 2020, pp. 141–142).

(4)	(re: language self-assessment)
	pues no sabíanos ni el inglés ni el mexicano.	“well we didn’t know either English or Spanish.
	porque,	because,
	los a- —	the a- —
	los —	the —
	los mexicanitos esos,	those Mexican guys,
	they —	they —
	.. hablaban muy diferente el mexicano que nosotros,	.. they spoke Spanish very different from us,”
	yeah. ((interviewer))	“yeah.”
	y nosotros hablábanos otra class de language sandwich <@ que le dicen @>,	“and we spoke another class of language sandwich <@ as they call it @>,
	@@@	@@@
	de Nuevo México.	of New Mexico.”
		[25, 05:07-05:17]
(5)	(re: language dominance)
	.. I think Ø it’s equal because,	“.. I think Ø it’s equal because,
	...(1.2) l- como te estaba diciendo el otro día,	...(1.2) l- like I was telling you the other day,
	.. that I don’t,	.. that I don’t,
	...(0.8) que —	...(0.8) that —
	a lot of people say that you think in Spanish.	a lot of people say that you think in Spanish.
	...(1.5) or you think in English.	...(1.5) or you think in English.
	and to me I don’t know what I’m thinking.	and to me I don’t know what I’m thinking.”
		[06, 24:46-25:00]

Prosodic structure of CS

Each line of transcription represents an Intonation Unit (IU) and ends in punctuation marking the transitional continuity between IUs, or the terminal pitch contour (Du Bois et al., 1993; see Appendix 1 for transcription conventions). In example (5) above, the first three IUs end with a continuing intonation contour, indicated by a comma, and the fifth has a final intonation contour, represented by a period. (The fourth IU is truncated [“—”], where the speaker breaks off before completing the prosodic contour.) A prosodic sentence is an IU or series of IUs, containing at least one finite verb, that ends in intonational completion (final “.” or appeal “?”) (Chafe, 1994, p. 139). For example in (5), the first five IUs make up a sentence, with three tokens of clause combining (not counting the que alone on a truncated IU [fourth line] nor the because [first line] with no clearly identifiable associated subordinate clause; see below on datasets).

Bilinguals tend to demarcate the two languages on the basis of IU boundaries and terminal pitch contours (Figure 1). CS is more likely at the boundary of IUs than within them. Furthermore, CS at an IU boundary is more likely following final than continuing intonation (cf. the distinction between inter-sentential and intra-sentential CS). This prosodic structure of multi-word CS indicates the IU as a processing unit of bilingual speech (cf. Chafe, 1994, p. 69; Torres Cacoullos & Travis, 2018, pp. 51–52 and references therein).

Figure 1.

IU-Boundary constraint: Bilinguals quantitatively prefer CS across Intonation Unit (IU) boundaries rather than within the same IU (Torres Cacoullos & Travis, 2018, p. 51).

Datasets

Clause combining data inclusions and exclusions

Clause combinations were extracted manually and exhaustively from all NMSEB transcripts. A total of 2199 tokens remain after applying criteria for inclusion. For main-and-complement clause combinations, the largest class of exclusions is English high-frequency first-person singular collocations, for example, I think, I guess, when they occur alone in their own IU, behaving more like discourse markers than main clauses (see Steuck, 2016, pp. 77–80, on extraction and exclusion protocols).

The main-and-adverbial data set contains causal conjunctions porque, because, or cause. Causal conjunctions are frequent (Gras, 2023, p. 481) and, like complementizers, positioned after the main verb (whereas si or if clauses tend to precede the main clause; Diessel, 2005, p. 454). We set aside adverbial clauses preceding the main clause (“guideposts” according to Chafe, 1984, p. 444; n = 32). Each instance of porque and (be)cause was classified by considering stretches of the surrounding discourse. The data set includes instances that are more subordinate, as in (6), and more “independent” (Diessel, 2005, p. 465), as in (7), where the content of the (be)cause clause (set apart by a 1.2-second pause and laughter) does not justify a main clause proposition, but rather the relevance of the speaker’s assertion in the context (recounting meeting her husband when they were still in school).

(6)	((re: leaving school to work))
	me corté y me fui pa’ ~Helena,	“I dropped out and went to ~Helena,
	porque mi gente estaba ... pobre,	because my people were ... poor,”
		[23, 0:13-0:16]
(7)	((re: first meeting her husband in school))
	I tease him that he,
	joined the Marines and gave me a chance to grow up,
	...(1.2) @@@@@@ yeah,
	cause he’s four years older than me.
		[21, 44:37-44:45]

Excluded, however, were a variety of cases in which the causal conjunction was not clearly associated with two clauses, which may result from the relatively loose syntactic connection between main and adverbial clauses, compared with the link between main and complement clauses (Croft, 1995, pp. 860–861; Diessel, 2005, p. 465). It is not uncommon to find a clause beginning with porque and (be)cause with no associated main clause (n = 200), as in (8). Sometimes the subordinating conjunction serves to co-construct clause combinations across two speakers (Gras, 2023, pp. 495–496), for example, as a question and answer (n = 26 of 31 co-constructions). Finally, excluded were cases in which there was no associated subordinate clause because of truncation, abandonment, or interruption (n = 70; cf. Croft, 1995, p. 840) and cases in which it is impossible to ascertain the absence of other-language material due to unclear speech in either clause (generally, three or more syllables of unclear speech; n = 22).

(8)	((re: an accident while playing))
	cause he’s all,
	do it do it.
	I was like,
	are you ready?
		[20, 42:55-42:57]

Organizing bilingual speech data

CS between clauses is defined here as clause combinations in which the verbs are in a different language, as in (9) (n = 189). Counted separately are cases where the verbs are in the same language but there is multi-word CS within one of the clauses (n = 115) or where mixing consists only of other-language single-word incorporations, mostly lone English nouns (n = 178), as in (10).² Unilingual benchmarks are all Spanish (n = 856) or all English (n = 861) instances of clause combining, where there is no other-language material in either of the clauses (11).

(9)	CS between clauses
	.. y luego quizás era chiple,	“.. and then maybe she was spoiled,
	because she was a —	because she was a —
	... a twin.	... a twin.”
		[04, 17:08-17:11]

(10)	Mixing within clauses
(10a)	Multi-word CS
	... y para nosotros it was a snap,	“... and for us it was a snap,
	.. cause we already knew,	.. cause we already knew,
	.. English,	.. English,
	you know.	you know.”
		[10, 1:23-1:26]
(10b)	Lone item
	...(1.0) pero cuando vinimos pa’trás no nos dejaban entrar a San Diego.	“...(1.0) but when we came back they didn’t let us enter San Diego.
	... San Diego Bay porque tiraba mucha radiation.	... San Diego Bay because it was giving off a lot of radiation.”
		[02, 32:40-32:48]

(11)	Unilingual
(11a)	se iba a ir pa ~Tomillo Texas.	“she was going to go to ~Tomillo Texas.
	...(0.9) porque allá era donde trabajaban.	...(0.9) because that’s where they worked.”
		[04, 1:06:56-1:06:59]
(11b)	they’re,
	.. holding some of my orders,
	.. because I haven’t paid that order.
		[27, 48:47-48:53]

CS rates and the prosodic demarcation of languages

Remember that the junction of main and complement clause is a site of Spanish-English variable equivalence, due to variability in the presence of English complementizer that, whereas for adverbial clauses equivalence is consistent, in that a conjunction is invariably present in both languages. A first question is whether CS is less frequent between main and complement clauses than between main and adverbial clauses.

We calculate rates of CS between clauses by taking the total number of clause combinations as the denominator—all eligible instances, where CS could have occurred. The distribution according to language is shown in Figure 2. Most are unilingual, and about evenly divided between Spanish and English, at 43% (488/1135) and 41% (472/1135), respectively, for main-and-complement clauses (left column), and 35% (368/1064) and 37% (389/1064), for main-and-adverbial clauses (right column). CS occurs at rates of 6% (62/1135) between main and complement clauses and 12% (127/1064) between main and adverbial clauses.
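The rate calculation described above can be sketched in a few lines. This is a minimal illustration using only the counts reported in the text; the dictionary layout and the names `counts`, `rate`, and `cs_rates` are assumptions introduced here, not part of the original analysis.

```python
# Counts reported in the text for each clause type.
# "total" is every eligible clause combination (the denominator),
# including combinations with mixing inside one of the clauses.
counts = {
    "complement": {"total": 1135, "spanish": 488, "english": 472, "cs": 62},
    "adverbial": {"total": 1064, "spanish": 368, "english": 389, "cs": 127},
}


def rate(numerator, denominator):
    """Proportion of eligible clause combinations falling in a category."""
    return numerator / denominator


# Rate of CS between clauses, per clause type.
cs_rates = {
    clause_type: rate(c["cs"], c["total"]) for clause_type, c in counts.items()
}

for clause_type, c in counts.items():
    print(
        f"{clause_type}: "
        f"unilingual Spanish {rate(c['spanish'], c['total']):.1%}, "
        f"unilingual English {rate(c['english'], c['total']):.1%}, "
        f"CS between clauses {cs_rates[clause_type]:.1%}"
    )
```

Values are printed to one decimal place, so they may differ slightly from the rounded percentages given in the text. The point of taking all eligible combinations as the denominator is that the two clause types become directly comparable despite their different totals.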

Figure 2.

Distribution of bilinguals’ clause combinations according to language, comparing main-and-complement with main-and-adverbial.

Not only is CS less frequent, but there is greater prosodic separation of switched main and complement clauses (Figure 3). To see this, consider first unilingual instances (second column in each pair). Adverbial clauses often occur in a separate prosodic sentence from their main clause, as in (11a) and (12) (40%, 300/757, of unilingual, coded by position of the verbs). When in the same sentence, just a small minority occur in the same Intonation Unit (IU) as the main clause (5%, 45/757). In contrast, a majority of complement clauses are in the same IU with their main clause (71%, 684/960). Similar percentages have been reported for English narratives (9% adverbial vs 81% complement in same IU as main, n = 229) (Croft, 1995, p. 861).

(12)	Main and adverbial clause in separate prosodic sentences
	and you can see a little bit of the gray,
	.. that I wasn’t able to,
	... take off.
	.. because it was a lead based paint.
		[28, 52:06-52:12]

Figure 3.

Prosodic relationships in clause combining. CS between clauses (first column in each pair) compared with unilingual benchmarks (second column); main-and-complement clauses (left pair) compared with main-and-adverbial clauses (right pair).

Remember now the prosodic IU-Boundary constraint, by which CS is generally less likely within an IU than at IU boundaries (see Figure 1). The IU-boundary CS constraint is seen here by comparing bilingual with unilingual clause combinations. The separation of the complement from the main clause is disproportionate in the bilingual instances. In a reversal of the unilingual tendency, when CS occurs between main and complement clauses, they tend not to occur in the same IU (same IU: 40%, 25/62, CS, vs 71%, 684/960, unilingual, p < .0001 by Fisher’s exact test) (Figure 3, left pair of columns). For main and adverbial clauses, while the proportion in the same IU (13a) is somewhat smaller for CS than for the unilingual benchmarks (2%, 2/127, vs 5%, 45/757, p = .0511), the proportion of same-IU instances is generally very low to begin with. In fact, close to one-half of adverbial clauses are in a separate prosodic sentence from their main clause, as in (13c), for both CS and unilingual instances (48%, 61/127, vs 40%, 300/757, p = .0796) (Figure 3, right pair of columns).

(13)	Prosodic relationships in clause combining with adverbial clauses: same IU, separate IUs-same sentence, separate sentences
(13a)	... it was so pretty porque entonces tenían color.	“... it was so pretty because then they had color.”
		[06, 1:08:37-1:08:39]
(13b)	well they would buy you the shoes a little bit big,	“well they would buy you the shoes a little bit big,
	porque te tenían que aguantar.	because they had to last you.”
		[22, 1:01:31-1:01:33]
(13c)	... en la mañana se van bien pa’ abajo.	“... in the morning they go way down.
	... cause in the morning the water’s cold.	... cause in the morning the water’s cold.”
		[13, 12:26-12:31]
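The comparisons of CS with unilingual benchmarks reported above rest on Fisher's exact test over 2x2 tables of counts. The following is a minimal sketch of a two-sided test written with the standard library only; the function name `fisher_exact_two_sided` and the table layout are assumptions made for illustration, and the original analysis may have used different software and a different two-sided convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that row 1 holds x of the col1 first-column tokens.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # The small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))


# Rows: CS between clauses, then unilingual benchmark.
# Columns: tokens in the named prosodic configuration, then all others.
tables = {
    "complement, same IU": (25, 62 - 25, 684, 960 - 684),
    "adverbial, same IU": (2, 127 - 2, 45, 757 - 45),
    "adverbial, separate sentence": (61, 127 - 61, 300, 757 - 300),
}

p_values = {
    label: fisher_exact_two_sided(*cells) for label, cells in tables.items()
}

for label, p in p_values.items():
    print(f"{label}: p = {p:.4g}")
```

Under these assumptions, only the complement-clause comparison yields a very small p-value, which is the asymmetry the prosodic analysis turns on.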

The answer to our first question, then, is that CS is less frequent between a main and complement clause than between a main and adverbial clause. In addition, when switching between main and complement clauses, bilinguals prosodically separate the two languages, with disproportionate placement of the clauses in separate IUs compared with unilingual benchmarks.

Language of conjunction: main-and-complement clause versus main-and-adverbial clause

For a direct test of the Variable Equivalence hypothesis we turn to the language of the conjunction, to address the pair of questions: Do bilinguals prefer the conjunction from one of the languages? and How do they make the choice? With CS between main and complement clauses, the conjunction is overwhelmingly Spanish que, as in (14a), at 87% (54/62) (Table 1).³ The rates of the English options, complementizer that (14b) and complementizer absence (14c), are just 5% (3/62) and 8% (5/62), respectively. Choice is more balanced for main-and-adverbial clause CS. Spanish conjunction porque, as in (13a), (13b) above, is at 60% (76/127) and English (be)cause, as in (1), (9), and (13c), at 40% (51/127).

(14a)	Main + CONJ_SPAN que + Complement
	... saben ustedes que that is my place,	“... you know that this is my place,”
		[23, 43:56-43:58]
(14b)	Main + CONJ_ENG that + Complement
	.. ella me dijo,	“.. she told me,
	.. that she’d rather go to ~Nancy’s.	.. that she’d rather go to ~Nancy’s.”
		[31, 21:53-21:55]
(14c)	Main + CONJ_ENG ∅ + Complement
	I guess Ø quería estar contenta porque,	“I guess Ø I wanted to feel content because,”
		[06, 37:29-37:31]

Table 1.

Mixed-effects logistic regression model predicting an English conjunction in Spanish-English bilingual clause combining.

	Estimate	Std. error	z value	p	n	%Eng conj
(Intercept)	0.09	0.33	0.29	.774	189	31%
Clause type (Reference level: Adverbial)					127	40%
Complement	−2.23	0.61	−3.65	<.001	62	13%
CS direction (Reference level: Spanish to English)					108	41%
English to Spanish	−1.51	0.45	−3.35	<.001	81	19%
Clause type × CS direction	1.27	0.95	1.34	.180

Note. Random intercept: speaker (n = 26; pooling speakers with just one adverbial or complement clause), variance = 0.80 (SD = 0.89). As the model predicts an English conjunction, negative coefficient estimates indicate that a factor increases the likelihood of a Spanish conjunction. Values of n and %Eng conj reported for full dataset (main effects).

The language of the conjunction in bilingual main-and-adverbial clauses largely aligns with the direction of CS. Spanish porque is preferred when switching from an English main to a Spanish adverbial clause, as in (13a) and (13b) (78%, 43/55), but not when CS goes in the opposite direction, as in (13c), where English (be)cause is more likely to be chosen (54%, 39/72) (Figure 4). The tendency for the conjunction to match the language of the adverbial rather than main clause (despite some asymmetry in favor of Spanish porque) corresponds to the prosodic position of the conjunction in unilingual speech. Only approximately 10% of causal conjunctions occur in the IU of the main clause (9%, 68/757), whereas two-thirds occur in the same IU as the subordinate clause verb (68%, 518/757) and close to one-fifth in their own IU (as in (4)) (17%, 126/757) (the balance is same-IU clause combinations (13a)).⁴

Figure 4.

Language of conjunction in bilingual clause combining according to clause type and CS direction.

Is, then, the patent asymmetry in favor of the Spanish complementizer (que) an accidental product of data distributions according to the directionality of CS, that is, is it simply that these bilinguals tend to switch from English to Spanish between main and complement clause? It turns out that CS is as bidirectional for complement clauses as it is for adverbial clauses, from English to Spanish (14c) and from Spanish to English (14a, 14b), respectively, at 42% (26/62) and 58%, compared to 43% (55/127) and 57% for adverbial clauses. Nevertheless, unlike adverbial conjunctions, complementizers are predominantly Spanish, regardless of CS direction. The rate of que is 88% (23/26) when switching to a Spanish complement clause and a virtually identical 86% (31/36) when switching to an English complement clause (Figure 4).
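To make the cell counts behind these comparisons explicit, the sketch below cross-tabulates clause type by CS direction and pools the cells along each dimension. The counts are derived from the figures in the text (for instance, 55 - 43 English conjunctions where porque accounts for 43 of 55); the names `cells` and `pooled` are hypothetical, and for complement clauses the English category is assumed to pool complementizer that with complementizer absence.

```python
# (clause type, CS direction) -> (English-conjunction tokens, all tokens)
cells = {
    ("adverbial", "spanish_to_english"): (39, 72),
    ("adverbial", "english_to_spanish"): (55 - 43, 55),
    ("complement", "spanish_to_english"): (36 - 31, 36),
    ("complement", "english_to_spanish"): (26 - 23, 26),
}


def pooled(index, level):
    """Pool the cells whose key matches `level` at position `index`.

    index 0 selects by clause type, index 1 by CS direction.
    Returns (English-conjunction tokens, all tokens).
    """
    english = sum(e for key, (e, n) in cells.items() if key[index] == level)
    total = sum(n for key, (e, n) in cells.items() if key[index] == level)
    return english, total


for (clause_type, direction), (english, total) in cells.items():
    print(f"{clause_type}, {direction}: {english}/{total} = {english / total:.0%}")

for index, level in [
    (0, "adverbial"),
    (0, "complement"),
    (1, "spanish_to_english"),
    (1, "english_to_spanish"),
]:
    english, total = pooled(index, level)
    print(f"{level}: {english}/{total} = {english / total:.0%}")
```

Reading the four cells side by side shows the pattern described in the text: the English-conjunction rate shifts with CS direction for adverbial clauses but stays low in both directions for complement clauses.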

Bilinguals’ different treatment of the two clause types is reflected in logistic regression analysis (using the glmer() function from the lme4 package in R; Bates et al., 2015; R Core Team, 2021), with language of the conjunction as the dependent variable and clause type and CS direction as independent variables. Both have a significant effect (Table 1). However, the effect of CS direction is due to the (numerically stronger) adverbial clauses, which differ between themselves according to CS direction as we have seen. While an interaction of CS direction with clause type is not significant (because all three cells differ from the reference level), complement clauses do not differ between themselves.⁵

In sum, we confirm that the subordinate clause types are treated differently, beyond the lower frequency of CS with complement clauses (preceding section). In answer to our questions, conjunction language choice in main-and-adverbial combinations is affected by CS direction, with the causal conjunction more likely in the language of the subordinate than that of the main clause. However, when combining main and complement clauses, bilinguals choose the Spanish conjunction que, regardless of CS direction.

Variable equivalence: choosing the more frequent and predictable option

The explanation for this preference for one of the complementizers lies in the usage of the two languages and the role of equivalence in combining them. Alternative, formal syntactic explanations must face the fact that there is no universal generalization that the complementizer must be in the language of either the main or the complement clause (e.g., López et al., 2017). On the other hand, positing a matrix language that supplies the complementizer (cf. Myers-Scotton & Jake, 2009, p. 355) confronts the fact that, whether this language is designated at the sentence or clause level and whether it is designated as that of the main or subordinate verb, these bilinguals select que at the same rate for Spanish and English verbs. Besides, a matrix language for main-and-complement clause combining would have to work differently for adverbial clauses.

How, then, do bilinguals choose the language of the complementizer? The Variable Equivalence hypothesis is that bilinguals deal with CS points at which inter-linguistic equivalence is not consistent due to intra-linguistic variability by choosing the more frequent and predictable option. We are able to gauge the frequency and predictability of the conjunction options by considering both of the bilinguals’ languages, thanks to the bilingual corpus.

Choosing the more frequent option

To contextualize conjunction choice in bilingual clause combining, we compare with the same bilinguals’ unilingual instances. Figure 5 shows the language of the conjunction in bilingual clause combining (left), unilingual English (middle) and unilingual Spanish (right), for each clause type. Consider first unilingual use. In main-and-complement clauses (top), complementizer que is invariably present, at 100%, in bilinguals’ Spanish, just as in monolingual varieties (cf. Mazzola et al., 2022, pp. 229–230). Its English analogue that, at approximately one-quarter (27%, 127/472), is by far the minority variant when compared to its alternative, complementizer absence, also as in monolingual varieties. Note, too, that the numbers are nearly even between the two languages (see also Figure 2).

Figure 5.

Language of conjunction in bilingual and unilingual clause combining, for complement clauses (top) and adverbial clauses (bottom).

Now, if the language of the complementizer in CS were proportional to unilingual complementizer use, we would expect about 50% que, 15% that, and 35% complementizer absence. By comparing CS with bilinguals’ unilingual instances, it is clear that complementizer absence, the non-equivalent option, is indeed quantitatively avoided (at a disproportional 8%). But que still overshadows its analogue that, by more than 15 to 1 (87% que, 5% that). The overuse of que in CS between clauses is confirmed by comparison with the language of the causal conjunction in bilingual main-and-adverbial clause combinations (Figure 5, bottom). This approximates the unilingual proportions (60% porque, 23% because, 17% cause, in CS, compared to 50%, 30%, 20%, respectively, in unilingual).
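The proportional baseline invoked above can be worked through as follows. This sketch assumes that "proportional to unilingual complementizer use" means pooling the Spanish and English unilingual main-and-complement tokens (488 + 472 = 960) and taking each option's share of that pool; the pooling and the names `unilingual`, `observed_cs`, and `shares` are assumptions for illustration. On that reading, the computed shares come out close to the approximate figures given in the text.

```python
# Unilingual main-and-complement tokens, pooled across both languages.
# Spanish que is always present; English splits into that and absence.
unilingual = {"que": 488, "that": 127, "absence": 472 - 127}

# Complementizer options observed in CS between main and complement clauses.
observed_cs = {"que": 54, "that": 3, "absence": 5}


def shares(counts):
    """Convert raw counts to proportions of their total."""
    total = sum(counts.values())
    return {option: n / total for option, n in counts.items()}


expected = shares(unilingual)
observed = shares(observed_cs)

# Ratio above 1 means the option is over-represented in CS relative to the
# pooled unilingual baseline; below 1 means it is under-represented.
ratio = {option: observed[option] / expected[option] for option in expected}

for option in expected:
    print(
        f"{option}: expected {expected[option]:.0%}, "
        f"observed {observed[option]:.0%}, ratio {ratio[option]:.2f}"
    )
```

The ratios summarize the argument: que is over-represented in CS, while both English options fall below their baseline shares.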

The Variable Equivalence hypothesis draws our attention to the fact that, when we consider bilinguals’ linguistic experience in both their languages, Spanish que is the more frequent complementizer option, across the two languages overall. Cumulatively, que would be the more entrenched, with greater representational strength (cf. Hakimov & Backus, 2021, p. 463).

Choosing the more predictable option

In unilingual use, English complementizer that is not only less frequent; its choice is also intricately conditioned by a set of linguistic factors (e.g., Shank et al., 2016, pp. 202–213; Torres Cacoullos & Walker, 2009, pp. 19–32). Across dialects, that presence is favored by frequent main clause verbs know, say, tell; intervening material such as adverbials between the clauses; main clause subjects other than I; and complement clause subjects that are lexical NPs (the form in which new referents are mostly introduced). That absence is highly favored by high-frequency first-person singular collocations, particularly I think and I guess. The same conditioning is observed in bilinguals’ unilingual English (LaCasse & Torres Cacoullos, 2023, pp. 20–23). In addition, bilinguals’ unilingual that rate, at 27%, is well within the monolingual range (10%–30%; e.g., Wulff et al., 2018, p. 105).

However, Spanish complementizer que is a “semantically vacuous” conjunction (Pérez Saldanya, 2023, p. 358) and is always used, automatically. This means that que is more predictable (accessible) than context-dependent, probabilistically-conditioned that. Therefore, que is a more likely bilingual choice than that at the juncture of the two languages. Returning to our question on complementizer language choice, the answer is given by the CS strategy for variable equivalence: opt for the variant from one of the languages that is more frequent and predictable considering both languages.
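One hypothetical way to put a number on the contrast in predictability is the Shannon entropy of each language's choice between an overt complementizer and its absence. This measure is our illustration, not one used in the study, which characterizes predictability through the linguistic conditioning of that. The sketch below uses only the overall unilingual rates reported in the text (100% que, 27% that) and ignores the conditioning factors; the function and variable names are assumptions.

```python
import math


def binary_entropy(p):
    """Shannon entropy, in bits, of a two-way choice made with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a categorical choice carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# Overall unilingual rates of an overt complementizer, as reported in the text.
que_entropy = binary_entropy(1.0)    # Spanish: que is always present
that_entropy = binary_entropy(0.27)  # English: that is present about 27% of the time

print(f"que:  {que_entropy:.3f} bits of uncertainty")
print(f"that: {that_entropy:.3f} bits of uncertainty")
```

On this rough measure, Spanish que involves no uncertainty at all, while the English that/absence alternation carries most of a full bit, which is consistent with treating que as the more predictable option.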

Conclusion

The Variable Equivalence hypothesis, that bilinguals prefer CS with equivalent variants that are more frequent and predictable, brings variability in actual language use into CS constraints, in line with a usage-based variationist approach. Such a probabilistic view of CS provides the explanation for why CS requires both proficiency in two languages—to be able to identify grammatical equivalence sites (Poplack, 1980, p. 615)—and experience with bilingual community practices—to develop adaptive strategies for variable equivalence.

Testing variable equivalence for Spanish-English CS between main and subordinate clauses revealed bilinguals’ different treatment of complement and adverbial clauses. With complement clauses, the rate of CS is lower, prosodic separation is greater and, most notably, conjunction choice is more asymmetrical. The same asymmetry appeared in an elicited oral production task, where subjects produced que more often than that whether the stimulus began in Spanish or English (Dussias, 2001, p. 33). Here, we have seen that, regardless of CS direction, Spanish que is overwhelmingly selected over English that, but not Spanish porque over English (be)cause.

The main-and-complement clause junction is a site of Spanish-English variable equivalence. Whereas in Spanish unilingual contexts bilinguals use complementizer que 100% of the time, in English unilingual contexts they have a choice between two options, that and complementizer absence. So, when we expand our scope to bilingual contexts, there are three options: que, that, and non-equivalent complementizer absence, which is avoided. Since their English production is variable while their Spanish production is invariable, que is by far the most frequent and predictable option in speakers’ combined inventory.

Usage also explains the modest disproportion in favor of porque over (be)cause, despite word placement equivalence. As there are two English alternatives, choice of the one Spanish option is more automatic. Ready access to porque may also be strengthened by the fact that it contains que. Indeed, que occurs as part of a number of other conjunctions (e.g., aunque “even though,” ya que “since,” mientras que “while”).

Preference for que in bilingual main-and-complement clauses would additionally be reinforced by its generalization as a marker of clause beginning in Spanish. Que is not only obligatory for complement clauses, it also begins main clause uses, such as in well-wishing, quotation, and other instances of insubordination (Gras, 2023, pp. 487–490). It would make sense that, because of its high type frequency due to the variety of its uses, que is not bound to particular main clause verbs and thus is readily joinable with an English verb (cf. Bybee, 2010, p. 95).

In sum, whereas in theory either que or that would fulfill equivalence, accounting for linguistic experience in both languages brings forth choice of the more frequent and predictable option, que. Moving forward, it will be profitable to contextualize CS by considering the same bilinguals’ unilingual language use. These comparisons will allow us, on the one hand, to discover quantitative preferences for CS at particular junctures and, on the other hand, to account for those preferences by cumulative linguistic experience in both languages.

Footnotes

Appendix 1

Symbol	Meaning	Symbol	Meaning
Carriage return	New Intonation Unit^a		
.	Final intonation contour	-	Truncated word
,	Continuing intonation contour	..	Short pause (0.2 seconds)
?	Appeal intonation contour	...	Medium pause (0.3–0.6 seconds)
—	Truncated intonation contour	...( )	Timed pause (0.7 seconds or longer)
~	Pseudonymized proper noun	@	One syllable of laughter
!	Booster; notably high pitch	<@ @>	Speech uttered while laughing

^a Where the IU does not fit on one line, the second line is indented.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

Support from the National Science Foundation (BCS-1019112/1019122, 1624966) is gratefully acknowledged.

ORCID iD

Rena Torres Cacoullos

Author biographies

Rena Torres Cacoullos is Liberal Arts Professor of Spanish and Linguistics at The Pennsylvania State University. She studies language contact through the lens of language-internal variation.

Dora LaCasse is an Assistant Professor of Spanish at the University of Montana. Her research focuses on language variation, change and contact, and draws on broader cross-linguistic patterns to further our understanding of Spanish morphosyntax.

References

Backus (2021). Usage-based approaches. In Adamou & Matras (Eds.), The Routledge handbook of language contact (pp. 110–126). Routledge.

Barking, Mos, & Backus (2022). Comparing forward and reverse transfer from Dutch to German. International Journal of Bilingualism, 26(4), 389–404.

Bates, Maechler, Bolker, & Walker (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48.

Benevento, N. M., & Dietrich, A. J. (2015). I think, therefore digo yo: Variable position of the 1sg subject pronoun in New Mexican Spanish-English code-switching. International Journal of Bilingualism, 19(4), 407–422.

Bresnan, Dingare, & Manning (2001). Soft constraints mirror hard constraints: Voice and person in English and Lummi. In Butt & Holloway King (Eds.), Proceedings of the LFG01 Conference (pp. 13–31). CSLI Publications.

Bybee (2010). Language, usage and cognition. Cambridge University Press.

Chafe (1984). How people use adverbial clauses. Berkeley Linguistics Society, 10, 437–449.

Chafe (1994). Discourse, consciousness and time: The flow and displacement of conscious experience in speaking and writing. University of Chicago Press.

Croft (1995). Intonation units and grammatical structure. Linguistics, 63, 839–882.

Deuchar (2005). Congruence and Welsh–English code-switching. Bilingualism: Language and Cognition, 8(3), 255–269.

Diessel (2005). Competing motivations for the ordering of main and adverbial clauses. Linguistics, 43(3), 449–470.

Du Bois, J. W., Schuetze-Coburn, Cumming, & Paolino (1993). Outline of discourse transcription. In Edwards, J. A., & Lampert, M. D. (Eds.), Talking data: Transcription and coding in discourse research (pp. 45–89). Lawrence Erlbaum.

Dumont & Wilson, D. V. (2016). Using the variationist comparative method to examine the role of language contact in synthetic and periphrastic verbs in Spanish. Spanish in Context, 13(3), 394–419.

Dussias, P. E. (2001). Psycholinguistic complexity in codeswitching. International Journal of Bilingualism, 5, 87–110.

Givón (1979). On understanding grammar. Academic Press.

Gras (2023). Las conjunciones. In Rojo, Vázquez Rozas, & Torres Cacoullos (Eds.), Sintaxis del español / The Routledge handbook of Spanish syntax (pp. 483–497). Routledge.

Green (2018). Language control and code-switching. Languages, 3(8), 1–16.

Hakimov & Backus (2021). Usage-based contact linguistics: Effects of frequency and similarity in language contact. Journal of Language Contact, 13, 459–481.

Kootstra, G. J., Van Hell, J. G., & Dijkstra (2010). Syntactic alignment and shared word order in code-switched sentence production: Evidence from bilingual monologue and dialogue. Journal of Memory and Language, 63(2), 210–231.

Labov (1984). Field methods of the project on linguistic change and variation. In Baugh & Sherzer (Eds.), Language in use: Readings in sociolinguistics (pp. 28–53). Prentice Hall.

Labov (2012). What is to be learned. The community as the focus of social cognition. Review of Cognitive Linguistics, 10, 265–293.

LaCasse, D., & Torres Cacoullos (2023). Simplification in bilinguals’ parallel structures? Spanish and English main-and-complement clauses. In Waltermire & Bove (Eds.), Mutual Influence in Situations of Spanish Language Contact in the Americas (pp. 7–28). Routledge.

Lipski (1978). Code-switching and the problem of bilingual competence. In Paradis (Ed.), Aspects of bilingualism (pp. 250–264). Hornbeam Press.

López, Alexiadou, & Veenstra (2017). Code-switching by phase. Languages, 2(3), 1–17.

Mazzola, Cornillie, & Rosemeyer (2022). Asyndetic complementation and referential integration in Spanish. A diachronic probabilistic grammar account. Journal of Historical Linguistics, 12(2), 194–240.

Muysken (2000). Bilingual speech: A typology of code-mixing. Cambridge University Press.

Myers-Scotton & Jake (2009). A universal model of code-switching and bilingual language processing and production. In Bullock & Toribio (Eds.), The Cambridge handbook of linguistic code-switching (pp. 336–357). Cambridge University Press.

Otheguy (2015). The linguistic competence of second-generation bilinguals: A critique of “incomplete acquisition.” In Tortora, den Dikken, Montoya, & O’Neill (Eds.), Romance Linguistics 2013: Selected papers from the 43rd Linguistic Symposium on Romance Languages (LSRL) (pp. 301–319). John Benjamins.

Pérez Saldanya (2023). Modo y modalidad. In Rojo, Vázquez Rozas, & Torres Cacoullos (Eds.), Sintaxis del español / The Routledge handbook of Spanish syntax (pp. 354–368). Routledge.

Poplack (1980). “Sometimes I’ll start a sentence in Spanish y termino en español”: Toward a typology of code-switching. Linguistics, 18, 581–618.

Poplack (2015). Code switching: Linguistic. In Wright (Ed.), International Encyclopedia of the social & behavioral sciences (2nd ed., Vol. 3, pp. 918–925). Elsevier.

Poplack (2021). A variationist perspective on language contact. In Adamou & Matras (Eds.), The Routledge handbook of language contact (pp. 46–62). Routledge.

R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/

Sankoff (1998). A formal production-based explanation of the facts of code-switching. Bilingualism: Language and Cognition, 1(1), 39–50.

Shank, Plevoets, & Van Bogaert (2016). A multifactorial analysis of that/zero alternation: The diachronic development of the zero complementiser with think, guess and understand. In Yoon & Gries, S. T. (Eds.), Corpus-based approaches to construction grammar (pp. 201–240). John Benjamins.

Steuck (2016). Exploring the syntax-semantics-prosody interface. In Cuza, Czerwionka, & Olson (Eds.), Inquiries in Hispanic linguistics: From theory to empirical evidence (pp. 73–94). John Benjamins.

Torres Cacoullos (2020). Code-switching strategies: Prosody and syntax. Frontiers in Psychology, 11, Article 2130.

Torres Cacoullos & Travis, C. E. (2018). Bilingualism in the community: Code-switching and grammars in contact. Cambridge University Press.

Torres Cacoullos & Travis, C. E. (2019). Variationist typology: Shared probabilistic constraints across (non-)null subject languages. Linguistics, 57, 653–692.

Torres Cacoullos & Vélez-Avilés (2023). Mixing adjectives: A variable equivalence hypothesis for bilingual word order conflicts. Linguistic Approaches to Bilingualism.

Torres Cacoullos & Walker (2009). On the persistence of grammar in discourse formulas: A variationist study of that. Linguistics, 47, 1–43.

Travis, C. E., & Torres Cacoullos (2020). The role of pragmatics in shaping linguistic structures. In Félix-Brasdefer, J. C., & Koike (Eds.), The Routledge handbook of Spanish pragmatics (pp. 129–147). Routledge.

Trawick (2022). Revisiting the concept of “triggering” of code-switching [Doctoral dissertation]. The Pennsylvania State University ETDA.

Wulff, Gries, S. T., & Lester (2018). Optional that in complementation by German and Spanish learners. In Tyler, Huang, & Jan (Eds.), What is applied cognitive linguistics? Answers from current SLA research (pp. 99–120). Mouton.

Bilingual clause combining: A Variable Equivalence hypothesis for conjunction choice
