
Abstract

Norwegian allows filler-gap dependencies into embedded questions, which are islands for filler-gap dependency formation in English. We ask whether there is evidence that Norwegian learners of English transfer the functional structure that permits island violations from their first language (L1) to their second language (L2). In two acceptability judgment studies, we find that Norwegians are more likely to accept ‘island-violating’ filler-gap dependencies in L2 English if the corresponding filler-gap dependency is acceptable in Norwegian: Norwegian learners variably accept English sentences with dependencies into embedded questions, but not into subject phrases. These results are consistent with models that permit transfer of abstract functional structure. Norwegians are still less likely to accept filler-gap dependencies into English embedded questions than Norwegian embedded questions. We interpret the latter finding as evidence that, despite transfer, Norwegian speakers may partially restructure their L2 English analysis. We discuss how indirect positive evidence may play a role in helping learners restructure.

Keywords

filler-gap dependencies, Full Transfer, indirect evidence, island effects, Norwegian

I Introduction

This article addresses first language (L1) transfer in the acquisition of filler-gap dependencies in adult second language (L2) acquisition. We ask whether Norwegian learners of English transfer acceptable filler-gap dependencies from their L1 Norwegian to their L2 English, including dependencies that are unacceptable (and therefore unattested) in English. We also consider whether and how Norwegians might learn that English is more restrictive than Norwegian.

Norwegian and English allow long-distance filler-gap dependencies into embedded declarative clauses. For example, the relative clause (RC) head the signals / signalene can be interpreted as either the direct object (1a, 2a) or subject (1b, 2b) of an embedded verb.

(1) a. Those were the signals_i that the sailors said [(that) folks could understand ____i ].

b. Those were the signals_i that the sailors said [ ____i meant danger].

(2) a. Det var signal-ene_i som sjømenn-ene sa [(at) folk kunne forstå ____i ].

That was signal-def.pl that seamen-def.pl said that folks could understand

‘Those were the signals that the sailors said that folks could understand.’

b. Det var signal-ene_i som sjømenn-ene sa [(at) ___i betydde fare].

That was signal-def.pl that seamen-def.pl said that meant danger

‘Those were the signals that the seamen said meant danger.’

Norwegian and English differ, however, in subtle ways. Embedded questions are islands in English in that they block filler-gap dependency formation (Chomsky, 1977; Sprouse et al., 2012). Attempting to associate the filler the signals with the embedded verbs in (3) results in unacceptability. In Norwegian, embedded questions are not islands (Maling and Zaenen, 1982): it is acceptable to associate the filler signalene with the embedded gaps in (4).

(3) a. * Those were the signals_i that the sailors knew [who could understand ____i ].

b. * Those were the signals_i that the sailors knew [what ___i meant].

(4) a. Det var signal-ene_i som sjømennene visste [hvem som kunne forstå ____i ].

That was signal-def.pl that seamen-def.pl knew who expl could understand

‘Those were the signals that the sailors knew who could understand.’

b. Det var signal-ene_i som sjømennene visste [hva ___i betydde].

That was signal-def.pl that seamen-def.pl knew what meant

‘Those were the signals that the sailors knew what meant.’

~ ‘Those were the signals that the sailors knew the meaning of.’

In the present study, we investigate if the acceptability of sentences like (4b) leads native Norwegian speakers to accept sentences like (3b) in their L2 English.

We expect Norwegians to accept sentences like (3b) if they inappropriately transfer to L2 English those features of their L1 grammar that render embedded questions non-islands. What could such features be? Under many generative syntactic analyses ‘long-distance’ movement out of an embedded clause as in (1) and (2) requires successive-cyclic movement through the left-periphery of the embedded clause (e.g. Chomsky, 1977, 2000). In languages like English, this movement uses the specifier of the complementizer phrase (henceforth spec,CP) as an intermediate landing site. Ordinary declarative clauses are not islands, because spec,CP is empty, allowing the moved element to transit through. Embedded questions are islands because spec,CP is already occupied by the wh-phrase – who/what in examples (3a) and (3b) – so an intermediate stop-over is blocked.

Cross-linguistic differences in the islandhood of embedded questions are assumed to reflect parametric variation in the functional structure of the left-periphery of the clause (e.g. Reinhart, 1983; Rizzi, 1982).¹ For the sake of concreteness we make use of a specific proposal made for Mainland Scandinavian languages like Norwegian: Recent work argues that such languages have multiple specifiers in the complementizer domain that would permit successive-cyclic movement through the edge of an embedded question (e.g. Lindahl, 2017; Kush et al., 2018, 2019; Vikner et al., 2017). The relevant specifiers are generated by an extra functional head (e.g. the head c under Vikner and colleagues’ proposal).

Under this analysis, Norwegians would treat English embedded questions as non-islands if they transfer the extra functional structure of their L1 complementizer domain to English. As we discuss below, whether such transfer is possible is a point of disagreement between models of L2 acquisition. In our investigation, we address three inter-related theoretical questions at the intersection of second-language influence and learnability:

To what extent does L1 functional structure transfer to L2?

Are L1 features transferred to L2 in a conservative fashion?

How do L2 learners restructure after erroneous L1 transfer?

We discuss each question in turn.

1 What can transfer?

Cases of L1–L2 transfer are well documented. For example, learners often produce or accept L1 word order patterns that are ungrammatical in L2 (Ayoun, 1999; Rankin, 2012; Trahey and White, 1993; Westergaard, 2003; White, 1991). Such instances suggest that L2 learners use some aspects of their L1 (as a starting point) to analyse their L2, but exactly what transfers is a matter of considerable debate.

Models of L2/Ln acquisition disagree on the degree to which L1 functional structure transfers (for review, see Rothman et al., 2019). The Minimal Trees approach of Vainikka and Young-Scholten (1994, 1996, 2006) admits no transfer of functional projections from L1 to L2, positing that learners transfer only lexical projections (VPs) during early acquisition. Higher-level functional projections (e.g. CP) are assumed to emerge later in development via the interaction of L2 input and principles of Universal Grammar (UG) without using L1 functional heads as templates. Most other models of transfer assume that functional projections from L1 transfer to L2, though they differ as to what this entails. Eubank’s (1993) Weak Transfer hypothesis holds that functional heads from L1 transfer along with their parameter settings (e.g. basic directionality), but L1-specific lexical feature-values associated with those heads do not transfer. Full Transfer models contend that L1 functional heads, their parameter settings, and their associated feature values serve as the initial interlanguage template for L2 development (e.g. Schwartz and Sprouse, 1994, 1996).² As far as island-insensitivity in L1 Norwegian can be attributed to the presence of extra functional structure, observing comparable island insensitivity in L2 English would constitute evidence for transfer of that functional structure.

2 Conservativity and transfer

Language learners often encounter input that is compatible with two (or more) analyses differing in generative capacity: a restrictive analysis that closely fits the observed data and another more powerful analysis that generates both the observed data and additional unattested sentences. In such cases the strings generated by the first analysis represent a subset of the strings generated by the more powerful analysis (with respect to a given phenomenon).³ When learners must choose between the two analyses, they face a version of the classic subset–superset problem (e.g. Berwick, 1985; Wexler and Manzini, 1987; White, 1989a, 1989b; for cases in L2, see Judy and Rothman, 2010; Yuan, 1997): should they choose the more, or less, restrictive analysis? What if they choose the less restrictive analysis and it turns out to be incorrect? In that case, rejecting the superset analysis may be difficult since strings consistent with the subset analysis are equally consistent with the superset analysis.

Similar learnability considerations apply in L2 acquisition, where the problem is aggravated by the possibility of transfer: If the learner’s L1 supports the superset analysis and the analysis is transferred to L2, the result is an overly permissive L2 grammar that generates both acceptable and unacceptable L2 forms. The case of Norwegian is arguably such an instance: if Norwegian learners transfer their L1 functional structure to their analysis of L2 English filler-gap dependencies, they would be able to generate acceptable long-distance dependencies in English, but also island-violating dependencies that should be unacceptable in English.

Prior work in L1 acquisition has argued that learners can avoid erroneous overgeneralization by adopting conservative learning strategies that prefer restrictive analyses (e.g. Snyder, 2007; Westergaard, 2014).⁴ In principle, it is possible that transfer is also conservative: L2 learners could eschew transferring features that would potentially over-generate or avoid transferring typologically marked structures (e.g. Mazurkewich, 1984) without direct evidence for those features. Previous research has shown that L2 learners are less conservative than L1 learners, but these studies have not directly considered the role that transfer might play in these situations (e.g. Anderssen et al., 2018; Clahsen and Muysken, 1986; White, 1989b). If Norwegians treat embedded questions as non-islands in L2 English, this would constitute evidence against conservative transfer.

3 Retraction and restructuring after transfer

L2 learners can undo transfer of an L1 feature (i.e. restructure) based on positive evidence of conflict between L2 input data and L1 analyses. Many models assume that direct positive evidence of conflict with the L1 analysis of a phenomenon, P, is required for restructuring the L2 analysis of P (see, e.g. Schwartz and Sprouse, 1996): metalinguistic-type negative evidence (e.g. correction) is believed not to be useful for prompting underlying grammatical restructuring (Schwartz, 1993; White, 2003). For some basic phenomena like determining head-directionality the relevant evidence is in abundance, so restructuring should happen quickly. As the similarity between or number of (surface) forms predicted by L1 and L2 increases, however, the possibility of direct conflict diminishes: the relevant positive data are scarce, if they exist at all. In the absence of (enough) conflicting data, inappropriately transferred features are expected to persist late into acquisition or become fossilized (see, amongst others, Franceschina, 2005; Hawkins et al., 1993; Judy and Rothman, 2010; Lardiere, 2007; Schwartz and Sprouse, 1996). As such, instances where L2 surface forms are a subset of acceptable L1 forms represent paradigm cases where ‘persistent’ or fossilized transfer should obtain. Given that (most) acceptable English filler-gap dependencies are compatible with a transferred Norwegian analysis, we predict that Norwegians are unlikely to have restructured their L2 English grammars if transfer has occurred.

4 Past work on learnability of islands/movement

Before proceeding to our experiments, we briefly consider past work that investigated islands in L2 acquisition to highlight the difference from our research questions. Most prior studies were framed as tests of access to principles of Universal Grammar (UG) during L2 acquisition rather than transfer.

Some earlier experiments explored whether L1 speakers of languages without overt wh-movement accept island-violating wh-movement in English (Johnson and Newport, 1991; Li, 1998; Martohardjono, 1993; White and Genesee, 1996; White and Juffs, 1998; Wolfe Quintero, 1992). For example, as part of a larger study, Martohardjono (1993) had L1 Chinese and L1 Indonesian participants judge sentences with long-distance wh-dependencies in their L2 English. Test sentences contained wh-dependencies into five types of constituents that are islands in English: embedded questions (wh-islands), RCs, complex NPs, adjunct clauses, and sentential subjects. Martohardjono found that participants in both groups correctly rejected island violations on a non-trivial portion of trials,⁵ which was taken as evidence for access to UG constraints on wh-movement during L2 acquisition (for similar conclusions, see Li, 1998; White and Juffs, 1998).

Other experiments have tested whether learners accept island-violating L2 filler-gap dependencies that correspond to unacceptable dependencies in their L1. Martohardjono (1993) again provides an example. Martohardjono asked L1 Italian participants to rate the same English sentences as the native Chinese and Indonesian participants in the experiment above. In Italian, wh-movement from all five of the constituents is unacceptable, just as in English (Rizzi, 1982; Sprouse et al., 2016). Martohardjono found that Italian participants rejected the test sentences at rates comparable to L1 English natives.⁶ These results demonstrate that participants do not allow island-violating dependencies in their L2 if those dependencies are unacceptable in L1, an empirical conclusion that is also supported by the growing body of research on the real-time processing of islands in L2 (Felser et al., 2012; Kim et al., 2015; Omaki and Schulz, 2012).

The results above do not directly address the limits of transfer because they are in principle compatible with transfer either having or not having occurred. If speakers of non-wh-movement languages initially transferred their L1 analysis of wh-dependencies to L2 English, observing overt movement dependencies would prompt them to restructure and generate a new analysis for the observed forms in the L2 input. If transfer did not occur, they would similarly base an analysis of English wh-dependencies on input forms. Judgments of English dependencies, then, would be based on their input-driven analyses. In the case of Italian, if participants conservatively learn the distribution of acceptable English dependencies from the L2 input alone or transfer their L1 analysis, they should reject island-violating wh-dependencies all the same.

Unlike prior experiments, our work tests whether transfer occurs by testing cases where the dependencies allowed by L1 constitute a larger set than is allowed in L2. If transfer occurs, we expect that ‘unlearning’ the L1 analysis should prove difficult because there is arguably little to no direct evidence that would contradict the transferred analysis. As such, we expect the transferred analysis to persist and to affect participant judgments despite otherwise high proficiency in the L2 and significant time knowing the L2 well. In particular, we expect participants to accept unacceptable L2 forms generable under the L1 analysis. We tested whether L2 speakers of English make such errors with two acceptability judgment studies. To preview our main results, we find evidence for the predicted non-conservative transfer from L1 Norwegian to L2 English. However, we also find evidence that suggests some degree of restructuring: Norwegians do not uniformly treat embedded questions as non-islands in English as they do in Norwegian. We consider the implications of these facts in the General Discussion.

II Experiments

We ran two acceptability judgment studies that tested Norwegian speakers’ intuitions about the acceptability of relative clause dependencies in configurations like (3b) and (4b) in both English and Norwegian. We henceforth refer to such examples as Wh-Trace Configurations to highlight two characteristics of the constructions: (1) the islands in question are embedded questions (wh-islands) and (2) the filler is associated with a subject gap/trace immediately adjacent to the embedded wh-word. Both aspects of the constructions are presumed to result in unacceptability in English: (1) because of an island violation, and (2) because it is unacceptable in (most dialects of) English to have a gap next to an overt element in the complementizer domain (so-called Comp-trace effects; Chomsky and Lasnik, 1977; Perlmutter, 1971). As both experiments had the same design, we present information about the materials, procedure, and analysis before discussing the specifics of each experiment.

1 Materials and design

Both experiments employed the factorial definition of island effects developed by Sprouse (2007) and used in many recent studies of island-sensitivity cross-linguistically (e.g. Kush et al., 2018, 2019; Sprouse et al., 2011, 2012, 2016). The standard 2×2 factorial design for island effects crosses the factors Structure and Distance. All test sentences contain a filler-gap dependency. Distance controls the length of the dependency, while Structure controls the presence or absence of an island configuration. Example (5) illustrates the design with an item for testing Wh-Trace configurations.

(5) Sample Wh-Trace Item (English Conditions):

The sailors . . .

a. found someone that __ knew [ that the signal meant danger ]. Short | No-Island

b. saw the signal that they knew [ ___ meant danger ]. Long | No-Island

c. found someone that __ knew [what the signal meant ]. Short | Island

d. saw the signal that they knew [what ___ meant ]. Long | Island

In (5) the filler-gap dependency is a relative clause dependency. Distance determined whether the head of the RC (either someone or the signal) was linked to the highest subject position in the RC (Short) or to the embedded subject position (Long). Structure manipulated whether the most deeply embedded clause inside the relative clause was declarative (No-Island) or an embedded question (Island). The Long–Island condition corresponds to the only sentence with an ‘island violation’. According to the logic of the factorial design, an island effect is defined as a Structure × Distance interaction: it reflects the residual unacceptability associated with a structure like (5d) once the independent effects that long-distance extraction and structural complexity have on acceptability have been accounted for.
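The interaction logic can be written out as a difference of differences. The following is a sketch only: the symbol \bar{r} for the mean rating in a condition and the names D_Long and D_Short are our shorthand, not notation taken from the source.

```latex
\begin{align*}
D_{\text{Long}}  &= \bar{r}_{\text{Long, No-Island}} - \bar{r}_{\text{Long, Island}} \\
D_{\text{Short}} &= \bar{r}_{\text{Short, No-Island}} - \bar{r}_{\text{Short, Island}} \\
DD &= D_{\text{Long}} - D_{\text{Short}}
\end{align*}
```

On this formulation, D_Short estimates the cost of the island structure alone, and D_Long estimates that cost plus whatever extra penalty arises when the dependency crosses into the structure. A DD near zero indicates that the two simple costs add up without remainder (no island effect), while a positive DD indicates residual unacceptability in the Long–Island condition.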

We crossed the standard 2×2 manipulation above with an additional factor: Language. Norwegian counterparts for all English items were created, (6), resulting in a 2×2×2 design.

(6) Sample Wh-Trace Item (Norwegian Conditions):

Sjømennene . . . Sailors.def.pl

a. fant noen som __ visste [at signal-et betydde fare].

found someone that __ knew that signal-def meant danger Short | No-Island

b. så signal-et som de visste [at __ betydde fare].

saw signal-def that they knew that __ meant danger Long | No-Island

c. fant noen som __ visste [hva signal-et betydde].

found someone that __ knew what signal-def meant Short | Island

d. så signalet som de visste [hva __ betydde].

saw signal-def that they knew what ___ meant Long | Island

In addition to Wh-Trace items, we tested sensitivity to another island type: Subject islands. The islandhood of subject phrases is determined by different syntactic constraints (e.g. the Condition on Extraction Domains of Huang, 1982) than the embedded questions. As a result, the extra functional structure that permits Wh-Trace island violations should have no effect on the islandhood of subjects. Thus, subjects should be islands in both Norwegian and English. This prediction has been verified by previous studies using the factorial design (Kush et al., 2018, 2019; Sprouse et al., 2011, 2016).

We adapted materials from Kush et al. (2018, 2019) to test the acceptability of RC-dependencies into subject islands. The design crossed Structure × Distance × Language, yielding 8 conditions as exemplified below.

(7) Sample Subject Island Item (English Conditions):

The judge . . .

a. met the lawyer that __ hoped that the report would confirm the suspicions. Short | No-Island

b. read the report that the lawyer hoped would __ confirm the suspicions. Long | No-Island

c. met the lawyer that __ hoped that the information in the report would confirm the suspicions. Short | Island

d. read the report that the lawyer hoped that [the information in __] would confirm the suspicions. Long | Island

(8) Sample Subject Island Item (Norwegian Conditions):

Dommeren. . .

a. møtte advokaten som __ håpet at rapporten ville bekrefte mistankene. Short | No-Island

b. leste rapporten som advokaten håpet at __ ville bekrefte mistankene. Long | No-Island

c. møtte advokaten som __ håpet at opplysningene i rapporten ville bekrefte mistankene. Short | Island

d. leste rapporten som advokaten håpet at [opplysningene i __ ] ville bekrefte mistankene. Long | Island

Subject island judgments provide an independent baseline of island-sensitivity that is not expected to be affected by the hypothesized transfer of functional structure.

2 Procedure

Test items were distributed across lists according to a Latin Square design and intermixed among filler sentences. The experiment was hosted on IbexFarm (Drummond, 2012). Participants participated on their own personal computers. Sentences were presented one at a time. Participants rated their acceptability on a 7-point scale. All participants rated English items first before judging a Norwegian block to minimize L1 interference. Instructions were presented in English and participants received a break between English and Norwegian blocks.
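As an illustration of how a Latin Square distribution works, the sketch below rotates conditions across items so that each list contains every item exactly once and every condition equally often. The function name, the condition labels, and the choice of 16 items crossed with 8 conditions (Language × Distance × Structure) are assumptions for illustration; the source does not describe how its lists were generated.

```python
from collections import Counter
from itertools import product


def latin_square_lists(n_items, conditions):
    """Build one presentation list per rotation of the condition order.

    In list k, item i appears in condition (i + k) mod len(conditions).
    Each list therefore contains every item once, and across all lists
    every item appears once in every condition.
    """
    n_cond = len(conditions)
    return [
        [(item, conditions[(item + k) % n_cond]) for item in range(n_items)]
        for k in range(n_cond)
    ]


# Hypothetical labels: 2 languages x 2 distances x 2 structures = 8 conditions.
CONDITIONS = [
    f"{language}|{distance}|{structure}"
    for language, distance, structure in product(
        ("English", "Norwegian"), ("Short", "Long"), ("No-Island", "Island")
    )
]

lists = latin_square_lists(16, CONDITIONS)

# How many tokens of each condition does a participant on the first list see?
tokens_per_condition = Counter(condition for _, condition in lists[0])
```

With 16 items and 8 conditions, each list yields 2 tokens per condition, so no participant sees the same item in more than one condition.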

3 Analysis

Raw ratings were z-score transformed before analysis. We z-scored ratings by participant and language tested. Z-scoring by-participant helps to control for biases in how individual participants used the 7-point scale. Z-scoring by-language for each participant helps control for the fact that participants may use the scale differently in their L1 and L2 (Sorace, 1996; Spinner and Gass, 2019).
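A minimal sketch of this transformation follows, assuming each trial is stored as a dictionary with hypothetical keys `participant`, `language` and `rating`. The use of the sample standard deviation is also an assumption; the source does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_by_participant_and_language(trials):
    """Return copies of the trials with a 'z' field added.

    Ratings are standardized separately within each
    (participant, language) group, so a participant's English and
    Norwegian ratings are each centred on their own mean.
    Each group needs at least two ratings for stdev() to be defined.
    """
    groups = defaultdict(list)
    for trial in trials:
        groups[(trial["participant"], trial["language"])].append(trial["rating"])

    # Assumption: sample standard deviation (n - 1 denominator).
    stats = {key: (mean(vals), stdev(vals)) for key, vals in groups.items()}

    scored = []
    for trial in trials:
        centre, spread = stats[(trial["participant"], trial["language"])]
        # A participant who gives the same rating throughout has no spread.
        z = (trial["rating"] - centre) / spread if spread > 0 else 0.0
        scored.append({**trial, "z": z})
    return scored


# Made-up ratings on a 7-point scale, for illustration only.
example = [
    {"participant": "p1", "language": "English", "rating": 1},
    {"participant": "p1", "language": "English", "rating": 4},
    {"participant": "p1", "language": "English", "rating": 7},
    {"participant": "p1", "language": "Norwegian", "rating": 2},
    {"participant": "p1", "language": "Norwegian", "rating": 6},
]
scored = zscore_by_participant_and_language(example)
```

Because the grouping key includes the language, a rating of 4 can map to different z-scores in the two blocks if the participant used the scale differently in L1 and L2.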

Z-scores were analysed with linear mixed effects models implemented using the lme4 (Bates et al., 2015) and lmerTest (Kuznetsova et al., 2017) packages in R (R Core Team, 2013). All models included fixed effects of Structure, Distance and their interaction and random intercepts for both subject and item. When appropriate we also included fixed effects of Island Type and Language. We included by-subject random slopes for Structure, Distance and their interaction when such models converged. When the more complex model did not converge, we simplified the random effects structure. Details of individual models are presented in the tables of statistical results below. P-values were computed using likelihood ratio tests. Effect size was measured as a Difference-in-differences (DD) score (Maxwell and Delaney, 2003). DD scores were calculated by-participant.
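The by-participant DD calculation can be sketched as follows. This assumes the common formulation in which the cost of the island structure at Short distance is subtracted from its cost at Long distance; the source does not spell out the formula, and the dictionary keys and condition labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def dd_scores(trials):
    """Compute one difference-in-differences score per participant.

    `trials` is a list of dicts with keys 'participant', 'distance'
    ('Short' or 'Long'), 'structure' ('No-Island' or 'Island') and 'z'.
    Every participant must have at least one rating in each of the four
    cells; otherwise mean() raises StatisticsError.
    """
    cells = defaultdict(list)
    for trial in trials:
        key = (trial["participant"], trial["distance"], trial["structure"])
        cells[key].append(trial["z"])

    def cell_mean(participant, distance, structure):
        return mean(cells[(participant, distance, structure)])

    scores = {}
    for participant in sorted({key[0] for key in cells}):
        d_long = cell_mean(participant, "Long", "No-Island") - cell_mean(
            participant, "Long", "Island"
        )
        d_short = cell_mean(participant, "Short", "No-Island") - cell_mean(
            participant, "Short", "Island"
        )
        # Positive values indicate a super-additive penalty in Long|Island.
        scores[participant] = d_long - d_short
    return scores


# Made-up z-scores for one participant, two tokens per condition.
example = [
    {"participant": "p1", "distance": d, "structure": s, "z": z}
    for d, s, zs in [
        ("Short", "No-Island", (1.0, 0.8)),
        ("Long", "No-Island", (0.6, 0.4)),
        ("Short", "Island", (0.7, 0.5)),
        ("Long", "Island", (-1.0, -0.8)),
    ]
    for z in zs
]
dd = dd_scores(example)
```

In the made-up data, the island structure costs 0.3 z-units at Short distance and 1.4 at Long distance, so the participant's DD score is about 1.1.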

III Experiment 1

1 Materials

Sixteen items of 8 conditions apiece were created for each island type following the Distance × Structure × Language design. The 32 test items (16 Wh-Trace, 16 Subject Island) were interspersed among 76 filler sentences (38 English, 38 Norwegian). Each set of language-specific filler sentences contained 22 unacceptable and 16 acceptable fillers varying in length and complexity.

2 Experiment 1a: Native English controls

Thirty-one native English volunteers recruited as control participants via social media (mean age = 38.0, SD = 11.9, 17 female; 27 from the United States) judged sentences in the English block of the experiment on Ibex Farm. One participant was excluded from analysis for having multiple response times < 500 ms. Because participants rated only the English sentences, participants rated 4 tokens per condition per island.

3 Results

Average acceptability judgments by condition are found in Figure 1. A summary of statistical analysis is found in Table 1. Native English speakers rated RC dependencies into both subject phrases and embedded Wh-Trace constructions much lower than RC dependencies into non-islands. There were clear island effects (Distance × Structure p < .001). The sizes of Subject and Wh-Trace island effects (DDs = 1.55, 1.69, respectively) did not differ significantly, as evidenced by the absence of a three-way Distance × Structure × Island Type interaction.

Figure 1.

Average z-scored acceptability judgments from native English control participants in the Subject island (left panel) and Wh-Trace island (right panel) sub-experiments.

Table 1.

Summary of statistical analysis of native English Control judgments from Experiment 1a.

Effect	Beta (SD)	t	p
Distance	1.29 (0.09)	14.291	< .001
Structure	1.63 (0.09)	18.404	< .001
Island	–0.233 (0.13)	–1.835	0.071
Distance × Structure	–1.55 (0.12)	–13.351	< .001
Distance × Island	0.230 (0.12)	1.962	0.050
Structure × Island	0.0027 (0.12)	0.023	0.981
Distance × Structure × Island	–0.0540 (0.16)	–0.329	0.742

Notes. Significant effects are in bold face. Model: zscore ~ Distance * Structure * Island + (1 + Distance + Structure | subject) + (1|item).

We also inspected by-participant ratings of the Subject and Wh-Trace Long–Island sentences to check for inter-trial consistency. Native English participants rejected Wh-Trace Long–Island sentences nearly uniformly. Twenty-eight of 30 participants rejected 4 out of 4 Wh-Trace Long–Island tokens, judging all below z = 0. The two remaining participants rated a single Wh-Trace Long–Island token above z = 0, but rejected the remaining 3 tokens.

Judgments of Subject Long–Island sentences showed slightly more variability. Fourteen participants rejected all four tokens that they rated. Twelve participants rejected three of four tokens. Three participants exhibited more variability: two of the three rejected only two of four Subject Long–Island sentences and one participant rejected only one of four. Overall, however, participants rejected RC-dependencies into subject phrases on the clear majority of trials.
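The by-participant consistency check described above amounts to counting, for each participant, how many Long–Island tokens received a z-score below zero, and then tallying participants by that count. A minimal sketch, assuming hypothetical `participant` and `z` keys and made-up data:

```python
from collections import Counter


def rejection_tally(trials, threshold=0.0):
    """Tally participants by the number of tokens they rejected.

    A token counts as rejected when its z-score falls below `threshold`.
    Returns a Counter mapping 'number of rejected tokens' to
    'number of participants with that count'.
    """
    rejected = Counter()
    participants = set()
    for trial in trials:
        participants.add(trial["participant"])
        if trial["z"] < threshold:
            rejected[trial["participant"]] += 1
    # Counter returns 0 for participants who rejected nothing.
    return Counter(rejected[p] for p in participants)


# Made-up Long-Island z-scores: four tokens for each of three participants.
example = [
    {"participant": p, "z": z}
    for p, zs in [
        ("p1", (-1.2, -0.9, -1.0, -0.4)),
        ("p2", (-1.1, 0.3, -0.8, -0.6)),
        ("p3", (-0.7, -1.3, -0.2, -1.0)),
    ]
    for z in zs
]
tally = rejection_tally(example)
```

Here two of the made-up participants reject all four tokens and one rejects three of four, which mirrors the way the counts are reported in the text.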

4 Experiment 1b: Norwegian L1, English L2

a Participants

Twenty-seven native speakers of Norwegian took part (16 female). Two participants’ data were excluded because the participants reported exposure to English during infancy. Participants were students enrolled in the English Studies program at the Norwegian University of Science and Technology (NTNU) either at the bachelor’s or master’s level. Norwegian university students are assumed to have a proficiency in English at least commensurate to the Common European Framework of Reference for Languages (CEFR) at B2 level, as this is the minimum standard for enrollment for foreign students (see, for example, Samordna opptak, 2020). Participants filled out a short survey on their language background and their English exposure. An overview of responses to this survey is in Table 2.

Table 2.

Demographic information for Norwegian participants in Experiment 1b.

Measure	Mean (SD)	Median	Range
Age	23.1 (4.3)	22	19–39
Age when began learning English	6.56 (1.4)	6	5–10
English spoken (hours per week)	1.62 (1.7)	1	0–6
English media (hours per week)	5.14 (2.4)	4.5	1.5–12
English proficiency (self-reported, 7-point scale)	5.6 (0.8)	6	4–7

Unlike the English control participants, Norwegian participants rated 8 items per island in each language (2 tokens per condition per language). Average judgments by language and island type are plotted in Figure 2.

Figure 2.

Average z-scored acceptability judgments from Norwegian participants in Experiment 1.

We first report the results of the omnibus Distance × Structure × Island Type × Language analysis before planned analyses of each island type in isolation; see Table 3. Sentences with long-distance RC dependencies were rated lower on average than sentences with short RC dependencies (p < .001), as were sentences with island structures (p < .001). However, these main effects were qualified by several higher-order interactions. We focus on the two three-way interactions. The significant interaction of Distance × Structure × Language (p = .0385) reflects the fact that Norwegians exhibited larger average Distance × Structure island effects in English than in Norwegian. The significant Distance × Structure × Island Type interaction (p < .001) indicates that Norwegians exhibited larger Distance × Structure island effects for Subject island sentences than for Wh-Trace sentences. Although we did not observe a significant four-way interaction, we conducted planned comparisons of the three-way interaction of Distance × Structure × Language for each island type separately.

Table 3.

Omnibus statistical analysis of judgments from Experiment 1b.

Effect	Beta (SD)	t	p
Distance	0.630 (0.07)	9.024	< .001
Structure	0.743 (0.07)	10.664	< .001
Island Type	0.160 (0.11)	1.474	0.151
Language	–0.010 (0.07)	–0.142	0.887
Distance × Structure	–1.313 (0.14)	–9.428	< .001
Distance × Island Type	–0.181 (0.10)	–1.836	0.067
Structure × Island Type	–0.692 (0.10)	–7.030	< .001
Distance × Language	0.180 (0.14)	1.296	0.195
Structure × Language	–0.134 (0.14)	–0.965	0.335
Island Type × Language	0.279 (0.10)	2.837	0.005
Distance × Structure × Island Type	0.970 (0.20)	4.925	< .001
Distance × Structure × Language	0.576 (0.28)	2.073	0.038
Distance × Island Type × Language	–0.230 (0.20)	–1.170	0.242
Structure × Island Type × Language	–0.170 (0.20)	–0.864	0.388
Distance × Structure × Island Type × Language	0.223 (0.39)	0.569	0.569

Notes. Significant effects are in bold face. Model: zscore ~ Distance * Structure * Island * Language + (1|subject) + (1|item).

b Subject islands

A statistical summary is given in Table 4. The size of the Subject island effect was significantly larger in English (DD = 1.69) than in Norwegian (DD = 1.01), as indicated by a Distance × Structure × Language interaction (p = .016). Resolving this interaction confirmed that significant, sizable subject island effects were present both in English and Norwegian (ts = ‒10.05, ‒6.16, respectively; ps < .001).

Table 4.

Statistical analysis of judgments of the subject island items from Experiment 1b.

Effect	Beta (SD)	t	p
Distance	0.627 (0.06)	10.241	< .001
Structure	0.737 (0.09)	9.003	< .001
Language	–0.010 (0.06)	–0.155	0.877
Distance × Structure	–1.321 (0.12)	–11.107	< .001
Distance × Language	0.180 (0.12)	1.521	0.129
Structure × Language	–0.134 (0.13)	–1.030	0.308
Distance × Structure × Language	0.576 (0.24)	2.426	0.016

Notes. Significant effects are in bold face. Model: zscore ~ Distance * Structure * Language + (1 + Distance + Structure + Language |subject) + (1|item).

c Wh-Trace Islands

A statistical summary can be found in Table 5. Again, the three-way Distance × Structure × Language interaction was significant (p < .01). Resolving the three-way interaction revealed that although there was a significant Wh-Trace island effect in English (t = ‒4.10, p < .001; DD = 0.38), there was not a significant Wh-Trace island effect in Norwegian (DD = ‒.01). Visual inspection of Figure 2 confirms the absence of even a trend towards an interaction in Norwegian.

Table 5.

Statistical analysis of judgments of the wh-trace island items from Experiment 1b.

Effect	Beta (SD)	t	p
Distance	0.441 (0.08)	5.333	< .001
Structure	0.051 (0.08)	0.661	0.510
Language	0.266 (0.07)	3.556	< .001
Distance × Structure	–0.338 (0.15)	–2.252	0.025
Distance × Language	–0.048 (0.15)	–0.317	0.751
Structure × Language	–0.302 (0.17)	–1.732	0.095
Distance × Structure × Language	0.797 (0.30)	2.661	0.008

Notes. Significant effects are in bold face. Model: zscore ~ Distance * Structure * Language + (1 + Distance + Structure + Language |subject) + (1|item).

The Wh-Trace island effect observed in English is smaller in magnitude than the subject island effects in either language, while the average rating of the English Wh-Trace Long–Island sentence is considerably higher (roughly ‒0.25) than the English Subject Long–Island sentence (roughly ‒0.80). Following Kush et al. (2018, 2019), we investigated whether the smaller effect reflected inconsistent judgments across trials. Figure 3 plots the distribution of z-scores in both Long conditions for each island–language combination. Response consistency is reflected in the degree to which judgments in a condition follow a unimodal distribution. Inconsistent judgments manifest as bimodal or uniform distributions. In each of the island–language pairs, the Long–NoIsland condition provides a baseline level of consistency against which to judge the responses in the Long–Island conditions. The extent of the overlap between the Long–NoIsland and Long–Island judgments provides a rough way of approximating the extent to which the RC-dependencies into islands were perceived as run-of-the-mill long-distance dependencies.

Figure 3.

Distribution of judgments in Long–NoIsland and Long–Island conditions for each island and language pair in Experiment 1b.

Beginning with subject island sentences, we observe relatively little overlap between the ratings for Long–Island and Long–NoIsland sentences. Judgments of Subject Long–NoIsland sentences cluster unimodally around the higher end of the scale (z = +1), with a thicker left tail. Judgments of the Subject Long–Island condition, by contrast, cluster at the opposite end of the scale (z = ‒1) and exhibit less of a right skew.

Judgments of Long conditions in the Subject island sub-experiment provide a template for consistent judgment. Judgments in the Wh-Trace sub-experiments clearly do not conform to that template. Judgments in the Norwegian Wh-Trace Long–Island condition are consistent with general acceptability: z-scores are unimodally distributed about the high end of the scale with a fat left tail. The pattern of responses indicates that participants perceived the test sentences as unobjectionable on most trials. Bimodality in the corresponding Long–NoIsland condition suggests that participants were less consistent in judging those sentences. Turning to the English Wh-Trace sub-experiment we see bimodality in both Long conditions, though the larger mode falls on opposite ends of the range between conditions. Norwegian participants tended to accept Long–NoIsland sentences more often than reject them, but there were still a number of trials where they judged the sentences to be unacceptable. Most relevant to our purposes, the Norwegian participants often rejected Long–Island sentences, but there was a non-negligible number of trials on which they accepted structures that native English speakers reject.

5 Individual differences

Analysis of the rating distributions shows that there was inter-trial inconsistency in the ratings of English Wh-Trace Long–Island sentences, but it does not establish whether the cause was inter- or intra-participant inconsistency. To ascertain whether individual participants were inconsistent, we plotted each participant’s maximum judgment against their minimum judgment for each island–language combination (see Kush et al., 2019). In Figure 4, each dot corresponds to an individual participant.

Figure 4.

Plots of by-participant minimum and maximum judgments for each island–language pair in Experiment 1b.

For the purposes of the analysis we adopt a crude definition of ‘acceptance’ and ‘rejection’: we treat all judgments that fall below z = 0 as rejections and all judgments that are above z = 0 as acceptances. Using this coarse categorization technique permits identification of three participant response types: Participants that rejected both tokens of an island type occupy quadrant 3 (bottom left). Those that accepted both tokens occupy quadrant 1 (top right). Those that occupy quadrant 4 (top left) rated island tokens inconsistently, accepting one and rejecting the other.
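The coarse categorization described above can be sketched as follows. The function name `response_type` is hypothetical, the quadrant numbers follow the labels used in the text, and the handling of a judgment of exactly z = 0 is an assumption, since the text only defines values strictly below and strictly above zero.

```python
def response_type(judgments, threshold=0.0):
    """Classify one participant from their island-token z-scores.

    Returns a (label, quadrant) pair, using the quadrant numbering
    adopted in the text for the minimum-by-maximum plots.
    """
    lowest, highest = min(judgments), max(judgments)
    if highest < threshold:
        # Even the best-rated token was rejected.
        return ("consistent rejecter", 3)
    if lowest > threshold:
        # Even the worst-rated token was accepted.
        return ("consistent accepter", 1)
    # Mixed ratings. Assumption: a rating of exactly 0 also lands here.
    return ("inconsistent rater", 4)


# Invented judgments for three hypothetical participants.
print(response_type([-0.9, -0.4]))
print(response_type([0.3, 1.1]))
print(response_type([-0.6, 0.8]))
```

Because only the minimum and maximum matter, the same function applies unchanged when a participant judges more than two tokens per condition.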

A few participants accepted one or both Norwegian subject island tokens, but subject island judgments otherwise exemplify consistent rejection: In Panels 1 and 2 of Figure 4 most participants fall into quadrant 3. Judgments of the Norwegian Wh-Trace sentences show a different pattern: all participants fell into quadrant 1 (consistent accepters) or quadrant 4 (inconsistent raters). Judgments of English Wh-Trace islands show more variability. Eleven participants consistently rejected Wh-Trace islands in English and 3 consistently accepted the constructions. The remaining 11 participants rated the sentences inconsistently. This level of inter- and intra-participant inconsistency stands in contrast to the relative uniformity of the same participants’ judgments of English Subject island tokens.

Given the differences in participant response patterns for the English Wh-Trace items, we conducted an exploratory analysis of whether individual variability correlated with self-reported proficiency, weekly hours of English spoken, or English media consumption. We used participants’ English Wh-Trace island DD score as the dependent measure of island sensitivity. A positive correlation between DD score and individual measure would be expected on the assumption that increased exposure or proficiency made participants behave more like native English speakers. Hours of spoken English did not correlate with DD score (|t| < 1), nor did self-reported English proficiency (|t| < 1). There was a small, but significant negative correlation between DD score and English media consumption (t = ‒2.036, p < .05; adjusted R² = .116). As Figure 5 shows, this correlation indicates – counter-intuitively – that participants who consumed more English media showed reduced sensitivity to English Wh-Trace island effects.
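The t statistic and adjusted R² reported for media consumption can be checked against each other. The sketch below assumes a simple regression with a single predictor and n = 25 participants; that n is inferred from the 11 + 3 + 11 participants counted in the individual-differences analysis and is not stated alongside the regression. The function name is hypothetical.

```python
def adjusted_r2_from_t(t, n):
    """Adjusted R-squared of a one-predictor regression, from its slope t.

    Uses R^2 = t^2 / (t^2 + df) with df = n - 2, followed by the usual
    adjustment 1 - (1 - R^2) * (n - 1) / (n - 2).
    """
    df = n - 2
    r2 = t ** 2 / (t ** 2 + df)
    return 1 - (1 - r2) * (n - 1) / df


# Reported slope statistic; n = 25 is an assumption (see lead-in).
print(round(adjusted_r2_from_t(-2.036, 25), 3))  # 0.116
```

Under these assumptions the two reported values agree, which also indicates how little of the variance in DD scores the media measure accounts for.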

Figure 5.

Correlation between participant Wh-Trace DD scores and self-reported hours of English media exposure in Experiment 1b.

6 Discussion

As expected, English participants rejected RC-dependencies into subject phrases. Norwegian participants also rejected subject island violations in their L1 Norwegian and L2 English. These results are expected, if subjects are islands in both languages and the extra functional structure that allows filler-gap dependencies into embedded questions does not amnesty subject island violations.

Participants diverged in their judgments of Wh-Trace items. Native English speakers exhibited large Wh-Trace island effects, rejecting RC-dependencies in Wh-Trace configurations. We failed to find a Wh-Trace island effect in Norwegian. Norwegian participants generally accepted RC-dependencies into Wh-Trace configurations in their L1 as readily as RC-dependencies into declarative complement clauses. Interestingly, we found a significant island effect with English Wh-Trace constructions, indicating that Norwegians rated RC-dependencies in Wh-Trace constructions less acceptable on average than RC-dependencies into non-island declarative complement clauses. However, the Wh-Trace island effect was smaller than subject island effects, because Norwegian participants rated English Wh-Trace islands inconsistently: participant ratings were a mix of ‘accept’ and ‘reject’ trials. We defer further discussion and interpretation of this finding to the General Discussion.

The number of trials where Norwegian participants accepted English Wh-Trace violations provides suggestive support for transfer from L1 Norwegian to L2 English. However, the experiment was relatively low-powered, with only two observations of the relevant configuration per participant. We wished to test if our findings would replicate, and whether participants would provide more consistent judgments of Wh-Trace island violations in English if given more trials. Therefore, we ran Experiment 2, in which we doubled the number of observations per participant. We also increased our sample size and drew from a wider pool.

IV Experiment 2

1 Participants

Forty-nine native speakers of Norwegian took part in Experiment 2 (29 female). Like the participants in Experiment 1, participants in Experiment 2 were enrolled as bachelor’s and master’s students in a Norwegian university. Unlike the previous participants, participants in Experiment 2 were enrolled in a wide range of degree programs, not only English. All these courses of study presuppose that students have studied English from upper secondary school and have achieved minimum proficiency at CEFR B2 level. Participants provided the same information as in Experiment 1. Table 6 provides an overview of descriptive statistics.

Table 6.

Demographic information for Norwegian participants in Experiment 2.

	Mean (SD)	Median	Range
Age	23.3 (3.9)	23	19–39
Age when began learning English	6.58 (1.7)	6	5–10
English spoken (hours per week)	1.35 (1.4)	1	0–6
English media (hours per week)	4.9 (2.6)	4	1.5–12
English proficiency (self-reported)	5.7 (0.7)	6	4–7

2 Materials

Participants rated the same items as in Experiment 1 plus 16 new Wh-Trace items. As a result, participants judged 4 tokens per condition per language in the Wh-Trace sub-experiment instead of 2 as in Experiment 1.

3 Results

Participants’ average judgments by island type and language are plotted in Figure 6. A summary of the omnibus statistical analysis can be found in Table 7. As in the analysis of Experiment 1, we focus only on the highest-order interaction effects.

Figure 6.

Average z-scored acceptability judgments from Norwegian participants in Experiment 2. Rows correspond to the island judged and columns correspond to the language of presentation.

Table 7.

Omnibus statistical analysis from Experiment 2.

	Beta (SD)	t	p
Distance	0.559 (0.03)	18.986	< .000
Structure	0.390 (0.03)	13.349	< .000
Island Type	0.262 (0.07)	3.685	< .001
Language	0.176 (0.03)	5.984	< .000
Distance × Structure	–0.720 (0.06)	–12.305	< .000
Distance × Island Type	–0.453 (0.06)	–7.695	< .000
Structure × Island Type	–0.632 (0.06)	–10.795	< .000
Distance × Language	0.167 (0.06)	2.862	.004
Structure × Language	–0.272 (0.06)	–4.634	< .000
Island Type × Language	0.092 (0.06)	1.575	0.115
Distance × Structure × Island	0.946 (0.12)	8.082	< .000
Distance × Structure × Language	0.627 (0.12)	5.360	< .000
Distance × Island × Language	–0.051 (0.12)	–0.436	0.663
Structure × Island × Language	–0.150 (0.12)	–1.278	0.201
Distance × Structure × Island × Language	0.066 (0.23)	0.283	0.778

Notes. Significant effects are in bold face. Model: zscore ~ Distance * Structure * Island * Language + (1|subject) + (1|item).

There was a significant Distance × Structure × Island Type interaction (p < .000): Subject island effects were, on average, larger than Wh-Trace island effects, irrespective of language. The significant Distance × Structure × Language interaction reflects that when collapsing across Subject and Wh-Trace island sentences participants exhibited numerically smaller average island effects in their judgment of Norwegian test items than with English test items. We proceed to the planned comparisons, focusing on each island type separately.

a Subject islands

A statistical summary is in Table 8. As in Experiment 1, Long sentences were rated lower on average than Short sentences (p < .000) and Island sentences were rated lower than NoIsland sentences (p < .000) collapsing across languages. The three-way Distance × Structure × Language interaction was again significant (p < .001). The interaction was driven by the fact that the Norwegian Subject Island effect was numerically smaller than the English effect. However, subject island effects were large in both English and Norwegian (DDs = 1.47, 0.89, respectively) and significant (Distance × Structure: ts = ‒12.54, ‒8.01, respectively; ps < .000).

Table 8.

Statistical analysis of judgments of the subject island items from Experiment 2.

	Beta (SD)	t	p
Distance	0.790 (0.05)	16.076	< .000
Structure	0.712 (0.06)	12.892	< .000
Language	0.136 (0.04)	3.160	.002
Distance × Structure	–1.179 (0.08)	–14.370	< .000
Distance × Language	0.196 (0.08)	2.385	.017
Structure × Language	–0.194 (0.08)	–2.363	.018
Distance × Structure × Language	0.588 (0.16)	3.583	< .001

Notes. Significant effects in bold face. Model: zscore ~ Distance * Structure * Language + (1 + Distance + Structure + Language |subject) + (1|item).

b Wh-Trace islands

Table 9 presents a statistical summary. Long conditions were rated significantly lower on average than Short conditions (p < .000) and Island conditions lower than NoIsland conditions (p < .05). The Distance × Structure × Language interaction was significant (p < .001). Figure 6 makes evident that the interaction is because Norwegians exhibited an average island effect for English sentences (DD = 0.568), but not for Norwegian sentences (DD = ‒0.069). Follow-up analyses verified that there was a significant Wh-Trace island effect in English (t = ‒4.83, p < .001), but not in Norwegian (t < 1).

Table 9.

Statistical analysis of judgments of the wh-trace island items from Experiment 2.

	Beta (SD)	t	p
Distance	0.332 (0.04)	7.995	< .000
Structure	0.075 (0.04)	2.135	0.034
Language	0.221 (0.04)	5.810	< .000
Distance × Structure	–0.247 (0.07)	–3.499	0.001
Distance × Language	0.142 (0.07)	2.034	0.042
Structure × Language	–0.349 (0.07)	–4.988	< .000
Distance × Structure × Language	0.662 (0.14)	4.753	< .000

Notes. Significant effects in bold face. Model: zscore ~ Distance * Structure * Language + (1 + Distance * Structure + Language |subject) + (1|item).

We again examined the distribution of ratings in Long conditions for all island–language pairs. Distributions are plotted in Figure 7. Subject Long–Island and Long–NoIsland distributions are roughly bimodal with modes at opposite ends of the rating scale. However, in the Norwegian Wh-Trace sentences, the distribution of ratings for Long–Island sentences is essentially indistinguishable from Long–NoIsland sentences: participants accepted the majority of test sentences in both conditions. English Wh-Trace judgments diverged from the ratings of their Norwegian counterpart sentences. Participants generally accepted Long–NoIsland sentences, but judgments of Long–Island sentences are bimodally distributed. The group as a whole appears to accept and reject English Wh-Trace sentences with near equal frequency.

Figure 7.

Distribution of judgments in Long–NoIsland and Long–Island conditions for each island and language pair in Experiment 2.

Figure 8 plots individuals’ minimum and maximum ratings for each island–language pair, to visualize rating consistency. Participants consistently rejected Subject Island violations in English and were generally consistent in their judgment of Norwegian Subject Island violations, as evidenced by the clustering in quadrant 3 in panels 1 and 2 of Figure 8.

Figure 8.

Plots of by-participant minimum and maximum judgments for each island–language pair in Experiment 2.

Panel 3 shows that every participant accepted at least one Norwegian Wh-Trace island token – most participants fall into quadrant 4 – while many accepted all four tokens. Panel 4 indicates that almost all participants accepted at least one English Wh-Trace island token.

Figure 8 only provides information about the range of individual participants’ judgments. We were also interested in how many of the 4 Long–Island Wh-Trace tokens each participant accepted. Therefore, we binned participants by how many tokens they rated above 0. The result is in Table 10. Forty of 49 participants accepted 3 or 4 Norwegian Wh-Trace island tokens and none rejected all 4 tokens. Judgments of English Wh-Trace tokens showed less consistency: fewer participants accepted most of the items (18 of 49). Five participants consistently rejected Long–Island tokens. Nevertheless, Norwegian participants clearly displayed a different response pattern than Native English speakers in Experiment 1a, where all participants either uniformly rejected the Wh-Trace island tokens, or rejected 3 of 4.

Table 10.

Participants binned by the number of wh-trace island violation items they rated above the midpoint of the scale.

Number of tokens rated z > 0	Norwegian wh-trace items	English wh-trace items
0	0	5
1	1	14
2	8	12
3	23	10
4	17	8
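The binning behind Table 10, and the totals quoted in the text, can be reproduced with a short sketch. The bin counts are copied from the table; the helper names `count_accepted` and `summarize` are hypothetical, and the overall acceptance proportion is derived here from the bins and is not a figure reported in the text.

```python
def count_accepted(z_scores):
    """Number of Long-Island tokens a participant rated above z = 0."""
    return sum(1 for z in z_scores if z > 0)


def summarize(bins, tokens_per_participant=4):
    """Return (participants, accepted 3 or 4, accepted none, proportion accepted)."""
    n = sum(bins.values())
    most = bins[3] + bins[4]
    accepted_tokens = sum(k * count for k, count in bins.items())
    proportion = accepted_tokens / (tokens_per_participant * n)
    return n, most, bins[0], proportion


# Counts from Table 10: tokens accepted -> number of participants.
norwegian = {0: 0, 1: 1, 2: 8, 3: 23, 4: 17}
english = {0: 5, 1: 14, 2: 12, 3: 10, 4: 8}

for label, bins in (("Norwegian", norwegian), ("English", english)):
    n, most, none, proportion = summarize(bins)
    print(label, n, most, none, round(proportion, 3))
```

The derived proportions, about 0.79 of Norwegian tokens and about 0.51 of English tokens rated above zero, are in line with the observation that the group accepted and rejected English Wh-Trace sentences with near equal frequency.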

One question that Table 10 leaves unaddressed is how strongly participants’ judgments in the Norwegian Wh-Trace experiment correlate with their judgments in English. We addressed this question in two follow-up analyses. First, we plotted participants’ Wh-Trace DD scores in Norwegian against their DD scores in English, to determine whether there was a correlation between island effect size. This plot is in Figure 9. Second, we looked for a correlation between individual participants’ probability of accepting a Wh-Trace island violation in Norwegian and English. The correlation plot is provided in Figure 10.

Figure 9.

Correlation between individual participants’ Wh-Trace DD scores in Norwegian and English in Experiment 2.

Figure 10.

Correlation between individual participants’ probability of accepting a Wh-Trace island violation in Norwegian and English.

There was no reliable correlation between Norwegian and English Wh-Trace DD scores. As Figure 9 makes apparent, there were many participants who exhibited no island effects in Norwegian (z ⩽ 0), but nevertheless had a positive Wh-Trace DD score in English. Figure 10 shows a numeric trend such that participants who accepted a high proportion of Wh-Trace island violations in Norwegian were slightly more likely to accept Wh-Trace island violations in English, though this correlation was not significant (Adjusted R² = .015; t = 1.32). The correlation was weakened by the group of 19 participants who readily accepted Wh-Trace island violations in Norwegian (> 50%), but were less likely to do so in English. Importantly, all but five of these participants still accepted Wh-Trace violations in English. Finally, we checked whether any of the three individual-level variables correlate with a participant’s Wh-Trace DD score. None of the measures correlated with DD score (ts < 1).

V General discussion

Embedded questions are islands for filler-gap dependency creation in English, but not in Norwegian. The difference between the two languages has been linked to extra functional structure in the left-periphery of the Norwegian clause (Kush et al., 2018, 2019; Vikner et al., 2017). We were interested in determining whether Norwegians transfer this extra functional structure from their L1 to their L2, English. We reasoned that if Norwegians transfer the functional structure to English, they should erroneously treat embedded questions as non-islands in English. Insofar as the set of acceptable Norwegian filler-gap dependencies represents a superset of the acceptable English filler-gap dependencies, acquiring the appropriate generalization in English should prove difficult if transfer has occurred. The difficulty reflects the fact that there is arguably little if any direct evidence to counter-exemplify the less restrictive hypothesis (White, 1989a).

To test whether such transfer occurs, we tested whether adult Norwegian speakers accept filler-gap dependencies into embedded questions (wh-islands) in English. Our results provide evidence of transfer from L1 Norwegian to L2 English: Participants accept filler-gap dependencies into wh-islands in English even though they have never encountered those structures in their English input. Importantly, participants do not accept all island-violating filler-gap dependencies in L2 English, as evidenced by participants’ consistent rejection of subject-island-violating filler-gap dependencies. The fact that subject island violations were consistently rejected militates against an interpretation that attributes Norwegian participants’ acceptance of English Wh-Trace island violations to general island-insensitivity in L2. As predicted by transfer, our participants only accepted island-violations in English if the corresponding dependency was acceptable in Norwegian. Insofar as the non-island status of embedded questions is due to extra CP-level functional structure, our results are more consistent with models that permit such transfer, including Weak Transfer (Eubank, 1993) and traditional Full Transfer models (Schwartz and Sprouse, 1994, 1996), than with models that restrict transfer to minimal grammatical information (Vainikka and Young-Scholten, 1994, 1996).

We predicted that transfer of L1 functional structure could lead native Norwegians into a ‘superset trap’: having assumed an analysis that allows more filler-gap dependencies than are acceptable in English, learners would be unable to retract to a more restrictive analysis. All else equal, we would therefore predict that Norwegian participants should accept island-violating dependencies as often in L2 English as they do in L1 Norwegian. Participant judgments yielded a more complicated picture: Almost all participants accepted Wh-Trace island violations in English on some portion of trials, but roughly one-third of our participants accepted the English structures less readily than in Norwegian.

1 The source of inconsistent judgments

Participants’ inconsistent judgments of English wh-island violations are consistent with two broad interpretations. First, participants may have rejected the sentences simply due to their increased complexity. This could occur if participants have greater difficulty processing wh-dependencies in their L2 than in their L1 (Juffs, 2005; Juffs and Harrington, 1995). We point out that this explanation presupposes that transfer must have occurred, otherwise the Norwegians would not accept the island-violations in English at all. The explanation holds, however, that Norwegians’ tendency to probabilistically reject island-violations does not constitute evidence of learning the appropriate English analysis: rejection occurs for orthogonal, extra-grammatical reasons.

The second option, which we favor, is that probabilistic rejection provides evidence of learning and partial restructuring. By ‘restructuring’ we simply mean that changes are made to some aspect of the holistic system or feature set transferred from L1. These changes could represent target-like restructuring, such that Norwegian speakers specifically reject the extra functional structure from Norwegian and adopt a simplified left-periphery identical to native English speakers. Alternatively, Norwegians could engage in ad-hoc compensatory restructuring wherein other grammatical changes are made to ensure closer surface alignment with acceptable English forms, without directly retracting the L1 functional structure. Our current results do not allow us to distinguish these two possibilities. Which of these outcomes is more likely depends, in part, on what types of L2 input learners receive as evidence that the set of filler-gap dependencies is different in English and how directly that evidence contradicts the L1 analysis. We consider the issue of evidence in the input presently.

Participants’ stochastic or inconsistent judgments are compatible with the notion that they have learned that the distribution of filler-gap dependencies differs between the two languages, but that learning or restructuring is not ‘complete’. Within a parameter-setting model of L2 acquisition (Schwartz and Sprouse, 1996; White, 2003) or a grammar competition/multiple grammars model (Amaral and Roeper, 2014; Rankin, 2014), this uncertainty could be modeled as a probabilistic competition between different grammars. Transfer would entail that Norwegian learners begin acquisition by assigning a high probability to their L1 analysis. Over time, however, they would accumulate evidence against that analysis and would shift probability to a more restrictive analysis (provided they could avoid the preemption problem; Rothman and Iverson, 2013; Trahey and White, 1993).

2 Evidence of difference

The question remains what cues learners could use in their English input as evidence in favor of the restrictive analysis. Negative evidence could, in principle, play a role. We consider direct negative evidence in the form of corrections an implausible mechanism given: (1) the relative infrequency of relevant productions, (2) the unreliability and ambiguity of interlocutors’ correction, and (3) the low probability that wh-island violations are ever addressed explicitly in the English classroom (see Carroll, 1995, 2001; Schwartz, 1993). Indirect negative evidence is another option: if Norwegian learners of English expect to encounter English wh-island violations at a rate comparable to Norwegian, then the absence of the structures could over time lead the learner towards the restrictive hypothesis. Prior research has argued that indirect evidence may play a role in L1 (Foraker et al., 2009; Perfors et al., 2011; Ramscar et al., 2013; Regier and Gahl, 2004; Rohde and Plaut, 1999) and L2 acquisition (Dahl, 2004; Plough, 1992). However, it is unclear whether the frequency of the island-violations is high enough in L1 to form the basis for strong predictions in L2.

Direct positive evidence of the unacceptability of wh-island violations does not occur, but some learning models allow indirect positive evidence to play a role (e.g. Pearl and Mis, 2016, or more traditional parameter-based models). Learners can rely on indirect positive evidence if there exist implicational relations between observed (non-island) structures and the possibility of island violations. Under the assumption that additional CP-level functional structure underlies the non-island status of embedded questions in Norwegian, Norwegians would require evidence that this structure is absent in English. Such evidence is only possible if some overt property of the English CP-domain conflicts with the Norwegian analysis. What could such cues be?

We assume that most sentences do not provide unambiguous evidence for deep differences in functional structure of the CP domain, given the similarity of surface word order patterns in the two languages. However, one piece of evidence might prove useful:

It has been suggested that evidence for an articulated CP-domain in Mainland Scandinavian comes from embedded V2 phenomena (e.g. Vikner et al., 2017). Mainland Scandinavian languages exhibit V2 word order in main clauses (9a; Holmberg and Platzack, 1995): the finite verb (skal) is the second constituent in the linear string regardless of whether a subject (9a) or non-subject (9b) occupies sentence-initial position:

(9) a. Han skal antakeligvis ikke synge i-morgen.

He shall presumably neg sing tomorrow

‘He probably won’t sing tomorrow.’

b. I morgen skal han antakeligvis ikke synge.

tomorrow shall he presumably neg sing

The traditional analysis holds that V2 movement requires movement of the finite verb to C⁰. Canonical word order in embedded clauses is not V2, as evidenced by the position of the verb with respect to adverbs and negation in (10). This entails that the verb does not move to the embedded C position.

(10) Han er lei for at han antakeligvis ikke skal synge i morgen.

He is sad for that he presumably neg shall sing tomorrow

‘He is sad because he probably won’t sing tomorrow.’

It has been observed, however, that V2 word order is possible in some embedded clauses (see, amongst others, Bentzen, 2014; Julien, 2007). For example, in (11) the frame adverbial i morgen (‘tomorrow’) has been fronted internal to the embedded clause and the verb has moved past the embedded subject:

(11) Han sa [at i morgen skal han ikke synge.]

He said that tomorrow shall he neg sing

‘He said that tomorrow he won’t sing.’

Sentences like (11) provide evidence for extra functional structure in the left-periphery of the clause under the assumption that skal has moved to a head in the CP-domain distinct from the head hosting the complementizer at (‘that’) and that there exists a specifier position between at and the verb that i morgen can occupy.

In English, embedded fronting of a non-subject does not result in V2/subject-auxiliary inversion. Thus, observing the absence of V2 in embedded clauses like (12) might provide evidence that the language lacks the extra functional structure.⁷

(12) He said that tomorrow {he will not | *will he not} sing.

Indirect positive evidence might also come from input sentences that do not involve observing different complementizer-level functional structure: English speech errors might also provide relevant evidence of a difference. It is well known that English speakers produce resumptive pronouns inside islands to ‘rescue’ ill-formed sentences (e.g. Morgan and Wagers, 2018; Ross, 1967). Importantly, English speakers produce resumptives in precisely the locations where Norwegian would allow gaps. For example, the sentences in (13) were observed in natural discourse:

(13) a. There were a bunch of people at the party that I didn’t know [who they were].

b. ‘. . . the sale of the uranium that nobody knows what it means’ – Donald Trump (Gore et al., 2016, cited in Morgan and Wagers, 2018: 861)

c. ‘Maybe it was a bad idea to get people together and try to record audio with some equipment that we didn’t know how it worked.’ (CBC Podcast Personal Best, Episode: ‘Know more, Carry Less’, ~9:00)

Based on examples such as those in (13), a learner with the knowledge that resumptive pronouns and gaps are in complementary distribution would be able to infer that embedded questions are islands in English. Importantly, drawing inferences based on indirect positive evidence requires non-trivial prior knowledge of the implicational relations between overt forms and (families of) underlying structures.

Indirect positive evidence of the type we describe above is likely to be relatively infrequent in the learner’s input. The relative infrequency of such structures may help explain why our participants appear not to have mastered the appropriate generalization and why there is significant inter-individual variation in outcomes despite long-term exposure to, and instruction in, English. Such effects follow under probabilistic models of grammar competition where conclusively shifting to the subset grammar would require repeated exposure to disconfirmatory evidence (e.g. Yang, 2018).
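The dependence of restructuring on the rate of disconfirming input can be illustrated with a toy simulation. This is only a sketch under stated assumptions and not a model fitted to our data: it uses a two-grammar linear reward-penalty update of the kind associated with variational learning models, it assumes the restrictive analysis is never penalized by English input, and the parameter names (`p_l1`, `gamma`, `q`) and values are hypothetical.

```python
import random


def update(p_l1, gamma, disconfirming, rng):
    """One linear reward-penalty step for the probability of the L1-like analysis."""
    chose_l1 = rng.random() < p_l1
    if chose_l1 and not disconfirming:
        # The L1-like analysis was sampled and handled the input: reward it.
        return p_l1 + gamma * (1 - p_l1)
    # Either the L1-like analysis was sampled and failed (penalty), or the
    # restrictive analysis was sampled and succeeded (reward for the rival).
    # With two grammars both cases give the same result.
    return (1 - gamma) * p_l1


def simulate(p0, gamma, q, n_inputs, seed=0):
    """Run n_inputs updates; q is the share of inputs that disconfirm the L1 analysis."""
    rng = random.Random(seed)
    p = p0
    for _ in range(n_inputs):
        disconfirming = rng.random() < q
        p = update(p, gamma, disconfirming, rng)
    return p


# Hypothetical settings: a learner who starts out strongly favoring the L1 analysis.
for q in (0.01, 0.1):
    print(q, round(simulate(p0=0.9, gamma=0.05, q=q, n_inputs=500), 3))
```

Under these assumptions the expected probability of the L1-like analysis shrinks by a factor of (1 − gamma × q) per input, so the rarer the disconfirming evidence, the longer the learner keeps assigning it substantial probability, and individual learners can end up at quite different points after the same amount of exposure.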

VI Conclusions

We have argued that native Norwegian speakers erroneously transfer the grammatical source of wh-island insensitivity from their L1 to their L2 English. Such effects are compatible with models of transfer that allow transfer of CP-level functional structure, but not those that restrict transfer to lexical information. We also found evidence that suggested that (some) learners may partially restructure, which we suggested could be triggered by indirect positive evidence. However, our data do not tell us whether the restructuring observed involves transition to the target English analysis or adoption of a divergent compensatory hypothesis that simply ensures closer surface alignment with the English forms.

Footnotes

Acknowledgements

Previous versions of this work were presented at UiT, UMASS, UC Santa Cruz, and at the 2019 CUNY Sentence Processing Conference. We thank audiences for helpful feedback. Special thanks to Jason Rothman for helpful comments on a previous draft. All errors or misrepresentations are our responsibility.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Dave Kush

Notes

References

1. Amaral and Roeper (2014) Multiple grammars and second language representation. Second Language Research 30: 3–36.

2. Ambridge, Pine, Rowland, Chang and Bidgood (2013) The retreat from overgeneralization in child language acquisition: Word learning, morphology, and verb argument structure. Wiley Interdisciplinary Reviews: Cognitive Science 4: 47–62.

3. Anderssen, Bentzen, Busterud, et al. (2018) The acquisition of word order in L2 Norwegian: The case of subject and object shift. Nordic Journal of Linguistics 41: 247–74.

4. Ayoun (1999) Verb movement in French L2 acquisition. Bilingualism: Language and Cognition 2: 103–25.

5. Bates, Mächler, Bolker and Walker (2015) Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67: 1–48.

6. Bentzen (2014) Embedded verb second (V2). Nordic Atlas of Language Structures Journal 1: 211–24.

7. Berwick (1985) The acquisition of syntactic knowledge. Cambridge, MA: MIT Press.

8. Carroll (1995) The irrelevance of verbal feedback to language learning. In: Selinker, Sharwood Smith, Rutherford and Eubank (eds) The current state of interlanguage: Studies in honor of William E Rutherford. Amsterdam: John Benjamins, pp. 73–88.

9. Carroll (2001) Input and evidence: The raw material of second language acquisition. Amsterdam: John Benjamins.

10. Chomsky (1977) On wh-movement. In: Culicover, Akmajian and Wasow (eds) Formal syntax. New York: Academic Press, pp. 71–132.

11. Chomsky and Lasnik (1977) Filters and control. Linguistic Inquiry 8: 425–504.

12. Clahsen and Muysken (1986) The availability of universal grammar to adult and child learners: A study of the acquisition of German word order. Second Language Research 2: 93–119.

13. Dahl (2004) Negative evidence in L2 acquisition. Nordlyd 32: 28–45.

14. Drummond (2013) IbexFarm. Available at: http://spellout.net/ibexfarm (accessed August 2020).

15. Eubank (1993) Optionality and the initial state in L2 development. In: Hoekstra and Schwartz (eds) Language acquisition studies in generative grammar. Amsterdam: John Benjamins, pp. 369–88.

16. Felser, Cunnings, Batterham and Clahsen (2012) The timing of island effects in nonnative sentence processing. Studies in Second Language Acquisition 34: 67–98.

17. Foraker, Regier, Khetarpal, Perfors and Tenenbaum (2009) Indirect evidence and the poverty of the stimulus: The case of anaphoric ‘one’. Cognitive Science 33: 287–300.

18. Franceschina (2005) Fossilized second language grammars: The acquisition of grammatical gender. Philadelphia: John Benjamins.

19. Gore, Kiely and Robertson (2016) Spinning the FBI letter. Philadelphia, PA: FactCheck.org. Available at: http://www.factcheck.org/2016/10/spinning-the-fbi-letter/ (accessed August 2020).

20. Hawkins, Towell and Bazergui (1993) Universal grammar and the acquisition of French verb movement by native speakers of English. Second Language Research 9: 189–233.

21. Holmberg and Platzack (1995) The role of inflection in Scandinavian syntax. Oxford: Oxford University Press.

22. Huang CTJ (1983) Logical relations in Chinese and the theory of grammar. Unpublished PhD dissertation, MIT, Cambridge, MA, USA.

23. Johnson and Newport (1991) Critical period effects on universal properties of language: The status of subjacency in the acquisition of a second language. Cognition 39: 215–58.

24. Judy and Rothman (2010) From a superset to a subset grammar and the Semantic Compensation Hypothesis: Subject pronoun and anaphora resolution evidence in L2 English. In: Franich, Iserman and Keil (eds) Proceedings of the 34th Boston University Conference on Language Development (BUCLD 34). Somerville, MA: Cascadilla Press, pp. 197–208.

25. Juffs (2005) The influence of first language on the processing of wh-movement in English as a second language. Second Language Research 21: 121–51.

26. Juffs and Harrington (1995) Parsing effects in second language sentence processing: Subject and object asymmetries in wh-extraction. Studies in Second Language Acquisition 17: 483–516.

27. Julien (2007) Embedded V2 in Norwegian and Swedish. Working Papers in Scandinavian Syntax 80: 103–61.

28. Kim, Baek and Tremblay (2015) The role of island constraints in second language sentence processing. Language Acquisition 22: 384–416.

29. Kush, Lohndal and Sprouse (2018) Investigating variation in island effects: A case study of Norwegian wh-extraction. Natural Language and Linguistic Theory 36: 743–79.

30. Kush, Lohndal and Sprouse (2019) On the island sensitivity of topicalization in Norwegian: An experimental investigation. Language 95: 393–420.

31. Kuznetsova, Brockhoff and Christensen RHB (2017) lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software 82: 1–26.

32. Lardiere (2007) Ultimate attainment in second language acquisition: A case study. Mahwah, NJ: Lawrence Erlbaum.

33. (1998) Adult L2 accessibility to UG: An issue revisited. In: Flynn, Martohardjono and O’Neil (eds) The generative study of second language acquisition. Mahwah, NJ: Lawrence Erlbaum, pp. 232–86.

34. Lindahl (2017) Extraction from relative clauses in Swedish. Unpublished doctoral dissertation, University of Gothenburg, Gothenburg, Sweden.

35. Maling and Zaenen (1982) A phrase structure account of Scandinavian extraction phenomena. In: Jacobson and Pullum (eds) The nature of syntactic representation. Dordrecht: Springer, pp. 229–82.

36. Martohardjono (1993) Wh-movement in the acquisition of a second language: A cross-linguistic study of three languages with and without movement. Unpublished doctoral dissertation, Cornell University, Ithaca, NY, USA.

37. Maxwell and Delaney (2003) Designing experiments and analyzing data: A model comparison perspective. Mahwah, NJ: Lawrence Erlbaum.

38. Mazurkewich (1984) The acquisition of the dative alternation by second language learners and linguistic theory. Language Learning 34: 91–108.

39. Mazurkewich and White (1984) The acquisition of the dative alternation: Unlearning overgeneralizations. Cognition 16: 261–83.

40. Morgan and Wagers (2018) English resumptive pronouns are more common where gaps are less acceptable. Linguistic Inquiry 49: 861–76.

41. Omaki and Schulz (2012) Filler-gap dependencies and island constraints in second language sentence processing. Studies in Second Language Acquisition 33: 563–88.

42. Pearl and Mis (2016) The role of indirect positive evidence in syntactic acquisition: A look at anaphoric one. Language 92: 1–30.

43. Perfors, Tenenbaum and Regier (2011) The learnability of abstract syntactic principles. Cognition 118: 306–38.

44. Perlmutter (1971) Deep and surface structure constraints in syntax. New York: Holt, Rinehart and Winston.

45. Pinker (1989) Learnability and cognition: The acquisition of argument structure. Cambridge, MA: MIT Press.

46. Plough (1992) Indirect negative evidence, inductive inferencing and second language acquisition. In: Eubank, Selinker and Sharwood Smith (eds) The current state of interlanguage: Studies in honor of William E Rutherford. Amsterdam: John Benjamins, pp. 89–105.

47. R Core Team (2013) R: A language and environment for statistical computing [software]. Vienna: R Foundation for Statistical Computing.

48. Ramscar, Dye and McCauley (2013) Error and expectation in language learning: The curious absence of ‘mouses’ in adult speech. Language 89: 760–93.

49. Rankin (2012) The transfer of V2: Inversion and negation in German and Dutch learners of English. International Journal of Bilingualism 16: 139–58.

50. Rankin (2014) Variational learning in L2: The transfer of L1 syntax and parsing strategies in the interpretation of wh-questions by L1 German learners of L2 English. Linguistic Approaches to Bilingualism 4: 432–61.

51. Regier and Gahl (2004) Learning the unlearnable: The role of missing evidence. Cognition 93: 147–55.

52. Reinhart (1981) A second COMP position. In: Belletti (ed) Theory of markedness in generative grammar. Pisa: Scuola Normale Superiore, pp. 518–57.

53. Rizzi (1982) Issues in Italian syntax. Dordrecht: Foris.

54. Rohde and Plaut (1999) Language acquisition in the absence of explicit negative evidence: How important is starting small? Cognition 72: 67–109.

55. Ross (1967) Constraints on variables in syntax. Unpublished doctoral dissertation, MIT, Cambridge, MA, USA.

56. Rothman and Iverson (2013) Islands and objects in L2 Spanish: Do you know the learners who drop? Studies in Second Language Acquisition 35: 589–618.

57. Rothman, González Alonso and Puig-Mayenco (2019) Third language acquisition and linguistic transfer: Volume 163. Cambridge: Cambridge University Press.

58. Samordna opptak [The Norwegian Universities and Colleges Admission Service] (2020) Krav til norsk og engelsk [Norwegian and English requirements]. Oslo: Samordna opptak. Available at: https://www.samordnaopptak.no/info/utenlandsk_utdanning/sprakkrav/krav-til-norsk-og-engelsk-for_hoyere_utdannning/index.html (accessed August 2020).

59. Schwartz (1993) On explicit and negative data effecting and affecting competence and linguistic behavior. Studies in Second Language Acquisition 15: 147–63.

60. Schwartz and Sprouse (1994) Word order and nominative case in nonnative language acquisition: A longitudinal study of (L1 Turkish) German interlanguage. In: Hoekstra and Schwartz (eds) Language acquisition studies in generative grammar. Amsterdam: John Benjamins, pp. 317–68.

61. Schwartz and Sprouse (1996) L2 cognitive states and the Full Transfer/Full Access model. Second Language Research 12: 40–72.

62. Snyder (2007) Child language: The parametric approach. Oxford: Oxford University Press.

63. Sorace (1996) The use of acceptability judgments in second language acquisition research. In: Ritchie and Bhatia (eds) Handbook of second language acquisition. San Diego, CA: Academic Press, pp. 375–409.

64. Spinner and Gass (2019) Using judgments in second language acquisition research. New York: Routledge.

65. Sprouse (2007) A program for experimental syntax: Finding the relationship between acceptability and grammatical knowledge. Unpublished doctoral dissertation, University of Maryland, College Park, MD, USA.

66. Sprouse, Wagers and Phillips (2012) A test of the relation between working memory and syntactic island effects. Language 88: 82–124.

67. Sprouse, Fukuda, Ono and Kluender (2011) Reverse island effects and the backward search for a licensor in multiple wh-questions. Syntax 14: 179–203.

68. Sprouse, Caponigro, Greco and Cecchetto (2016) Experimental syntax and the variation of island effects in English and Italian. Natural Language and Linguistic Theory 34: 307–44.

69. Trahey and White (1993) Positive evidence and preemption in the second language classroom. Studies in Second Language Acquisition 15: 181–204.

70. Vainikka and Young-Scholten (1994) Direct access to Xʹ-theory: Evidence from Korean and Turkish adults learning German. In: Hoekstra and Schwartz (eds) Language acquisition studies in generative grammar. Amsterdam: John Benjamins, pp. 265–316.

71. Vainikka and Young-Scholten (1996) Gradual development of L2 phrase structure. Second Language Research 12: 7–39.

72. Vainikka and Young-Scholten (2006) The roots of syntax and how they grow. In: Unsworth, Parodi, Sorace and Young-Scholten (eds) Paths of development in L1 and L2 acquisition: In honor of Bonnie D Schwartz. Amsterdam: John Benjamins, pp. 77–106.

73. Vikner, Christensen and Nyvad (2017) V2 and cP/CP. In: Bailey and Sheehan (eds) Order and structure in syntax I: Word order and syntactic structure. Berlin: Language Science Press, pp. 313–24.

74. Westergaard (2003) Unlearning V2: Transfer, markedness, and the importance of input cues in the acquisition of word order in English by Norwegian children. EUROSLA Yearbook 3: 77–101.

75. Westergaard (2014) Linguistic variation and micro-cues in first language acquisition. Linguistic Variation 14: 26–45.

76. Westergaard (2019) Microvariation in multilingual situations: The importance of property-by-property acquisition. Second Language Research. Epub ahead of print 12 November 2019. DOI: 10.1177/0267658319884116.

77. Wexler and Manzini (1987) Parameters and learnability in binding theory. In: Roeper and Williams (eds) Parameter setting. Dordrecht: Reidel, pp. 41–76.

78. White (1989a) Linguistic universals, markedness and learnability: Comparing two different approaches. Second Language Research 5: 127–40.

79. White (1989b) The principle of adjacency in second language acquisition: Do L2 learners observe the subset principle? In: Gass and Schachter (eds) Linguistic perspectives on second language acquisition. Cambridge: Cambridge University Press, pp. 134–58.

80. White (1991) Adverb placement in second language acquisition: Some effects of positive and negative evidence in the classroom. Interlanguage Studies Bulletin (Utrecht) 7: 133–61.

81. White (2003) Second language acquisition and universal grammar. Cambridge: Cambridge University Press.

82. White and Genesee (1996) How native is near-native? The issue of age and ultimate attainment in the acquisition of a second language. Second Language Research 12: 233–65.

83. White and Juffs (1998) Constraints on wh-movement in two different contexts of non-native language acquisition: Competence and processing. In: Flynn, Martohardjono and O’Neil (eds) The generative study of second language acquisition. Hillsdale, NJ: Lawrence Erlbaum, pp. 111–30.

84. Wolfe Quintero (1992) Learnability and the acquisition of extraction in relative clauses and wh-questions. Studies in Second Language Acquisition 14: 39–70.

85. Yang (2018) A formalist perspective on language acquisition. Linguistic Approaches to Bilingualism 8: 665–706.

86. Yuan (1997) Asymmetry of null subjects and null objects in Chinese speakers’ L2 English. Studies in Second Language Acquisition 19: 467–97.

L2 transfer of L1 island-insensitivity: The case of Norwegian
