Abstract
Labeling is required by the interpretive system. When a head merges with a phrase, the head provides the label. However, lexical heads and T with poor inflectional features are too weak to label. Although insightful, this theory leaves at least one question open: are there other kinds of weak heads? In this paper, we address this question by proposing that phonological features play a crucial role in the labeling algorithm and by putting forward an additional kind of weak head: a head that loses its phonological features in the syntax is also weak. This approach to weak heads, together with the constraint that a structure must be labeled for interpretation, can capture the distribution of empty categories in topicalization, relativization, ellipsis, and other phenomena, some of which have not received enough scholarly attention. Our syntactic-phonological approach to labeling thus opens up new possibilities for accounting for the distribution of empty categories in a principled manner.
Introduction
Chomsky (2013) argues that when two constituents are merged, owing to the requirement of the interpretive system, the derived syntactic object must have a label, which is assigned by the labeling algorithm (LA). Roughly speaking, LA works in two ways: One is that when a head and a phrase are merged, the head provides the label for the derived structure; the other is related to the cases where two phrases are merged, including successive-cyclic movement, criterial positions, and other symmetric structures.
As for the first way of labeling, Chomsky (2015) further claims that some heads are too weak to label, proposing there are two kinds of weak heads: lexical heads, which have no category feature although they have lexical contents; and T in languages like English, whose inflectional features are poor. These weak heads cannot provide labels, but can have a structure labeled by agreeing with an overt constituent in the Spec position (we use terms of the Government and Binding theory only for ease of exposition). These assumptions about labeling can help explain the distribution of many empty categories. Let us use (1) for illustration.
(1) *Who_i do you think that t_i will like Mary?
Suppose the derivation of (1) reaches the stage shown in (2), in which who will be extracted to Spec-CP, the edge position of a phase, as shown in (3). When the phase CP (or the complement of the phase head C) is sent to the interface (Bošković, 2016a; Chomsky, 2008), the LA will start to work (Rizzi, 2015, p. 321). However, who has moved away from Spec-TP, which makes it incapable of agreeing with T to have the structure TP labeled (i.e., being in a discontinuous chain, the lower copy of who is invisible to the LA, and so it cannot undergo feature matching with T to have TP labeled). Furthermore, as a weak head, T cannot serve as a label. Thus, when this phase is transferred to the interface, part of the structure (namely, the structures marked as “?” and “??” in (2) and (3)) will not be properly labeled. In such a case, the empty category left by who is illicit. Consequently, (1) is ungrammatical.
(2) [C that [?? who [? T [like Mary]]]]
(3) [who [C that [?? who [? T [like Mary]]]]]
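The reasoning behind (1) to (3) can be sketched as a small decision procedure. The following is a minimal illustration rather than Chomsky's own formalization: the names `Head` and `label_head_phrase` and the boolean `spec_visible` are assumptions introduced here, with `spec_visible` standing in for whether the phrase in Spec is still visible to the LA (the lower copy of a moved phrase is not). The two unlabeled nodes "?" and "??" are collapsed into a single result for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Head:
    category: str  # e.g. "T" or "C"
    weak: bool     # True for lexical roots and English finite T (Chomsky, 2015)


def label_head_phrase(head: Head, spec_visible: bool) -> Optional[str]:
    """Return a label for the structure built on `head`, or None on failure.

    Hypothetical, simplified rendering of the head-phrase case of the LA:
    a strong head labels by itself; a weak head needs a visible phrase in
    its Spec to share features with.
    """
    if not head.weak:
        return head.category
    if spec_visible:
        # Label obtained through feature sharing, written here as <phi,phi>.
        return "<phi,phi>"
    return None  # the structure stays unlabeled, as in (2) and (3)


english_t = Head("T", weak=True)

# Subject remains in Spec-TP: labeling succeeds through agreement.
print(label_head_phrase(english_t, spec_visible=True))
# Subject extracted across "that", as in (1): only an invisible copy remains.
print(label_head_phrase(english_t, spec_visible=False))
```

On this sketch, the ungrammaticality of (1) corresponds to the second call returning `None`.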
Based on this theory, many insightful hypotheses have been posited, and interesting data have been uncovered; see, for example, Bošković (2016b, 2018). However, Hayashi (2020) argues that the concept of “weak head” should be eliminated because it cannot account for the labeling of English infinitival clauses or of sentences in languages such as Japanese. For example, Bill in (4) should move out of the infinitival clause in the course of the derivation, as shown in (5). Since T in English is weak, it is unclear how the structure β in (5) is labeled. Japanese has no phi-feature agreement. Therefore, the head T in this language should be weak. However, this language, similar to Italian, allows pro-drop, as can be seen in (6). It is unclear how the structure TP is labeled under Chomsky’s (2015) labeling theory.
(4) John expects Bill to win.
(5) [ζ v* [ε Bill [δ R [γ Bill [β to [α Bill, win]]]]]] (Hayashi, 2020, p. 279)
(6) (Boku-wa) ringo-o tabe-masu.
I-TOP apple-ACC eat-PRS
‘I ate an apple.’ (Hayashi, 2020, p. 280)
The phenomena pointed out by Hayashi (2020) are considerably challenging; however, they do not show that the notion of weak heads must be eliminated. Indeed, we can develop Chomsky’s (2015) notion of weak heads so that it covers phenomena such as (4) and (6). To some extent, Chomsky’s (2015) assumption of a weak head T in English is a reinterpretation of a long-standing observation, dating back to Taraldsen (1978); it holds that languages such as Italian, Spanish, and Hungarian allow pro-drop because they are rich in inflectional features. Conversely, pro-drop is not allowed in English owing to its poor agreement inflection. Chomsky’s (2015) labeling theory draws only on this observation, leaving untouched other interesting remarks on pro-drop and infinitival T. While correlating rich agreement inflection with pro-drop, previous research also makes some further insightful remarks. For instance, Rizzi (1982, p. 143) holds that there is a “pronominal Agr” in Infl/T. Borer (1986, 1989) argues that infinitival Agr and gerundive Agr in nonfinite T are anaphoric. Huang (1982) proposes that Chinese allows pro-drop because it has no agreement/Agr. Similarly, Saito (2007) suggests that in Japanese there is a covert operation allowing for pro-drop, which involves covert copying of elements from discourse-given entities to null argument positions. The precondition for this covert operation is the lack of surface agreement (Roberts, 2010).
Based on the previous research, particularly Huang (1982) and Saito (2007), we can develop Chomsky’s (2015) assumption by defining the weak T as (7).
(7) T is weak if its inflectional features are not rich.
The definition presupposes that if the head T is weak, it must have inflectional features. Assuming a head has agreement features, it is reasonable to consider it weak if it turns out to be poor in such features. With such a definition, nonfinite T should be strong because it has no inflectional features. T in such languages as Italian, Spanish, and Hungarian is strong because the inflectional features on T are rich. Furthermore, T in languages like Japanese and Chinese should be strong too because such languages do not have inflectional agreements.
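The classification that follows from (7) can be made explicit with a short sketch. It is only an illustration: the function name `t_is_weak` and the two boolean parameters are assumptions introduced here, and the feature settings in `cases` simply restate the classification given in the text.

```python
def t_is_weak(has_agreement_inflection: bool, inflection_is_rich: bool) -> bool:
    """Definition (7): T is weak only if it has inflectional features
    and those features are not rich."""
    return has_agreement_inflection and not inflection_is_rich


# Hypothetical feature settings mirroring the discussion of (7):
# (has agreement inflection, inflection is rich)
cases = {
    "English finite T": (True, False),
    "Italian finite T": (True, True),
    "English nonfinite T (to)": (False, False),
    "Japanese T": (False, False),
}

for name, (has_infl, rich) in cases.items():
    status = "weak" if t_is_weak(has_infl, rich) else "strong"
    print(f"{name}: {status}")
```

The point of the conjunction is that the absence of agreement features altogether does not count as poverty of agreement, which is why nonfinite T and Japanese T come out strong.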
Our assumption can explain why Chomsky (2013, 2015) believes labels to be crucial for interpretation at the interfaces. In the pro-drop languages, the “pronominal Agr” (the set of phi-features) in T is interpretable (Holmberg, 2005; Rizzi, 1982; see also Alexiadou & Anagnostopoulou, 1998; Sheehan, 2006 for discussion of the D-feature in T). When T merges with vP, T together with the Agr/D feature will provide a label for the mother node. Furthermore, the Agr/D feature can identify the semantic content of pro and help it be interpreted (Rizzi, 1986). In English the Agr in T is not enough to interpret a pro in Spec-T (alternatively, the Agr in T is uninterpretable). Therefore, if T tried to provide a label, the pro in subject position could not be interpreted in the end, an undesirable consequence. If the label is nonfinite T (e.g., if “to” provides a label for the mother node), the PRO/trace/copy in Spec-T will be taken to be anaphoric (see Borer, 1986, 1989). In languages without agreement, such as Chinese, Korean, and Japanese, if the label is T, an entity will be copied from the discourse to the null subject position at the interface (Saito, 2007; see also Miyagawa, 2017).
Although Chomsky’s (2015) assumption is insightful, it leaves at least one problem unresolved. To solve the problems of projection and to account for linguistic facts, he put forward two kinds of weak heads. If his assumption is correct, one may wonder whether there are other kinds of weak heads. In this paper, we aim to address this problem. We develop Chomsky’s labeling theory by proposing that there is another kind of weak head. Specifically, based on Richards’ (2016, 2020) argument that certain aspects of phonological structure are built in the narrow syntax, we propose that phonological features play a crucial role in the LA. In particular, a head that loses its phonological features in the syntax is weak when it comes to labeling. This approach to weak heads, together with Chomsky’s (2013, 2015) constraint that a structure must be labeled for interpretation, can explain the distribution of empty categories in topicalization, relativization, and ellipsis.
The remainder of this paper is organized as follows. Section 2 introduces additional weak heads, that is, a definition of weak heads from the perspective of phonological features. Section 3 presents the facts that it can account for. Section 4 concludes the paper.
Our Assumptions About Weak Heads and Labeling
Most generative linguists agree that when taken from the lexicon, lexical items typically have semantic, formal, and phonological features (Chomsky, 1995). After a phase is complete, phonological features are transferred to the sensorimotor system by the spell-out operation (Chomsky, 2000, 2001, 2008). Put differently, although phonological features are present in the narrow syntax, they are invisible to syntactic operations, and phonological structures are not built until the phonological features reach the sensorimotor system. On this view, the relationship between syntax and phonology is unidirectional, that is, syntax determines phonology. Intending to change this system, Richards (2016, 2020) proposes that part of the phonological structure is constructed within the syntax. That is, there is an interaction between syntax and phonology, and their relationship is bidirectional rather than unidirectional (Kandybowicz, 2020). Of course, there is a restriction on the phonological information available to the narrow syntax. Specifically, only the phonological information predictable from the syntax is available, while lexically specific information, such as specific segmental content, is not. With these assumptions, he offers a principled account of A’-movement, A-movement, and head movement. Similar to Richards (2016, 2020), Holmberg (2000) suggests that although syntactic operations cannot detect the exact phonological feature matrix of a constituent, they can determine whether a constituent has phonological features.
As for the phonological features of words, Richards (2016) makes some enlightening remarks. He argues that a conversion operation comes into play when phonological structures are built in the syntax. Consequently, a nominal with phonological features can be converted into PRO. In other words, the phonological features of a nominal can be erased by the conversion operation. It is well known in the Government and Binding theory that PRO is different from nouns/pronouns with phonological features in that it has the feature specification [+a, +p], which means that it is ungoverned (Chomsky, 1981). This indicates that the loss of phonological features has a significant impact on the syntactic properties of constituents.
As can be observed, the phonological features as well as the operation in the prosodic structure in the syntax exert a strong impact on syntax. Following this line of reasoning, we propose that phonological features also play a crucial role in the LA, and inspired by Chomsky’s (2015) assumption of weak heads, we put forward a new version of weak heads, as shown in (8).
(8) A head that loses its phonological features in the syntax is weak when it comes to labeling.
The weak heads discussed by Chomsky (2015) have phonological features. For ease of exposition, we call these weak heads overt weak heads and weak heads without phonological features null weak heads. A head can lose its phonological features in two ways. One is that the phonological features on a head are erased by the conversion operation in the narrow syntax. The other is that the phonological features of a head move away while the prosodic structure is built in the syntax. Thus, the original site of the head will have no phonological features. See also Tian (2022).
Before proceeding to view the empirical evidence in support of (8), we need to make two points clear. The first point is related to the identification of null weak heads. We assume that a head that loses its phonological features in the syntax is weak. Nevertheless, the phonological features in the syntax are invisible, and all we see are the phonetic forms at the surface level. Moreover, Richards (2016) makes it clear that the phonological features in the syntax may be different from the phonetic forms we see at the surface level because the early syntactic derivational process may be obscured by later derivations. For example, he argues that PRO is initially a noun with phonological features. To establish the contiguity relation, the phonological features of the noun are erased and the noun is converted to a PRO. Given this, we are faced with the following question: how can we determine whether a head loses its phonological features in the syntax?
One effective method is to see whether a head can have a phonetic form. If it can, but it ends up empty, or its phonetic form is moved away, then we can claim that the head loses its phonological features in the syntax. For example, the phonological features of C can be realized as “that,” as (9a) shows, but it turns out to be empty in (9b). Then it becomes a null weak head in (9b).
(9) a. I think that you are right. b. I think you are right.
If a head is never phonetically realized, we cannot claim that it loses its phonological features in the narrow syntax because it is possible that the phonological features of the head are not phonetically realized. For example, (10a) shows that the phonological features on T may be realized as [s], but they have no phonetic realization at all in (10b).
(10) a. He likes syntax. b. We all like syntax.
However, we cannot claim that when the subject is plural, T has no phonological features. This is confirmed by (11), in which the verb takes a different form when the subject is changed from singular to plural. Another example in our favor is the phonological features of the plural morpheme. They may be realized as [s], as in books, but they may have no phonetic realization at all, as in deer.
(11) a. He is a writer. b. We are writers.
The next point is related to the labeling of weak heads. Chomsky (2015) proposes that a weak head can have its mother node labeled by carrying out feature matching with a phrase. Specifically, although the overt weak head cannot provide a label for its mother node, it can help its mother node get labeled by undergoing feature matching with another phrase in its Spec. How then can the null weak head get its mother node labeled? As a kind of weak head, the null head should try to have its mother node labeled with the help of feature matching, as overt weak heads do. Given that the LA is a minimal search carried out in a local domain (Chomsky, 2013, 2015), the null weak head should try to find a constituent in its local domain to carry out feature matching so that its mother node can be labeled. As stated above, the overt weak head relies on feature matching with the phrase in its Spec to have its mother node labeled. Since the overt weak head can become null, as in (12), the null weak head must resort to another kind of feature matching to have its mother node labeled. Otherwise, there would be no difference between the null and the overt weak head, contrary to the assumption that phonological features play a crucial role in the LA.
(12) I suggest that he (should) go to the library right now.
Then what kind of feature matching can a null weak head rely on to have its mother node labeled? In the local domain, a head is likely to have two kinds of feature matching: One is to perform feature matching with the phrase in its Spec, and the other is to perform c-selection feature matching with its complement (Chomsky, 1995). Since the feature matching between the null weak head and its Spec is unable to have its mother node labeled, the null weak head has to rely on c-selection feature matching to have its mother node labeled.
We think that there are some factors in favor of this assumption. Firstly, almost all heads can be empty, that is, every head is likely to turn into a null weak head, and the one kind of feature matching that every head can enter into with a phrase in its local domain is c-selection feature matching. For example, T in (12) is null when should is absent, namely a null weak head. The complement of null T is a VP/v*P, which can undergo c-selection feature matching with T to have their mother node labeled. Accordingly, (12) is expected to be grammatical. By contrast, the complement of T in (13) is a ParticipleP, which cannot undergo c-selection feature matching with the null head T. As a result, a labeling failure ensues.
(13) *I suggest that she going to the school.
Secondly, many scholars like Seely (2006), Bošković (2016b), and Narita and Fukui (2022) argue convincingly that c-selection is indispensable in the minimalist syntax. Besides, Chomsky (1995, p. 247) also argues that when a head and a phrase are merged, category feature checking is necessary.
Based on the discussion above, we can claim that the null weak head cannot label, but it can have its mother node labeled by carrying out c-selection feature matching with its complement. If we carefully scrutinize null weak heads, we can see that they should be divided into two types. (a) Overt weak heads become null owing to loss of phonological features in the syntax. An overt weak head like English T must undergo feature matching with the phrase in its Spec to have its mother node labeled. Once it becomes null, its identity should not be altered. For example, after the head T becomes null, although its syntactic properties may be affected, it is still T (i.e., we cannot argue that once T is null, it becomes a V or some other head). Therefore, feature matching between the null T and the phrase in Spec-TP is still necessary. To be exact, the overt weak T relies on phi-feature matching to have its mother node labeled. The null weak T should then rely on both phi-feature matching and c-selection feature matching to have its mother node labeled. This assumption of multiple matching is reasonable because many scholars also propose that multiple agreement is possible in syntax; see Hiraiwa (2004), Anagnostopoulou (2005), Nevins (2011), and particularly Béjar and Rezac (2009). (b) Overt strong heads become null owing to loss of phonological features in the syntax. An overt strong head can label on its own. The null version of the strong head can then carry out c-selection feature matching to have its mother node labeled. In the following, for ease of illustration, we will not differentiate these two types of null weak heads, and assume that null weak heads can have their mother nodes labeled by carrying out c-selection feature matching with their complements.
To sum up, in this section we develop Chomsky’s (2015) labeling theory by proposing that there is another kind of weak head, namely a head that loses its phonological features in the syntax. This kind of weak head cannot provide a label for its mother node, but it can have its mother node labeled by undergoing c-selection feature matching with its complement. We also provide a diagnostic for determining whether a head has lost its phonological features in the syntax. In the following section we will test whether our proposal yields novel predictions, provides a new perspective that uncovers the rule underlying different phenomena, or offers a better account of linguistic facts than previous research. If our hypothesis achieves any or all of these goals, it can be said to be both theoretically and empirically superior.
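The labeling condition on null weak heads summarized above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not part of the proposal itself: the table `C_SELECTS`, the function name `null_head_labeled`, and the category strings are hypothetical choices made for the example, and only the contrast between (12) and (13) is modeled.

```python
from typing import Optional

# Hypothetical c-selection table: the complement categories each head selects.
# Only the entry for T is needed to model (12) and (13).
C_SELECTS = {"T": {"vP"}}


def null_head_labeled(head: str, complement: Optional[str]) -> bool:
    """Return True if the mother node of a null weak head can be labeled.

    A null weak head cannot label by itself; in this sketch, labeling
    succeeds only if the complement's category matches what the head
    c-selects. None stands for a missing complement.
    """
    if complement is None:
        return False
    return complement in C_SELECTS.get(head, set())


# (12): null T (no overt "should") takes a vP complement.
print(null_head_labeled("T", "vP"))
# (13): null T takes a ParticipleP complement, so matching fails.
print(null_head_labeled("T", "ParticipleP"))
```

The additional phi-feature matching required of a null T that started out as an overt weak head, type (a) above, is deliberately left out, in line with the simplification adopted in the text.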
The Distribution of Empty Categories in Null Head Constructions
In this section, we study the distribution of empty categories in many constructions with our definition of weak heads and Chomsky’s (2013, 2015) constraint that a structure must be labeled for interpretation.
Null T Constructions
It has been noted that an elided VP must be preceded by an auxiliary verb, as can be exemplified in (14) and (15) (See Aelbrecht & Haegeman, 2012; Aelbrecht & Harwood, 2015; Bresnan, 1976; Johnson, 2001; Lobeck, 1995; Zagona, 1988 for more examples).
(14) a. Jane hasn’t eaten any rutabagas and Holly hasn’t either.
b. Mag Wildwood wants to read Fred’s story, and I also want to.
c. John wants to go on vacation, but he doesn’t know when to.
(15) a. I thought the auxiliary hadn’t disappeared, but it *(had)
b. *I can’t believe Holly Golightly won’t eat rutabagas. I can’t believe Fred, either.
c. John didn’t go because he did want *(to).
If there is no auxiliary verb in the clause, that is, if the elided verb is finite, the dummy auxiliary do must be inserted, as shown in (16). This has led many linguists to assume that it is the auxiliary or T that licenses VP ellipsis (Aelbrecht & Haegeman, 2012; Aelbrecht & Harwood, 2015).
(16) The chicken didn’t put the tuna on the table, but the penguin did.
Johnson (2001) observes that the trace of topicalized VP must be governed by an overt auxiliary, too, which is shown clearly in the contrast between (17) and (18). In (17) the trace of topicalized VP is governed by an overt auxiliary, and the relevant examples are grammatical. By contrast, the auxiliary together with VP is topicalized in (18). In other words, the trace of VP is governed by the trace of an auxiliary rather than the overt auxiliary itself. On this occasion, the relevant examples are ungrammatical.
(17) Madame Spanella claimed that . . .
a. eat rutabagas, Holly wouldn’t t.
b. eaten rutabagas, Holly hasn’t t.
c. eating rutabagas, Holly should be t.
(18) Madame Spanella claimed that . . .
a. *would eat rutabagas, Holly t.
b. *hasn’t eaten rutabagas, Holly t.
The following sentences lend stronger support to the idea that both the elided VP and the trace of topicalized VP must be preceded by an overt auxiliary. The auxiliary in (19a) is optional. It can be seen in (19b) and (19c) that once the auxiliary is empty, VP cannot be elided or fronted.
(19) a. They requested that he (should) sing a song.
b. *They requested that he sing a song and she also.
c. *Sing a song, they requested that he.
(20) indicates that the same is true of Chinese. (20a) shows that if the auxiliary becomes covert, the VP cannot be elided, and (20b) suggests that if the auxiliary is also topicalized together with VP, the sentence will be ungrammatical.
(20) a. Zhangsan shuo ta neng shuo yingyu, Lisi ye *(neng) shuo yingyu.
Zhangsan say he can speak English Lisi also can speak English
‘Zhangsan said he could speak English, and Lisi could speak English, too.’
b. *Zhangsan shuo, [neng shuo yingyu]_i, Lisi t_i.
Zhangsan say can speak English Lisi
‘Zhangsan said that Lisi was able to speak English.’
Under our approach, the generalization that an elided VP or the trace of a topicalized VP must be preceded by an overt auxiliary can be accounted for. First consider (14) and (16). Since T is realized either as an auxiliary or as the dummy auxiliary do in these sentences, it is an overt weak head. Thus, after T merges with an NP and agrees with it in ϕ-features, its mother node can be properly labeled, as (21) shows. As for ellipsis, we agree with Baltin (2012), who assumes that it takes place in the syntax. After ellipsis, the ellipsis site will be like a pro-form. In other words, after the syntactic structure is elided, the ellipsis site will have no internal structure. This assumption is compatible with both the PF-deletion approach and the LF-copy approach to ellipsis (see Baltin, 2012, for evidence). Since the ellipsis site is located in a labeled structure, the elided VP can be properly interpreted at the interface.
(21) 
By contrast, owing to its lack of phonological features, T in (15) is a null weak head, incapable of providing a label for the structure marked as “?” in (22). It must undergo c-selection feature matching with the phrase in its complement position to get its mother node labeled. However, VP has been elided in the narrow syntax (Baltin, 2012; Park, 2017). After VP is elided, the ellipsis site will be like a pro-form (Baltin, 2012). This pro-form cannot participate in the LA because it has no specific category feature. Whether the phrase is originally a VP, a DP, or an NP, after it is elided, it will be a pro-form, whose contents are recovered at LF/interface. In addition, do and so are also pro-forms (Baltin, 2012). Without a particular category feature, it is impossible for the pro-form to undergo c-selection feature matching with another head to have its mother node labeled in the LA. Given this, once VP is elided in the narrow syntax, it is not able to undergo feature matching with T when the LA, which is part of the spell-out operation (Bošković, 2016b, p. 59), starts to work. 1 Therefore, the ellipsis site will be located in a structure that is improperly labeled; it will not be assigned an interpretation at the interface, an unwelcome result.
(22) 
(17) to (19) can be explained in the same way. If the auxiliary also moves to Spec-CP together with VP, the head T will be a null weak head when the LA starts to work as part of the spell-out operation. As a null weak head, it must undergo c-selection feature matching with the phrase in the complement of T. However, at this point, the null weak head T cannot undergo feature matching with its complement because the latter has moved to Spec-CP, a position too far away from the null T (see Chomsky, 2015, for a similar account of the ungrammaticality of [1]). Consequently, the structure formed by the trace of the moved T and the trace of the VP cannot be properly labeled (see [22]), and ungrammaticality ensues. In (20), if T in Chinese loses its phonological features in the syntax, it will be a null weak head. Topicalization or ellipsis of VP will be ungrammatical for the same reason as in the case of (17) to (19).
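The account of (14) to (20) can be condensed into a small sketch. The class `Complement` and the functions `overt_t_labeled` and `null_t_labeled` are hypothetical names introduced only for this illustration; the category-less pro-form left by ellipsis is modeled as `category=None` and a complement fronted to Spec-CP as `local=False`, both of which are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Complement:
    category: Optional[str]  # None models the category-less pro-form left by ellipsis
    local: bool = True       # False models a complement fronted to Spec-CP


def overt_t_labeled(spec_visible: bool) -> bool:
    """Overt weak T: labeled through phi-feature matching with its Spec."""
    return spec_visible


def null_t_labeled(comp: Complement) -> bool:
    """Null weak T: labeled only through c-selection matching with a
    complement that is still local and still bears a category (vP here)."""
    return comp.local and comp.category == "vP"


# (16): overt "did" with an elided VP; labeling goes through the subject.
print(overt_t_labeled(spec_visible=True))
# (15a) without "had": null T whose complement is a pro-form.
print(null_t_labeled(Complement(category=None)))
# (18a): the auxiliary is fronted with VP, so null T has no local complement.
print(null_t_labeled(Complement(category="vP", local=False)))
# (19a) without "should": null T with an overt, in-situ complement.
print(null_t_labeled(Complement(category="vP")))
```

Only the last null-T configuration satisfies the condition, which matches the pattern of judgments reported for (15), (18), and (19).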
Sentences like (23) are also in favor of our assumption. Even if no overt auxiliary is available after the complementizer that in (23), there is a negative word not preceding the elided VP. As the negation is a syntactic head (Lobeck, 1995; Potsdam, 1997), the head NEG should be strong enough to provide a label. Consequently, the elided VP can get an interpretation at the interface. 2 Moreover, the null head T can also undergo feature matching with its overt complement to have its mother node labeled. This sentence is, of course, grammatical.
(23) Ted hoped to vacation in Liberia but his agent recommended that he not.
Under our approach the following phenomenon can also be explained.
(24) a. Is she a teacher? *Yes, she’s.
b. Who’s the tallest girl in our university? *Mary’s.
(25) a. Is she a teacher? Yes, she is.
b. Who’s the tallest girl in our university? Mary is.
Be is a clitic in (24). After it undergoes cliticization to she or Mary in the narrow syntax, T will have no phonological features when the LA is to be carried out, which makes T a null weak head. Therefore, different from the overt weak head T in (25), which can become strong by agreeing with a phrase in Spec-TP, it must undergo feature matching with the phrase in the complement of T. Unfortunately, its complement has been elided before the LA starts to work. As a result, part of the structure will be unlabeled.
Before we move to another kind of null head, it needs to be clarified that stripping is not a challenge to our assumption. We agree with Lobeck (1995, p. 27) in assuming that ellipsis is different from stripping, as (26) shows.
(26) Jane loves to study rocks and John [e] too.
Following Haegeman and Lohndal (2015), who recast Johnson (2014), we can assume that both John and too in the second conjunct of (26) have undergone movement to the clause internal left periphery consisting of Topic and Focus, which is right above vP. Therefore, after VP is elided, its empty category should be in a labeled structure projected by the head Focus (see section 2). If one does not approve of Haegeman and Lohndal’s (2015) analysis, and suggests that sentences like (26) involve TP deletion rather than VP deletion, as proposed by Kim (1997), then our hypothesis can be better supported because under this approach, after the remaining phrase in (26) moves to Spec-FocusP, TP is elided. Being one of the split heads of C (Rizzi, 1997), the head Focus should have phonological features although these features are not phonetically realized in the present context. Therefore, Focus should be a strong head, providing a label for its mother node. Furthermore, if Blümel (2017) and Miyagawa et al. (2019) are right in thinking that the declarative root can remain label-less, then (26) is still expected to be grammatical even if FocusP cannot be properly labeled. In such a case, it will be immaterial whether the head Focus has phonological features or not. The above argument is also applicable to verb gapping.
The fragment answer shown in (27) is not a challenge to our analysis either. Following Stainton (2006), we can assume that there is no ellipsis involved in this construction. Alternatively, we can follow Merchant (2005) in assuming that the fragment moves to the clause periphery first before the TP structure is elided. See also section 3.5 for discussion about the head C.
(27) Who finished this task first? John.
Null Verb Constructions
Verbs can be empty under certain circumstances. In such a case, the object of the verb cannot be relativized, topicalized, or elided, which is exemplified by (28) and (29).
(28) a. I think the students in MIT have all arrived, but I do not know whether our guests have both *(arrived).
b. These books, we must have all read, but I do not know whether our guests have both *(read).
c. The books that we must have all read are written by Chomsky, and the one that our guests have both *(read) is written by Chomsky, too.
(29) a. zhexie chengshi, women yibufen qu-guo, tamen (quan) dou *(qu-guo).
these city we some go-ASP they all all go-ASP
‘As for these cities, we have been to some of them, but they have been to all of them.’
b. women yibufen ren qu-guo zhexie chengshi, tamen (quan) dou *(qu-guo).
we some person go-ASP these city they all all go-ASP
‘Only some of us have been to these cities, but all of them have been there.’
The above facts can be captured neatly under our assumptions. These sentences are similar in that there is a quantifier in front of the empty categories left by VP. As argued convincingly by Sportiche (1988), the quantifier is adjacent to the NP with which it is merged before the latter undergoes movement. Considering the VP-internal subject hypothesis (Diesing, 1990; Kitagawa, 1986) and the requirement of (30), we anticipate that the position occupied by the stranded quantifier should be Spec-v*P, the base position of the subject (see also Bonet, 1990). Also, the functional category v* serves as the head of v*P in terms of the Government and Binding theory.
(30) Functional heads may be empty. But the abstract functional features must be licensed by the lexical material in its Specifier, and vice versa (Gasde & Paul, 1996, p. 265).
Nevertheless, v* does not have phonological features in the LA in (28) and (29) because the phonological features of v* and V are elided before LA starts to work (See also our diagnostic test for null weak heads). Put differently, it is a null weak head. Then, it must undergo c-selection feature matching with the phrase in its complement position to get its mother node labeled. However, its complement (namely VP) has been elided before LA is to start (See Johnson, 2001; Merchant, 2001; Schuyler, 2001, among others, for the assumption that structures like (28) and (29) are VP ellipsis constructions). Consequently, feature matching becomes impossible and the empty category will be in an unlabeled structure at the interface. The unlabeled structure is marked as “?,” which is illustrated in (31).
(31) 
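As a rough illustration only, the reasoning above can be restated as a small decision procedure. The following Python sketch is not part of the proposal itself: the names `Head`, `has_phon`, and `label`, as well as the placeholder notation for a label obtained through feature matching, are hypothetical and introduced here for exposition. The sketch also simply assumes that an available complement satisfies the head's c-selection requirement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Head:
    category: str   # e.g. "v*"
    has_phon: bool  # False if its phonological features were erased before LA applies


def label(head: Head, complement: Optional[str]) -> str:
    """Label of the object formed by merging `head` with its complement.

    `complement` is the complement's category, or None if the complement
    was moved or elided before LA applies. "?" marks a labeling failure.
    """
    if head.has_phon:
        # Strong head: it provides the label by itself.
        return head.category
    if complement is not None:
        # Null weak head: labeled through c-selection feature matching.
        # The string below is only a placeholder for whatever label results.
        return f"<{head.category}, {complement}>"
    # Null weak head with an empty complement: nothing to match against.
    return "?"


# (28)/(29) with the verb elided: null v*, VP already elided.
print(label(Head("v*", has_phon=False), None))   # ?
# A head with phonological features and an empty complement.
print(label(Head("v*", has_phon=True), None))    # v*
```

On this encoding, a labeling failure arises only when both conditions hold at once: the head lacks phonological features and its complement is empty when LA applies.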
Now let us consider the following sentence, which can be used in specific contexts, for example, when I am distributing fruit to students. It is a null verb construction rather than a typical gapping construction because the verbs in both conjuncts are empty. Therefore, it is a perfect example to test the effect of a null verb.
(32) ni san-ge pingguo, ta si-gen xiangjiao. you three-CL apple he four-CL banana ‘You have three apples and he has four bananas.’
(33) shows that the object cannot be topicalized or elided in such a case. Put differently, if the verb is null, the object position cannot turn out to be empty, which falls right into place under our approach.
(33) a. *ni san-ge pingguo, ta ye_____. you three-CL apple he also ‘You have three apples and he also has three apples.’ b. *ni san-ge pingguo, si-gen xiangjiaoi, ta ti. you three-CL apple four-CL banana he ‘You have three apples and as for the four bananas, he will have them.’ c. *san-ge pingguoi, ni ti; si-gen xiangjiaoj, ta tj. three-CL apple you four-CL banana he ‘You have three apples and he has four bananas.’
The following is the counterpart of (32) in which the verb is overt.
(34) ni chi san-ge pingguo, ta chi si-gen xiangjiao. you eat three-CL apple he eat four-CL banana ‘You have three apples and he has four bananas.’
As (35) shows, on this occasion, the complement of the verb can be empty, which lends stronger support to our hypothesis, that is, it is the phonological features of the verb that make all the difference.
(35) a. ni chi san-ge pingguo, ta ye chi _____. you eat three-CL apple he also eat ‘You have three apples and he also has three apples.’ b. ni chi san-ge pingguo, si-gen xiangjiaoi, ta chi ti. you eat three-CL apple four-CL banana he eat ‘You have three apples and as for the four bananas, he will have them.’ c. san-ge pingguoi, ni chi ti; si-gen xiangjiaoj, ta chi tj. three-CL apple you eat four-CL banana he eat ‘As for the three apples, you eat them; and as for the four bananas, he will eat them.’
Above, it is shown that when the phonological features on v*/V are erased, its complement position cannot be empty. Otherwise, there will be a labeling failure, and the sentence will be ungrammatical, accordingly. In the above examples, most of the verbs are transitive. Actually, even if the verb is intransitive, it cannot be elided either, which is demonstrated below.
(36) a. tamen dou xiao-le, women ye quan dou *(xiao-le). they all smile-ASP we also all all smile-ASP ‘They all smiled. So did we.’ b. tamen dou zou-le, ni ye *(keyi) _____. they all leave-ASP you also can ‘They all left. So can you.’
This is what we expect, and the structure shown in (31) is applicable to (36a). With the phrase quan dou “all” in Spec-v*P, the head v* is indispensable. If the verb is elided in the narrow syntax, v* will have no phonological features because it can attract no item to attach to it. Consequently, (36a) is doomed to be ungrammatical because the weak head v* has no way to get its mother node labeled. In (36b), it is the whole v*P that is elided. With T being empty, there is sure to be a labeling failure (see Section 3.1 for discussion of similar cases).
Null Cl Constructions
In Mandarin Chinese, the classifier can be omitted in a specific circumstance, namely when the number is one and the classifier is ge. The following is an example.
(37) wo bushi hen xihuan zhe (ge) haizi. I not very like this (CL) child ‘I do not like this child very much.’
Classifier omission is much more common in Beijing Dialect (Jin, 1995; Zhu, 1982, p. 220), which can be exemplified by (38).
(38) zhebian you yi (ge) zhuozi. here have one (CL) desk ‘There is a desk here.’
One piece of evidence in support of the null classifier hypothesis in (38) comes from tone sandhi. Yi “one” in sentences like (38) is always pronounced with a rising tone, irrespective of the tone of the following noun (like zhuozi “desk”), even though yi “one” and the noun appear to be adjacent. This indicates that there must be a null classifier between yi “one” and the following noun.
Interestingly, if the classifier ge has no phonological features, the complement of the classifier cannot be topicalized, relativized, or elided. This is in contrast with the case in which the classifier is overt, which can be seen clearly in (39) and (40).
(39) a. *wo bu xihuan zhe haizi, ye bu xihuan na____. I not like this child also not like that ‘I like neither this child nor that one.’ b. *haizi, wo xihuan zhe. child I like this ‘As for the children, I like this one.’ c. *wo xihuan zhe de haizi I like this DE child ‘the child that I like’ (40) a. wo bu xihuan zhe-ge haizi, ye bu xihuan na-ge____. I not like this-CL child also not like that-CL ‘I like neither this child nor that one.’ b. haizi, wo xihuan zhe-ge. child I like this-CL ‘As for the children, I like this one’.
It cannot be said that the sentences in (39) are unacceptable because the demonstrative pronoun zhe cannot stand alone (or because it must co-occur with a classifier). (41) shows that it is possible for the demonstrative zhe “this” or na “that” to stand alone as an object.
(41) ni bu xihuan zhe, ye bu xihuan na, ni daodi xihuan shenme? you not like this also not like that you really like what ‘You do not like this, and you do not like that. What on earth is your favorite?’
The above fact is expected. If the classifier has phonological features, it will be a strong head, thereby providing a label for the structure formed by the merger of CL and NP. By contrast, when the classifier is empty, it will be weak. Once its complement becomes an empty category, the whole structure will lack a label, resulting in a crash at the interface.
The following phrases can also lend support to our assumption.
(42) a. yi nanhai one boy ‘one boy’ b. *yi baozou a sound thrashing ‘a sound thrashing’
(42a) is grammatical because the null classifier only c-selects an individual noun. Being an individual noun, nanhai “boy” can undergo c-selection feature matching with the null classifier, and the whole structure can be properly labeled. However, baozou “sound thrashing” is a deverbal nominal phrase rather than an individual nominal one. Therefore, c-selection feature matching cannot be carried out successfully in (42b), and then there will be a labeling failure.
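The contrast in (42) turns on whether feature matching succeeds, not merely on whether a complement is present. A minimal sketch of that check is given below. The tables `C_SELECTS` and `NOUN_CLASS` and the function `null_head_can_label` are hypothetical names introduced for illustration, and the two-way noun classification encodes only what is said about (42), not a general lexicon.

```python
from typing import Optional

# Hypothetical c-selection requirement of the null classifier (illustrative).
C_SELECTS = {"CL_null": "individual_noun"}

# Hypothetical classification of the two nominals in (42).
NOUN_CLASS = {
    "nanhai": "individual_noun",    # "boy"
    "baozou": "deverbal_nominal",   # "sound thrashing"
}


def null_head_can_label(head: str, complement: Optional[str]) -> bool:
    """True if a null weak head gets its mother node labeled.

    Labeling succeeds only if a complement is present when LA applies
    and its class matches what the null head c-selects.
    """
    if complement is None:
        # Complement moved or elided: no feature matching is possible.
        return False
    return NOUN_CLASS.get(complement) == C_SELECTS[head]


print(null_head_can_label("CL_null", "nanhai"))  # True, cf. (42a)
print(null_head_can_label("CL_null", "baozou"))  # False, cf. (42b)
print(null_head_can_label("CL_null", None))      # False, cf. (39)
```

The third call corresponds to the cases in (39), where the complement of the null classifier has been topicalized, relativized, or elided.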
Null de Constructions
In Chinese, de can be used to connect different kinds of phrases (such as DP, PP, AP, and CP) to a DP. It has been regarded as a determiner (Saito et al., 2008), a relative clause head (Ning, 1993), and a ModP head (Paul, 2005), to name a few. Controversial as the nature of de is, de is regarded as a head in most studies. In this section, we will study different kinds of de constructions involving null elements to see whether the phonological features of de will exert an impact on the licensing of empty categories.
Null de used to connect a possessee with a possessor
De in the possessive phrase can be omitted in most cases. For example, in (43) nimen de xuexiao “your school” is a possessive phrase, in which de is the overt realization of the head Poss and it is optional for de to be realized in this phrase. Interestingly, if de is overt, the complement of de, namely xuexiao “school,” can be topicalized or elided. By contrast, if it is empty, topicalization or ellipsis of xuexiao is illicit, as is shown in (44) and (45).
(43) wo xihuan nimen (de) xuexiao. I like you DE school ‘I like your school.’ (44) a. wo xihuan nimen de xuexiao, ye xihuan tamen de (xuexiao). I like you DE school too like them DE school ‘I like your school and I like theirs, too.’ b. xuexiaoi, wo xihuan tamen de ti. school I like them DE ‘As for the school, I like theirs.’ (45) a. wo xihuan nimen xuexiao, ye xihuan tamen *(xuexiao). I like you school too like them school ‘I like your school and I like theirs, too.’ b. *xuexiaoi, wo xihuan tamen ti. school I like them ‘As for the school, I like theirs.’
With our assumptions, the above data fall into place. First, consider (44). With an overt form, de in (44) should be a strong head, which can provide a label for its mother node. Thus, after its complement is topicalized or elided, the empty category will be located in a labeled structure. Now consider (43), in which de might be null. If de turns out to be null, it should be a weak head. As a null weak head, it is unable to provide a label for its mother node. Nevertheless, it can get its mother node properly labeled through c-selection feature matching with the phrase in its complement position. Consequently, all the structures can be labeled. If the phrase in the complement of de has moved to the edge of the phase deP or has been deleted before LA starts to work, as (45) shows, then feature matching between the null de and its complement will become impossible. Consequently, a labeling failure will occur. The empty category of xuexiao “school” will be located in an unlabeled structure, which is shown in (46).
(46)
Perhaps, one may ask whether possessive phrases without de, like nimen xuexiao “your school” and wo baba “my father,” have the same structure as those with de, like nimen de xuexiao “your school” and wo de baba “my father.” In other words, is there a null de when de is invisible? Many scholars, like Duanmu (2007), give a positive answer to this question. Furthermore, some evidence shows that whether there is an overt de or not, the structure of the possessive phrase should be the same. For example, the binding ability of the possessor in (47) and the weak crossover effect in (48) remain unchanged when de is not overt (see also Tang, 1989, for discussion about reflexives).
(47) Zhangsani (de) na-ben shu hai-le tai/zijii/tazijii. Zhangsan DE that-CL book ruin-ASP him/ himself/himself ‘That book of Zhangsan ruins him.’ (48) *? tai, Zhangsani (de) baba bu xihuan ti. he Zhangsan DE father not like ‘Zhangsani’s father does not like himi.’
Furthermore, the possessor is obligatory even if there is no overt de. For example, omission of the possessor in (49) will render this sentence uninterpretable (see also Zhang, 2009). This indicates that the possessor should be an argument of the possessee. Put another way, it should be in an A-position, namely, in the Spec position of deP/PossP.
(49) *(ta) baba xihuan yuyanxue. he father like linguistics ‘His father likes linguistics.’
De used to connect two phrases without a possessive relation
Similar to the cases in which de is used to connect a possessor DP to a possessee DP, de must be overt so as to license an empty category when it is used to connect two phrases without a possessive relation. This can be seen in (50), in which de connects a DP or an AP, a PP, and a CP to a DP, respectively.
(50) a. zhuozii, wo xihuan mutou/hongse/hong *(de) ti. table I like wood/red/red DE ‘Speaking of the tables, I like the wooden/red one.’
b. shui, wo xihuan guanyu lishi *(de) ti. book I like about history DE ‘Speaking of the books, I like the one about history.’
c. diannaoi, wo xihuan ni mai *(de) ti. computer I like you buy DE ‘Speaking of the computers, I like the one you bought.’
In Mandarin Chinese, the complement of de can be empty when the de phrase is used to denote a person, such as a driver, writer, etc. Interestingly, de in such a case cannot be empty either, as shown in (51).
(51) wo shi kai che *(de). I am drive car DE ‘I am a driver.’
These facts can be accounted for easily along the above lines. Without phonological features, the head de will be weak. Thus, when its complement becomes an empty category, it will appear in an unlabeled structure. A crash at the interface will arise, accordingly.
Other Heads
In this section, we will have a brief discussion of other heads, some of which might appear to challenge our assumptions. As was previously argued, a head that loses phonological features in the syntax is weak. As a null weak head, it is incapable of providing a label for its mother node. If its complement is also empty, the empty category will be in an unlabeled structure. The empty category cannot be identified at the interface, and a crash will show up. However, the sluicing phenomenon shown in (52) seems to be contrary to our prediction.
(52) He is writing something, but you can’t imagine what. (Ross, 1967, p. 252)
It can be seen that in the embedded clause of the second conjunct in (52), the head C is empty and the complement of C is elided, but this sentence turns out to be acceptable. Why is it that the head C can still provide a label for the structure?
Instead of being a challenge to our hypothesis, the sluicing phenomena fall right into place under our assumption. In the following we will choose Merchant’s (2001) analysis of sluicing as an example for illustration because his analysis is the most influential (see Abe, 2015; Lasnik, 2001, among many others). Based on Merchant (2001), (52) should be derived by moving what from TP to Spec-CP of the embedded clause and then having TP elided, as shown in (53).
(53) You can’t imagine [CP what CQ [TP he is writing what]].
Now it will be clear why sluicing is not a challenge to our hypothesis. C has phonological features. However, such features never get phonetic forms in sluicing constructions. This produces an illusion that the phonological features on C get erased and become empty in the narrow syntax. In fact, the phonological features of C remain intact throughout the derivation. Since C is a strong head in sluicing constructions, it will be able to serve as a label. As a result, the empty category of TP will be in a labeled structure. Nothing is wrong with sentences like (52).
The following sentences are in support of our assumption.
(54) A: Max has invited someone. B: Really? Who (*has)? (Abe, 2015, p. 17) (55) En eller anden snakker med Marit, men vi ved ikke hvem (*der). (Danish) someone talks with Marit but we know not who C0 ‘Someone talks with Marit, but we do not know who.’ (Merchant, 2001, p. 68)
As can be seen, a wh-word and an overt C never co-occur in sluicing in English and other Germanic languages. This leads Merchant (2001) to the following generalization.
(56) Sluicing-COMP generalization: In sluicing, no non-operator material may appear in COMP.
This generalization shows that the phonological features of C in the sluicing constructions never get phonetic forms at PF. Given that phonological features on C are not elided in the syntax, C will be a strong head, capable of providing a label for its mother node.
(9), repeated as (57), can also be accounted for easily. In (57a) the head C can be realized as that. Put differently, the phonological feature of C can have phonetic realization. By contrast, the head C in (57b) is null. In other words, the phonological features of C are erased in the narrow syntax, making it impossible for C to be realized as that. Then, the head C in (57b) should be a null weak head. This head should have its mother node labeled by undergoing c-selection feature matching with its complement.
(57) a. I think that you are right. b. I think you are right.
After this brief discussion of the head C, let us turn to the head D. The phonological features of D can have overt phonetic form in English, as (58) shows. In this sentence, D should be a strong head, which can provide a label for its mother node. Therefore, nothing is wrong with it.
(58) I like the books.
What is interesting is (59). D is able to have a phonetic form, but it does not, which suggests that D in (59) should be a null weak head. Based on feature matching with its complement, D can have its mother node labeled. Thus, no structure in (59) is improperly labeled, and (59) is expected to be grammatical.
(59) I like books.
The following sentence supports our assumption. De in this sentence is used to connect an AP to an NP. Interestingly, it can be either overt or covert. When it is overt, it can serve as the label of its mother node. By contrast, when it is covert, it must undergo c-selection feature matching with its complement to have its mother node labeled. Given that the two structures are labeled in different ways, their labels may not be the same, and their meanings may be slightly different.
(60) wo xihuan xin (de) shu. I like new DE book ‘I like new books.’
This is really the case. The phrase xin shu “new book” is different from xin de shu “new book” in meaning: Xin “new” in the former is a defining characteristic of books and xin de “new” in the latter is an accessory property of books (Paul, 2005).
With the discussion above, let us consider null preposition constructions, which, in our opinion, can lend support to our assumptions. Consider (61).
(61) a. Where has John been? b. Which place has John been *(to)?
In (61a), the wh-word where is similar to a prepositional phrase in function (Huang, 1982). Since the head P is never realized at PF in (61a), it should be a strong head, providing a label for its mother node. That is why (61a) is grammatical. In contrast, the phonological features of P can be realized as to at PF in (61b). If these phonological features are deleted, that is, if the preposition is null when LA applies, the empty category left by wh-movement will be located in an unlabeled structure. Unacceptability will ensue, accordingly. (62) is also in favor of our assumption.
(62) a. Where did John fly to? He flew to Germany. (Miyagawa, 2017, p. 132) b. *Where did John fly? He flew to Germany.
With the overt preposition to, (62a) can be used to express the destination of flying. However, once this preposition is deleted as in (62b), where cannot undergo A’-movement if the same meaning is to be expressed.
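The discussion of C and P above relies on a three-way distinction that is easy to blur: a head may be pronounced, it may keep its phonological features but never receive a phonetic form at PF, or it may have its phonological features erased in the narrow syntax. Only the last kind counts as weak. The sketch below tabulates the predictions for some of the examples in this section. The enum `Status`, the function `predicted_ok`, and the way each example is encoded are illustrative assumptions based on our reading of the text, not a formal part of the analysis.

```python
from enum import Enum


class Status(Enum):
    OVERT = "pronounced"
    PF_SILENT = "features intact, no phonetic form at PF"
    ERASED = "features erased in the narrow syntax"


def predicted_ok(status: Status, complement_empty: bool) -> bool:
    """Prediction: only an erased (null weak) head with an empty
    complement yields a labeling failure."""
    weak = status is Status.ERASED
    return not (weak and complement_empty)


# (example, status of the head, is its complement empty when LA applies?)
CASES = [
    ("(52) sluicing, C", Status.PF_SILENT, True),
    ("(57a) overt 'that', C", Status.OVERT, False),
    ("(57b) null C", Status.ERASED, False),
    ("(61b) with 'to', P", Status.OVERT, True),
    ("(61b) without 'to', P", Status.ERASED, True),
]

for name, status, empty in CASES:
    verdict = "labeled" if predicted_ok(status, empty) else "labeling failure"
    print(f"{name}: {verdict}")
```

Under this encoding, only (61b) without to is predicted to fail, which matches the judgments reported above.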
Before concluding this section, we need to point out that there are alternative accounts for certain phenomena listed in this paper, such as (14) and (15); see, for example, Lobeck (1995), Merchant (2001, 2005), and Aelbrecht (2010). However, compared with them, our approach has obvious advantages. First, citing Miyagawa et al. (2019), we can say that “the advantage, we believe, of our approach is that we suggest a unified way to view the multitude of phenomena that require separate explanations.” Second, under our approach, many new data can be uncovered (see Sections 3.2 and 3.3, among others). Third, many facts in this paper cannot be explained with the existing hypotheses. Let us take Aelbrecht (2010) as an example for exposition. According to her, VP ellipsis is licensed by an ellipsis feature on Voice (namely, an [E]-feature) which must be checked against the head T via Agree. Although this can explain the contrast shown in (14) and (15), it is hard to account for the fact shown in (19b) where the auxiliary should be optional. Under this approach it will be a mystery why the phonological features of the auxiliary can affect the ellipsis of VP. Moreover, it is hard to explain Chinese ellipsis, as in (20), where no overt agreement features or inflectional variations are observable. Furthermore, it cannot be extended to the movement and ellipsis constraints shown in (19c), (24), and (25), and the facts in Sections 3.2 to 3.4. Interestingly, once our assumption is adopted, all these facts fall right into place.
Conclusion
Chomsky (2015) argues that there are two kinds of weak heads: lexical heads and T in languages with poor inflectional morphology. In this paper, we follow Richards (2016, 2020) in proposing that there is another kind of weak head, namely the head that loses its phonological features in the syntax. This approach to weak heads, together with the constraint that a structure must be labeled for interpretation, can explain the distribution of empty categories in different kinds of constructions. Thus, our syntax-phonology perspective presents a new approach to labeling and opens up new possibilities to account for the distribution of empty categories in a principled way.
Footnotes
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by a grant from the National Social Science Foundation of China awarded to Qilin Tian (Award Reference: 18BYY006).
Ethical Approval
Ethical approval is not applicable for this article.
Statement of Human and Animal Rights
This article does not contain any studies with human or animal subjects.
Statement of Informed Consent
There are no human subjects in this article and informed consent is not applicable.
Data Availability Statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
