Abstract
This article focuses on the choice of nominal forms in a language with articles (Catalan) in comparison to a language without articles (Russian). An experimental study consisting of several naturalness judgment tasks was run with speakers of these two languages. It showed that, in bridging contexts, native speakers’ preferences vary depending on whether reference is made to one single individual or to two disjoint referents. In the former case, Catalan speakers chose (in)definite NPs depending on their access to contextual information that guarantees a unique interpretation (or the lack of it) for the entity referred to. Russian speakers chose bare nominals as a default form. When reference is made to two disjoint referents (as encoded by the presence of an additional altre/drugoj “other” NP), speakers prefer an optimal combination of two indefinite NPs (i.e., un NP followed by un altre NP in Catalan; odin “some/a” NP followed by drugoj NP in Russian). This study shows how speakers of the two languages manage to combine grammatical knowledge (related to the meaning of the definite and the indefinite articles and altre in Catalan; and the meaning of bare nominals, odin and drugoj in Russian) with world knowledge activation and access to discourse information.
Keywords
1 Introduction
Discourse planning and discourse comprehension involve an assessment of the referent’s expectedness, which has been claimed to be partially influenced by the speaker’s access to the information status and the uniqueness status of the referent. In accordance with the Dual-Process Activation Model (Brocher et al., 2016; Brocher & von Heusinger, 2018), an approach designed to address the dynamicity in reference management in discourse processes (Arnold, 2010), a distinction is made between the activation of a concept and the activation of a referent. The former is associated with an NP’s descriptive material, that is, a noun’s information that, as an anchor, is used to establish coherence relationships in discourse by pre-activating referents’ concepts, which leads to easier reference integration. The latter is associated with the referential status of an NP depending on its definiteness marking (Burkhardt, 2006; Schumacher, 2009) and the bridging relations that can be inferred during discourse processing (Myers et al., 2010).
The present experimental investigation converges with these studies by examining how the speakers’ choice of nominal forms interacts with the accessibility of inferred referents and the informativeness of referential expressions in discourse contexts (Ariel, 1990; Arnold, 2010; Chafe, 1976, 1994; Davies & Arnold, 2019; Givón, 1983; Gundel et al., 1993). However, while previous psycholinguistic research on reference management in speech comprehension and production mainly focuses on languages with articles (namely, English and German), in the present experimental study, we examine speakers’ preferences for specific nominal forms in Catalan (a Romance language with articles) in comparison to Russian (a Slavic language without articles). 1 Thus, the novelty of this study is that we investigate which referential expressions are preferred (definite vs. indefinite, bare vs. specific indefinite) in the same experimental scenarios but in two languages with different grammatical systems. 2 The present study investigates the naturalness of definiteness marking and NP interpretations in inferred conditions, which allows us to show the extent to which information status and definiteness marking interact at the level of discourse comprehension.
We examine the distribution of simple singular sentence-initial NPs (in terms of Dryer, 2007) preceded by a definite or indefinite article in Catalan, and singular sentence-initial bare nominals or NPs preceded by the indefinite determiner odin “some/a” in Russian. We also study how the distribution of NPs changes if a second identical but not co-referential nominal is introduced into the immediately following discourse by means of l’/un altre “the other/another” in Catalan and of drugoj “other/another” in Russian. We aim at identifying intralinguistic patterns and cross-linguistic correlations in the use of these NPs.
Consider example (1) as an illustration of a bridging context (Asher & Lascarides, 1998), in which some inferential information is potentially evoked. The first utterance presents a discourse scenario and introduces an NP whose descriptive material pre-activates a conceptual frame (Barsalou, 1992; Fillmore, 1975; Löbner, 1985, 1988). The second utterance introduces a nominal expression in sentence-initial or topic position; 3 the referent of this nominal expression (henceforth Ref1) is discourse-novel, but its information status (Prince, 1981, 1992) is inferred (or inferable) from the common ground knowledge (Clark & Marshall, 1981; Stalnaker, 1974) that people have and share regarding the conceptual descriptive contents pre-activated in the previous context. The third utterance also introduces a nominal expression in sentence-initial/topic position; the concept of this nominal expression (henceforth Ref2) is on the one hand discourse-old, since its head noun is identical to the head of Ref1, but its referent is on the other hand discourse-new, since it is specified by altre/drugoj.
(1) a. Gairebé immediatament, l’ambulància va arribar al lloc dels esdeveniments. La/Una infermera va demanar que li donessin alguna cosa per aturar l’hemorràgia. L’/Una altra infermera, que era a prop, li va oferir la seva bufanda.
b. Počti srazu k mestu proisšestvija priexala brigada skoroj pomošči. (Odna) medsestra poprosila čto-nibud’, čtoby ostanovit’ krov’. Drugaja medsestra predložila svoj šarf.
“Almost immediately, the ambulance arrived at the place of the events. The/A nurse asked for something to stop the bleeding. The other/Another nurse that was nearby offered her her scarf.”
By investigating the speaker’s choice of nominal forms in a language with articles as well as in a language without articles, this article also indirectly tackles the question of the interpretation of these forms. Reference to a uniquely identifiable object in discourse is taken in this article as the core meaning of definite articles, and in this respect, even though the (in)definiteness semantic category has been claimed to be universal for human cognition and, thus, present in all languages (Abraham et al., 2007; Aguilar-Guevara et al., 2019 i.a.), we posit that it is not necessarily represented in all languages as a morphosyntactic category. 4 Hence, by considering and comparing the choice of nominal forms that speakers of typologically different languages make, we aim at broadening our knowledge of definiteness marking (or its absence) and the meaning associated with the different forms involved. We thus address the following two main questions, which further warrant the more specific questions set out in Section 3.1:
Q1: What is the naturalness of sentences with definite and indefinite NPs in languages with articles, and bare and non-bare NPs specified for indefiniteness in languages without articles, when reference is made to a discourse-novel but possibly identifiable as unique Ref1 that is conceptually pre-activated in the previous discourse?
Q2: What is the naturalness of sentences with definite and indefinite NPs in languages with articles, and bare and non-bare NPs specified for indefiniteness in languages without articles, when an (in)definite other NP for Ref2 is introduced in the subsequent discourse?
Our initial hypotheses regarding the naturalness of different nominal forms in the two languages are based on the semantic and psycholinguistic literature. In one-referent contexts, high naturalness of discourse-novel referents that are expressed by indefinite NPs is expected in Catalan, as this form is used for the introduction of new referents, but definite NPs should not be ruled out since they may be interpreted as anchored to a previous discourse by means of bridging. 5 The definite article is expected to uniquely identify one element, while the indefinite article asserts that the referred set is non-empty. As for Russian, bare nominals are predicted to be considered more natural, as they are expected to be compatible with both a definite and an indefinite reading, while the choice of odin is marked, conveying specificity. In two-referent contexts, Catalan definite NPs are expected to be ruled out for Ref1, as they imply a uniqueness interpretation that is violated in the presence of Ref2. In Russian bare NPs are predicted to be excluded for Ref1 only under the assumption that they primarily correspond to definites and involve reference to uniquely identifiable entities. Otherwise, they should be evaluated as fully natural.
We anticipate here that, in the present study, the preference for definite or indefinite NPs in one-referent contexts was found not to be significant in Catalan (see parallel results from psycholinguistic research on German; Burkhardt, 2006; Schumacher, 2009). Even though with inferred concepts definite NPs are expected to be more prominent and come with a higher activation status than indefinite NPs, when there is domain restriction or pre-activation, the definite article of Ref1 uniquely identifies one element, but the indefinite article still has the potential to refer to one referent. These findings highlight the complexity of (in)definiteness marking in discourse comprehension and lead us to explore alternative approaches to the semantic contribution of an (in)definite article, such as Szabó’s (2000) and Ludlow and Segal’s (2004).
The article is structured in the following way. In Section 2, we first review some theoretical notions related to definiteness and indefiniteness in languages with and without articles. Second, we go over the distribution of nominals in bridging contexts both in Catalan and in Russian when reference is made only to Ref1 and when a Ref2 is introduced in the discourse. In Section 3, we present in more detail the motivation and research questions pursued in this investigation. We then present the experimental study that consists of four tasks that aim at determining the preferences of native speakers of the two languages at the time of choosing the NP form when reference is made to one referent, either a unique or non-unique novel entity, or to two disjoint referents. We aim to find out whether speakers’ judgments change (in both Catalan and Russian) when participants are presented with only one referent or two. We also investigate whether speakers of Catalan have any preference for a definite or an indefinite NP for Ref1 when reference to Ref2 in a subsequent utterance is introduced by either a definite or an indefinite NP. To delimit the scope of our empirical study, we only investigated speakers’ preferences regarding singular count NPs in sentence-initial/topic position. The article finishes with a general discussion of the results of the experiment and conclusions in Section 4.
2 On the distribution and meaning of NPs in languages with and without articles
2.1 (In)definiteness: uniqueness, familiarity, specificity
The meaning of a definite description has been commonly associated with an existential and uniqueness condition of the referent (Frege, 1892; Russell, 1905; Strawson, 1950 i.a.). A singular definite description conveys that there is exactly one individual in the extension of the NP that satisfies the descriptive content of this NP, which means that a nominal interpreted as unique is construed in the narrowest possible domain (Abbott, 2004, 2010; Hawkins, 1978). 6 Indefinites, on the contrary, usually imply that there is more than one individual that satisfies the description of the nominal (Diesing, 1992). 7 In classical semantics, indefinites are treated as quantifiers that assert the non-emptiness of the set denoted by the noun (Heim, 2011).
In accordance with these antecedents, the meaning of the definite NP la infermera “the nurse” in the Catalan example in (1) is presumably about a contextually unique nurse, while the indefinite una infermera “a nurse” conveys the inference that the conditions for the definite article are not met (i.e., a contextually unique nurse is not identifiable).
In languages with articles, the choice of a definite form may also reflect familiarity (Christophersen, 1939; Heim, 1982; Kamp, 1981), according to which, the referent of the definite description is conceived as known/familiar to the speaker and the hearer. This familiarity may arise from a previous mention of the referent, from knowledge of the immediate situation, or from common ground knowledge that the participants of an act of communication share. Indefinite descriptions are said to introduce novel entities, while definite descriptions refer to already known entities. Thus, the same example given in (1) may be interpreted as an assertion about the nurse which is present at the moment of speaking (or has been previously mentioned) if the subject is definite, while the indefinite subject is about a nurse that is mentioned for the first time. In this sense, an indefinite expression is interpreted as conveying a lack of familiarity. 8
Assuming that a uniqueness interpretation is the main semantic contribution of the definite article, we would not expect this meaning to be present on bare nominals in languages without articles. In fact, the absence of a uniqueness semantics for Russian bare NPs has been recently proposed in Šimík and Demian (2020), and Seres and Borik (2021), according to whom, bare nominals in Russian have an inherent indefinite meaning, and any other interpretations they may have are achieved through a pragmatic enrichment of the default indefinite. This means that, out of context, the Russian sentence introduced by medsestra “nurse” in (1) may be interpreted as being about some nurse that asked for something to stop the bleeding, but a uniqueness reading is not excluded (see the English translation).
There is still another notion interfering with uniqueness and familiarity that is relevant in our study on the use and meaning of nominal expressions. This is specificity (Fodor & Sag, 1982), which determines whether a description is associated with some specific referent or not. 9 Definite nominals in languages with articles are most often considered to have a specific reference, but indefinites are either specific or non-specific. In a specific reading, the individual referred to by means of the NP is assumed to exist in the actual world, while in the non-specific reading the individual referred to may exist in some possible world. Specificity as a way of structuring relations among elements in discourse may be lexicalized in some languages (von Heusinger, 2002, p. 45). Thus, while Russian does not express (in)definiteness by means of articles, it has overt specificity markers such as the indefinite determiner odin “some/a.” 10 Odin belongs to a class of actualizers (Padučeva, 1985, 2017), elements that help establish the reference of a nominal in the absence of articles. Thus, example (2) shows that in Russian, under normal circumstances, a preverbal position excludes the indefinite interpretation of bare singular NPs (Geist, 2010). 11
(2) [The door opened and. . .]
a. Odna devochka voshla v dom.
a/one girl came into house
“A girl entered the house.”
b.*Devochka voshla v dom.
girl came into house
(Geist 2010, p. 193 ex.s (6a) and (5b))
Odin is also found in introductory contexts, where its referent is known to the speaker but not to the hearer, and the speaker may not want to give away any information to identify it (Padučeva, 2016). Based on these assumptions, one of the goals of this article is to investigate experimentally whether in bridging contexts and in sentence-initial/topic position Catalan speakers consider more natural a definite or an indefinite NP, and whether Russian speakers consider more natural a bare NP or an NP modified by odin.
2.2 Bridging and NP distribution. Introducing Ref1
Bridging is a function of interpretation of the discourse parts and how they are put together to maximize discourse coherence (Asher & Lascarides, 1998). As such, bridging is considered a by-product of discourse interpretation, when the speaker and the hearer may share some knowledge of the relations that hold between certain NPs that act as triggers and other NPs that act as associates. Thus, in our example in (1), the first utterance introduces the ambulance. At the level of concept activation, the hearer will prompt the essential knowledge (or frame; Fillmore, 1982) that relates a number of concepts denoted by ambulance to other concepts in the same frame. At the level of referent activation, the definite article that precedes the noun refers to the unique/salient object that has the property denoted by the noun in the context described. The bare NP brigada skoroj pomošči of the Russian version is assumed to refer to an entity that is familiar to both the speaker and the hearer in that specific context. When the second utterance is introduced, the speaker’s choice of a nominal form in topic position reveals specific bridging inferences. 12 Note that the materials used in our experimental study introduce an associative anaphoric relation between two NPs that do not share the same head noun and are distributed in two subsequent sentences/utterances.
In our bridging example in (1) the hearer/reader understands that, in the second sentence, the speaker refers to the nurse in the ambulance mentioned in the first one, relying on the common ground knowledge that ambulances may carry nurses. 13 As for Catalan (Brucart, 2002; Brucart & Rigau, 2002; Institut d’Estudis Catalans [IEC], 2016; Rigau, 1981), the use of the definite article in that context signals to the hearer/reader that there is only one nurse (the unique nurse) in the ambulance. 14 Definiteness marking in our materials corresponds to strong readings of the definite article (Abney, 1987; Longobardi, 1994, 2001, 2005 i.a.), according to which uniqueness and familiarity are guaranteed: the speaker’s choice of a definite NP depends on whether the referent is interpreted as unique in a given situation, and also on the background knowledge of the participants of communication, that is, whether they expect a prototypical situation to involve a unique referent of this kind. Otherwise, the speaker would have used an indefinite description, which would imply that there is more than one nurse in the ambulance. 15
As for Russian (Švedova, 1980/1982), bearing in mind the formal semantic literature on the meaning of bare nominals in languages without articles (Chierchia, 1998; Dayal, 2004), a bare NP in argument position, without any context, may have different interpretations, comparable to the ones that definite and indefinite nominals convey in languages with articles. If the familiarization condition is fulfilled, a bare NP is expected to be interpreted as a definite-marked NP; if the novelty condition is fulfilled, a bare NP is expected to be interpreted as an indefinite-marked NP (Geist, 2010). This notwithstanding, Russian does not freely allow indefinite readings for NPs, and indefinite aboutness topics must be specific (see (2)), which means that bare NPs can be interpreted as indefinite only if they belong to the comment (Geist, 2010). According to this view, (singular) bare NPs primarily correspond to definites and involve reference to uniquely identifiable entities. 16 In the specific case of our bridging materials, such as the example in (1), participants were presented with texts that introduced Ref1 by means of a bare nominal or an NP specified by odin. Given the extensive use of bare NPs, the unique or non-unique interpretation that Russian speakers can attribute to these nominal expressions is expected to be highly dependent on contextual information and concept activation.
To sum up, in the experimental study described in Section 3, we examine the hypothesis that the speaker’s choice of (in)definite NPs in Catalan and bare/odin NPs in Russian is grammatically constrained by the linguistic parameters that encode uniqueness, familiarity and specificity (or the lack of them) in the languages under study, and is sensitive to bridging inferences constructed at the time of discourse comprehension.
2.3 Bridging and NP distribution. Introducing Ref2
Bridging contexts are usually exemplified by pieces of discourse containing two subsequent sentences/utterances that lack an explicit link between two subsequent events. However, this is not necessarily so, since one can easily build a discourse in which a third utterance introduces a second referent (Ref2).
Consider our example in (1) once more. In the Catalan version, a second referent is introduced in sentence-initial/topic position by means of an NP specified by the lexical item altre “other,” which establishes a referent distinct from the one accessible in the preceding discourse. 17 The presence of an (in)definite altre NP for Ref2 after Ref1 introduces four possible combinations: (in)definite NP—(in)definite altre NP. Grammatical studies do not make any clear predictions about (in)definite NP correlations in the presence of altre. However, one would expect that, out of these four combinations, only indefinite NP—indefinite altre NP would be completely well-formed, the reason being that an NP containing altre introduces one more element into the discourse, presumably disrupting the uniqueness interpretation associated with a definite article that might possibly occur in the previous discourse, and perfectly matching the non-uniqueness interpretation associated with an indefinite one. When the article preceding altre is indefinite, the NP denotes an unidentified entity different from the one previously mentioned (e.g., La/Una infermera . . . Una altra infermera “the/a nurse . . . another nurse”). The combination of an indefinite altre NP preceded by a definite NP is not expected to illustrate a well-formed associative anaphoric use. If altre is preceded by a definite article, it denotes an entity different from the one that has been previously introduced by an (in)definite NP, but which is necessarily part of a known plurality (e.g., La/Una infermera . . . L’altra infermera “the/a nurse . . . the other nurse”). After a definite NP for Ref1 with a unique interpretation, the occurrence of l’altre NP (formally identical but not co-referential with the first one) is expected to be considered infelicitous by native speakers, as the second definite NP cancels the unique interpretation of the first one.
In this article, we aimed to investigate whether native speakers of Catalan show any preferences when judging discourses that contain an (in)definite NP followed by a second NP introduced by an (in)definite altre NP, and whether they consider ill-formed discourses that present definite NPs followed by (in)definite altre NPs.
Moving on to Russian, it must be pointed out that, in addition to bare and odin NPs, NPs can be specified by drugoj “other, another, different; next, second.” This lexical item introduces into the discourse a new referent, whose semantic status is underspecified (i.e., in the absence of articles it is unclear in Russian whether the referent of this element is unique or non-unique). As for our experimental study, we conjectured that the introduction of a second disjoint referent specified by drugoj into the discourse might give insights into the interpretation of the first element: specifically, it might constrain what the reading associated with the first referent can be. Therefore, we considered it worth investigating what the optimal form chosen by native speakers for the first referent in the context of a second referent of the form drugoj NP is: either a bare nominal or an NP specified by odin.
The linguistic literature on topicality holds that nominal expressions in sentence-initial position can be considered topics and that topicality strongly favors definiteness (and/or specificity) cross-linguistically (Cohen & Erteschik-Shir, 2002; Erteschik-Shir, 2007; Geist, 2010; Reinhart, 1981, among others). Under Dayal’s (2004) hypothesis that bare NPs refer uniquely, drugoj NPs in a subsequent utterance are not expected to be well-formed. However, under Šimík and Demian’s (2020) and Seres and Borik’s (2021) hypothesis that a bare nominal in a language without articles like Russian does not carry any uniqueness presupposition, a second identical but not co-referential NP appearing in discourse right after the first one is predicted to be acceptable and, consequently, drugoj NPs are expected to be accepted. The latter prediction was borne out by our experimental findings.
On the other hand, when an NP modified by an overt indefinite determiner (i.e., odin) occurs in sentence-initial/topic position, since this item does not encode a uniqueness-associated meaning, reference to a second distinct individual by means of drugoj is expected not to be problematic and, in fact, we show that it is the option preferred by native speakers.
To sum up, in this study we aimed to investigate in the two-referent condition whether native speakers of Catalan prefer any specific combination of (in)definite NP for Ref1 when an (in)definite altre NP was present for Ref2; likewise, we aimed to investigate whether native speakers of Russian prefer a discourse that distributes a bare nominal followed by drugoj NP over a discourse that distributes an odin NP followed by drugoj NP.
We now turn to our experimental study, based on the theoretical claims and empirical materials presented in Section 2.
3 Experimental study
3.1 Aim of the study
Our purpose was to empirically investigate speakers’ choices of NP forms in bridging contexts in Catalan and Russian when reference to unique individual entities is not clearly established from the previous discourse context but is potentially inferable from world knowledge and conceptual activation. We examined the speakers’ preference for definite versus indefinite NPs in Catalan, as well as the speakers’ preference for bare versus odin NPs in Russian, when there is only one referent (Ref1) in discourse and when a second referent (Ref2) is introduced. We thus aimed at answering the following research questions:
Q1: What is the naturalness of sentences with definite and indefinite NPs (in Catalan), and bare and odin NPs (in Russian) when reference is made to a discourse novel but possibly identifiable as unique Ref1? To address this question Task 1 was designed for Catalan and Task 2 for Russian.
Q2: What is the naturalness of sentences with definite and indefinite NPs (in Catalan), and bare and odin NPs (in Russian) for Ref1 when an (in)definite altre/drugoj NP for Ref2 is introduced in the subsequent discourse? To address this question Task 3 was designed for Catalan and Task 4 for Russian.
We also tackled two more questions:
Q3: Do speakers’ judgments change when participants (in both Catalan and Russian) are presented with only one referent or two?
Q4: What is the naturalness of an (in)definite NP for Ref1 when reference to Ref2 is introduced by an (in)definite NP (in Catalan)?
Notice that Q4 is more of an exploratory nature as there are no predictions that can be drawn from the previous literature in this regard.
Overall, we aimed (a) to check whether speakers’ judgments are in agreement with the predictions drawn from the Dual-Process Activation Model, the theories that address bridging relationships presented in Section 1, and the grammatical studies described in Section 2; (b) to check whether speakers’ naturalness judgments vary when only Ref1 is introduced or when reference is made to two disjoint referents; and (c) to find out whether the results obtained for a language with articles can be correlated with those obtained for a language without articles. We also checked whether the speaker’s choice of nominal forms is constant in different scenarios.
3.2 Design and materials
The experimental study consisted of four naturalness judgment tasks (two for Catalan speakers and two for Russian speakers). All test items started with a brief scenario-introducing sentence. Half of the items included a sentence with a sentence-initial singular (in)definite NP in Catalan (Task 1) or a singular bare/odin NP in Russian (Task 2), and another half of the items included, as well, one more sentence with a sentence-initial singular (in)definite altre NP in Catalan (Task 3) or drugoj NP in Russian (Task 4). 18 The same seven scenarios were used for each task of the study. See Appendix A for the complete set of materials, which are labeled: ambulance—nurse, private company—programmer, local shopping centre—guard, popular blog—author, school trip—teacher, butchery—butcher, and office—manager.
In Task 1 (for Catalan), the reference to a discourse-novel entity (Ref1) is made by means of either a definite or an indefinite NP (3), which gives two possible combinations for each experimental scenario, resulting in 14 items (2 conditions: def/indef × 7 scenarios).
(3) School trip—teacher example item
Al tren de rodalies hi havia un grup escolar: els estudiants jugaven a cartes i deien paraulotes. La/Una professora llegia el diari sense fer-los cas.
“In the commuter train there was a school group: the students were playing cards and swearing. The/A teacher was reading a newspaper not paying attention to them.”
In Task 2 (for Russian), the reference to a discourse-novel entity (Ref1) is made by means of either a bare NP or an NP specified by odin (4), which gives two possible combinations for each experimental scenario, resulting also in 14 items (2 conditions: bare/odin × 7 scenarios).
(4) Private company—programmer example item
Po raznym pričinam iz ètoj častnoj kompanii načali uvol’njat’sja sotrudniki. (Odna) programmistka uvolilas’ polgoda nazad bez ob’’jasnenij.
“For different reasons workers started to leave this private company. The/A programmer left half a year ago without any explanation.”
In Task 3 (for Catalan), a second referent is added. Although Ref1 is introduced by an (in)definite NP (exactly as in Task 1), Ref2 is introduced by an (in)definite altre NP. This gives four possible combinations for each of the seven contexts (5), resulting in 28 experimental items (2 conditions for Ref1: def/indef × 2 conditions for Ref2: def/indef × 7 scenarios). The items were pseudorandomized and divided into two distinct surveys in such a way that each participant had to rate only 14 items, ensuring that each group of participants was presented with the same number of definite and indefinite correlates for both Ref1 and Ref2 NPs. The distribution of the items was as follows: the participants of group 1 rated the items with indef1/indef2 (scenarios 1, 4, 5, 7), indef1/def2 (scenarios 2, 3, 6), def1/indef2 (scenarios 4, 5, 7), def1/def2 (scenarios 1, 3, 4, 6); the participants of group 2 rated the items with indef1/indef2 (scenarios 2, 3, 6), indef1/def2 (scenarios 1, 4, 5, 7), def1/indef2 (scenarios 1, 2, 3, 6), def1/def2 (scenarios 1, 5, 7). See Appendix A for the scenarios.
(5) Office—manager example item
La gent de l'oficina no té complexos. En absolut, cap complex. La/Una responsable treu les pinces i s’arrenca els pèls de la barbeta. L’/Una altra responsable menja pollastre al despatx.
“The people in the office don’t have complexes, not at all. The/A manager takes tweezers out of her pocket and pulls out hairs from her chin. The other/Another manager eats chicken at the office.”
In Task 4 (for Russian), a second referent was added. Although Ref1 is introduced by a bare or an odin NP (exactly as in Task 2), Ref2 is introduced by a drugoj “other, another” NP (6), which gives two possible combinations for each experimental scenario, resulting in 14 items (2 conditions for Ref1: bare/odin × 1 condition for Ref2: drugoj × 7 scenarios).
(6) Local shopping centre—guard example item
Včera večerom v našem rajone ograbili krupnyj magazin. (Odin) oxrannik smotrel televizor i ničego ne slyšal. Drugoj oxrannik prosto spal.
“Last night, a shopping center was burgled in our area. The/A guard was watching TV and did not hear anything. The other/Another guard was just sleeping.”
3.3 Procedure
A total of 263 participants, recruited through social media, volunteered for this study (194 women, 67 men, 2 other; Mage = 41.07, SD = 13.98): 50 completed Task 1, 55 Task 2, 108 (51 in group 1, 57 in group 2) Task 3, and 50 Task 4. A sociolinguistic questionnaire administered right before each task collected demographic data, including participants’ age, sex, level of studies, linguistic training, and global daily use of Catalan and Russian. See Appendix B for details.
Participants were given short instructions to rate every piece of discourse on a scale from “not natural at all” to “completely natural,” using a horizontal slider from 0 to 100 under each item. The tasks were conducted online using Alchemer software (https://www.alchemer.com). The median duration of each task was 5 min 54 s.
3.4 Results
In Section 3.4.1, results are given on the analyses of the forms used for Ref1, in which the effects of Language and NRefs (i.e., Number of Referents in the discourse) are also taken into consideration, thus addressing questions Q1, Q2, and Q3. In Section 3.4.2, results are given for the various (in)definite NP combinations for Ref1 and Ref2 in Catalan, thus addressing Q4.
3.4.1 Naturalness judgments for Ref1 and Ref2 forms in Catalan and Russian
Figure 1 shows the perceived naturalness of the two forms in which Ref1 NPs (definite/indefinite in Catalan and bare/odin in Russian) were presented along the four naturalness judgment tasks conducted (i.e., both in Catalan and in Russian, when only one referent or two referents are introduced in the discourse). This figure shows a preference for definite/bare NPs (over indefinite NPs) when reference is made only to Ref1, and a preference for indefinite NPs (over definite/bare NPs) when a Ref2 is also present in the discourse. These preferences are found in the two languages, though they are much clearer for Russian speakers than for Catalan speakers. In fact, Catalan speakers show no significant preferences for definite or indefinite NPs when presented in one-referent contexts (74.98% vs. 70.87% in the graph), and no significant preferences either when the indefinite NP is presented in a one- or a two-referent context (70.87% vs. 68.80% in the graph). Catalan and Russian speakers do not significantly differ regarding the acceptability of definite/bare NPs in either one-referent contexts (74.98% vs. 74.62% in the graph) or two-referent contexts (60.33% vs. 50.29% in the graph). The remaining contrasts presented in Figure 1 have been found to be significant in the statistical analysis presented below.

Figure 1. Perceived naturalness of (in)definite NPs in Catalan versus bare/indef. odin NPs in Russian for Ref1 when only one referent is presented in the scenarios (one ref.) and when a second referent Ref2 is present in the scenarios (two refs.).
The glmmTMB package in R was used for the analysis of the data obtained from the experiment. A beta mixed-effects model was run with the perceived naturalness as the dependent variable. The independent variables Language (Catalan, Russian), NRefs (i.e., number of referents in the discourse; one, two), Def1 (i.e., the definiteness condition for Ref1: {def, indef} in Catalan, {bare, indef} in Russian), and all their possible interactions were set as fixed factors. The selected random-effects structure contained a random slope for NRefs × Def1 by subject plus a random slope for NRefs by scenario. In the report below, the omnibus test results are provided (which have been obtained using the car package) plus the output of a series of pairwise tests performed with the emmeans package, which include a measure of effect size by using Cohen’s d.
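A beta distribution is defined on the open interval (0, 1), so slider ratings on a 0 to 100 scale have to be rescaled before such a model can be fitted, and ratings of exactly 0 or 100 need some treatment. The text does not state how this was handled. The sketch below therefore assumes one commonly used option, the compression y' = (y(n − 1) + 0.5)/n, where n is the number of observations (often attributed to Smithson and Verkuilen, 2006). The function name `to_open_unit_interval` and the example ratings are hypothetical.

```python
def to_open_unit_interval(ratings, lo=0.0, hi=100.0):
    """Map slider ratings in [lo, hi] to values strictly inside (0, 1).

    Assumed preprocessing step, not taken from the study itself.
    """
    n = len(ratings)
    if n == 0:
        return []
    # Step 1: rescale the slider range to the closed interval [0, 1].
    unit = [(r - lo) / (hi - lo) for r in ratings]
    # Step 2: pull exact 0s and 1s slightly inward so that a beta
    # likelihood is defined for every observation.
    return [(y * (n - 1) + 0.5) / n for y in unit]


ratings = [0, 25, 50, 75, 100]  # made-up example ratings
rescaled = to_open_unit_interval(ratings)
print(rescaled)
```

The transformation preserves the order of the ratings and shrinks toward 0.5 less and less as the number of observations grows.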
Table 1 summarizes the results of the omnibus test. Significant results were found for two of the paired interactions and for the triple interaction.
Table 1. Omnibus Test Results for the Naturalness Judgements for Ref1 and Ref2 Forms in Catalan and Russian.
The significant effects of this model can be better explained by describing the pairwise contrasts associated with the triple interaction Language × NRefs × Def1.
Concerning Def1 as the contrast field, in one-referent contexts, Russian speakers prefer bare NPs over odin NPs (which should be considered a marked indefinite expression) (d = 1.391, p < .001), but Catalan speakers show no significant preference for definite or indefinite forms (which should be considered equally well-formed in the contextual settings under study) (d = 0.031, p = .843). In two-referent contexts, both groups show the significant preference for indefinite NPs (over definite/bare NPs) described above, though the effect size for Russian speakers (d = −1.397, p < .001) is more than four times that of Catalan speakers (d = −0.309, p = .027). In the significant double interaction NRefs × Def1, in which the results of the two languages are taken together, definite/bare NPs are preferred over indefinite NPs in one-referent contexts (d = 0.711, p < .001), and indefinite NPs are preferred over definite/bare NPs in two-referent contexts (d = −0.853, p < .001).
Looking at NRefs as the contrast field, definite/bare NPs were given higher naturalness rates when they appear in one-referent contexts than in two-referent ones by the two language groups, though the effect size is about twice as large for Russian speakers (d = 1.267, p < .001) as for Catalan speakers (d = 0.625, p = .004). Focusing on indefinite NPs, Russian speakers accepted them more in two-referent contexts than in one-referent contexts (d = −1.521, p < .001), but Catalan speakers showed no significant preference (d = 0.285, p = .188). Both of the double interactions found to be significant involve NRefs. In Language × NRefs, in which the results of the two definiteness strategies are taken together, Catalan speakers gave higher naturalness rates to contexts with only one referent than to those with two (d = 0.455, p = .016), whereas Russian speakers show a nonsignificant pattern in the opposite direction (d = −0.127, p = .484). In NRefs × Def1, in which the results of the two languages are taken together, definite/bare NPs are more accepted in one-referent contexts than in two-referent contexts (d = 0.946, p < .001), whereas the inverse pattern is found for indefinite NPs (d = −0.618, p < .001).
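The pooled contrasts reported for the double interaction NRefs × Def1 can be related to the language-specific contrasts of the triple interaction by simple arithmetic: each pooled effect size quoted above coincides, up to rounding, with the unweighted mean of the two corresponding language-specific values. The sketch below checks this using only figures quoted in the text. That the pooled contrasts were obtained by equal-weight averaging over the two languages is an assumption (it is the default behaviour of marginal means in emmeans), and the names used are illustrative.

```python
# (Catalan d, Russian d, pooled d) as quoted in the text
checks = {
    "Def1 contrast, one referent": (0.031, 1.391, 0.711),
    "Def1 contrast, two referents": (-0.309, -1.397, -0.853),
    "NRefs contrast, def/bare NPs": (0.625, 1.267, 0.946),
    "NRefs contrast, indefinite NPs": (0.285, -1.521, -0.618),
}


def unweighted_mean(a, b):
    """Equal-weight average of two language-specific effect sizes."""
    return (a + b) / 2


for label, (catalan, russian, pooled) in checks.items():
    mean_d = unweighted_mean(catalan, russian)
    print(f"{label}: mean = {mean_d:.3f}, reported = {pooled:.3f}")
```

Read this way, the pooled preference for definite/bare NPs in one-referent contexts is driven almost entirely by the Russian contrast, since the Catalan contribution (d = 0.031) is close to zero.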
Concerning Language as the contrast field, the two linguistic communities show no significantly different naturalness rates when examining the def/bare NPs, in either one-referent contexts (d = −0.374, p = .069) or two-referent contexts (d = 0.268, p = .244). However, the two communities differ when looking at indefinite NPs: whereas in one-referent contexts speakers of Catalan provide higher naturalness rates than speakers of Russian (d = 0.985, p < .001), in two-referent contexts speakers of Russian provide higher naturalness rates than speakers of Catalan (d = −0.821, p < .001).
3.4.2 Naturalness judgments for (in)definite NP combinations in Catalan
In line with the theoretical hypothesis presented above, according to which altre “other” encodes disjoint reference and forces the interpretation that two subsequent NPs refer to distinct entities, it was expected for Catalan that the most natural discourse would be one in which an indefinite NP (Ref1) was followed by an indefinite altre NP (Ref2). Similarly, the least natural discourse would be the one in which a definite NP (Ref1) was followed by a definite altre NP (Ref2). Recall that these expectations come from the fact that an indefinite NP does not presuppose uniqueness of the referent and, thus, a second indefinite altre NP may be introduced to the reference domain without creating any conflict for the disjoint reference.
Figure 2 confirms experimentally the main predictions that follow from previous grammatical studies and shows the existence of slight differences among the four possible combinations of nominal forms chosen for Catalan Ref 1 and Ref 2. Still, it provides evidence that the difference in the naturalness of the four possible combinations is small, and that in a dynamic update of discourse information native speakers maximize discourse coherence (Asher & Lascarides, 1998) between (in)definite NPs for Ref1 and Ref2 over default world knowledge and grammatical expectations.

Figure 2. Perceived naturalness of (in)definite NPs for Ref1 and Ref2 in Catalan.
A beta mixed-effects model was again run with the perceived naturalness as the dependent variable. This time, the independent variables Def1 (i.e., the definiteness condition of the first referent in the discourse; {def, indef}), Def2 (i.e., the definiteness condition of the second referent in the discourse; {def, indef}), and their paired interaction were set as fixed factors. The selected random-effects structure contained a random slope for both Def1 and Def2 by subject plus a random intercept for item.
The main effect of Def1 was found to be significant, χ2(1) = 13.683, p < .001: independently of the form of Ref2 (but in the presence of Ref2), Catalan speakers provided higher naturalness rates when Ref1 was indefinite than when it was definite (d = −0.346, p < .001). The remaining fixed factors failed to reach significance: Def2, χ2(1) = 1.462, p = .227, and the paired interaction Def1 × Def2, χ2(1) = 2.766, p = .096. The difference between definite and indefinite forms for Ref2 in a context in which Ref1 was indefinite, modeled as one of the paired contrasts of the paired interaction (66.84 vs. 70.80 in Figure 2 above), also failed to reach significance (d = −0.211, p = .053).
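For a chi-square statistic with one degree of freedom the p-value has a closed form, p = erfc(√(x/2)), which makes the values reported above easy to check. The snippet below applies it to the three statistics quoted in the text, assuming they are chi-square tests with df = 1 as reported. `chi2_sf_df1` is a hypothetical helper name.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 df.

    If X = Z**2 with Z standard normal, then
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# Test statistics as quoted in the text
reported = {"Def1": 13.683, "Def2": 1.462, "Def1 x Def2": 2.766}
p_values = {name: chi2_sf_df1(x) for name, x in reported.items()}

for name, p in p_values.items():
    print(f"{name}: chi2(1) = {reported[name]}, p = {p:.4f}")
```

The recomputed values agree with the reported ones: only Def1 falls below the conventional .05 threshold.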
4 General discussion and conclusion
In this experimental study, we investigated the choice of nominal forms that native speakers of a language with articles (Catalan) and native speakers of a language without articles (Russian) make in different contextual settings that involve bridging inferences and reference management. More specifically, we focused on the choice of singular nominal expressions (i.e., (in)definite NPs and bare/odin NPs for Ref1, (in)definite + altre + NPs and drugoj + NPs for Ref2) occurring in sentence-initial/topic position. The main purpose was to examine speakers’ preferences for nominal forms in discourse comprehension by comparing the results provided for definite/indefinite/bare NPs in two typologically different languages. This research connects with Burkhardt’s (2006) and Schumacher’s (2009) studies on the processing of definite and indefinite NPs, respectively.
The findings of the current study are discussed with respect to the predictions made in semantic studies (Heim, 1982, 2011; Löbner, 1985, 1998 a.o.), the Dual-Process Activation Model on reference management (Brocher et al., 2016; Brocher & von Heusinger, 2018), and linguistic and psycholinguistic approaches to bridging inferences in discourse comprehension (Asher & Lascarides, 1998; Myers et al., 2010; Poesio & Vieira, 1998).
Similar to what is usually claimed for other languages with articles, Catalan indefinite NPs are generally assumed to introduce new referents into the discourse, while definite NPs are assumed to introduce known/familiar referents. This generalization also applies when the speaker aims at establishing relationships between two co-referential (in)definite NPs in discourse. In bridging contexts it has been suggested that, rather than a binary relation between a first-mention indefinite NP and a definite co-referential one, it appears to be more appropriate to postulate a scale ranging from those definite NPs that are uniquely interpretable by means of world knowledge (the ambulance in (1), which can be considered a self-sufficient definite description) to those definite NPs that depend on a previous anchor (the nurse in the same discourse). What is clear is that definites are not necessarily discourse-old and hearer-old but can be frame-activated. Concerning Russian, grammatical studies have acknowledged that bare singular NPs in argument position can receive a definite or an indefinite interpretation (Švedova, 1980/1982). Although the definite interpretation of bare NPs has been assumed to arise in parallel conditions under which the definite article is used in languages with articles (i.e., for known, familiar, salient referents), the option of an indefinite reading for bare singular NPs has been claimed not to be excluded: bare NPs can be interpreted as indefinite when they belong to the comment (Borik, 2016; Geist, 2010). Moreover, this indefinite interpretation of bare NPs is to be contrasted with the meaning of indefinites preceded by odin, which are considered in the literature to be the ones preferred by speakers for specific aboutness topics. 
What remains to be investigated from a contrastive perspective is the specific choice of nominal forms in a dynamic approach to discourse and utterance interpretation, especially when various bridging relations are activated.
Our findings from Tasks 1 and 2 provided clear evidence that in the seven scenarios used, which only activated associative anaphora relationships for Ref1, Catalan speakers show no statistical preference for definite or indefinite NPs, while Russian speakers show a statistical preference for bare NPs over odin NPs. These results support the conclusion that Ref1 had already been conceptually pre-activated in the previous context and in this sense was discourse-old, but at the same time it introduced a new entity into the discourse and therefore was perceived as discourse-new. This accounts for the fact that Ref1 can be expressed by either a definite or an indefinite NP. Therefore, we confirm previous (corpus) studies on languages with articles (Burkhardt, 2006; Fraurud, 1990; Poesio & Vieira, 1998; Schumacher, 2009). As for Russian, the preference for bare nominals supports the hypothesis that they can be interpreted as either conveying definiteness or the lack of it (Švedova, 1980/1982). Our results provide empirical support for this hypothesis when the perceived naturalness of bare NPs in Russian (74.62%) is compared with the naturalness of definite and indefinite NPs in Catalan (74.98% and 70.87%, respectively). Unlike bare nominals, the lower perceived naturalness of odin phrases (48.61%) confirms the special status of this indefinite specific lexical item for the introduction of novel referents. Our results support the prediction made in the literature that the use of bare NPs is generally felicitous for discourse-new referents, which could or could not be interpreted as aboutness topics, while the use of odin NPs was dispreferred, a result that is in accordance with the claim of the non-obligatoriness of this item (Borik, 2016; Padučeva, 1985).
Note that in all the seven experimental scenarios the sentence-initial bare subject can be interpreted as a topic: even though it is discourse-new, it is anchored to the preceding discourse by means of bridging, which explains a possible definiteness-like interpretation of the bare NP. An important conclusion is that inferred (inferable) NPs take the default definite/bare nominal form.
Overall, the results obtained for Catalan and Russian when reference is made only to Ref1 support the construction of bridging inferences at the time of discourse comprehension (Myers et al., 2010, and references therein). One might also consider that definite and indefinite descriptions deserve a unitary semantic analysis as hypothesized by Szabó (2000) and Ludlow and Segal (2004), for whom the semantic import of both definite and indefinite descriptions is existential quantification, and implications of uniqueness, salience and familiarity for definite descriptions, parallel to implications of non-uniqueness, lack of salience and novelty for indefinite descriptions are basically pragmatic, contextually driven.
Our findings from Tasks 3 and 4, when Ref2 was present, revealed a preference for indefinite NPs. Specifically, Catalan speakers showed that in the presence of Ref2 (and independently of its form) Ref1 was preferred to be indefinite. Russian speakers showed a significant preference for odin NP in the presence of a subsequent drugoj NP. We conclude that the presence of a disjoint marker (altre/drugoj) constrains indefiniteness.
These results are also interesting because they support the conclusion that different sentence-initial/topic NPs introduce different domains of quantification. As for Catalan, the fact that sequences of the form la ... una altra (def-indef) and la ... l’altra (def-def) are not discarded, but only dispreferred, is accounted for if one assumes that there is no unique domain for both nominal descriptions, even when a definite article is used in the first mention (Szabó, 2000). As for Russian, the statistical preference for odin NPs in the presence of drugoj NPs also supports the view that odin NPs, in contrast to bare singular NPs, introduce a domain of quantification that opposes a specific referent to another referent introduced in a subsequent sentence. In addition, we are committed to claiming that the event that involves odin NPs is distinct from the event that involves drugoj NPs.
We now turn to our research question of whether speakers’ judgments change when participants (in both Catalan and Russian) have access to only one referent or two. Taking together the results of the four tasks (Figure 1), we conclude that both Catalan and Russian show the following: in one-referent contexts, definite/bare NPs are preferred over indefinite ones, while in two-referent contexts the preference is for indefinite/odin NPs over definite/bare ones. This suggests that speakers of both languages give preference to an overtly marked indefinite when its non-uniqueness is unambiguously established in the presence of a second disjoint referent.
We have also shown that differences (both in one-referent and in two-referent contexts) are larger in Russian than in Catalan, which follows from the fact that the two languages have different nominal systems. The grammatical difference between a bare NP and an NP specified by odin for a Russian speaker (both in terms of distribution and meaning) is greater than the difference between a definite and an indefinite NP for a Catalan speaker. Odin NPs are distinct from indefinite NPs in languages with articles, whereas bare NPs have parallels to both definite and indefinite NPs. The Russian pattern in the two-referent condition (i.e., bare NPs are accepted up to 50.29% when followed by drugoj NPs) leads us to further conclude that bare NPs have a reading consistent with an indefinite interpretation (Geist, 2010; cf. Dayal, 2004).
Concerning the final question of our research, on the naturalness of (in)definite NP combinations in Catalan, our findings confirm the following hierarchy of subsequent nominal expressions for Ref1 and Ref2: indefinite–indefinite > indefinite–definite > definite–indefinite > definite–definite (Figure 2), thus enriching the general pattern of distribution of definite and indefinite NPs expected from grammatical studies. The puzzle that these results raise is that the definite–indefinite and definite–definite combinations are given around 60% acceptability. Note, however, that this puzzle disappears once we assume that the semantic contribution of a definite and an indefinite article is the same (existential quantification), and that the uniqueness implications usually associated with definite descriptions are not semantic in nature but pragmatically inferred by constructing bridging inferences between subsequent events.
In sum, in this article, we have compared two typologically different languages with respect to the choice of nominal forms in preverbal position. In contexts where the referent cannot necessarily be identified as unique or non-unique, Russian speakers prefer to use a bare nominal, and Catalan speakers vacillate between a definite and an indefinite NP. In those contexts where the non-uniqueness of the referent is implied due to the co-presence of a second disjoint referent, speakers of both languages under study prefer to use an NP specified for indefiniteness—preceded by an indefinite article in Catalan and by an indefinite determiner encoding specificity in Russian. For speakers of both languages, this is the optimal choice for referring to two NPs that are disjoint in reference; for speakers of Catalan, there are three additional suboptimal combinations. Finally, this study has shown the existence of intralinguistic patterns and cross-linguistic correlations in the use of NPs when we compare a language that has articles with a language that lacks them. Overall, our findings on typologically different languages enlarge our knowledge of how languages work from a dynamic perspective to information updating.
Footnotes
Appendix A
Appendix B
Sociolinguistic data.
| Variable | Category | Task 1 | Task 2 | Task 3: group 1 | Task 3: group 2 | Task 4 |
|---|---|---|---|---|---|---|
| Language | | Catalan | Russian | Catalan | Catalan | Russian |
| Total number of participants | | 50 | 55 | 51 | 57 | 50 |
| Duration of the task in min (M, SD) | | 6.49 (2.30) | 6.22 (3.65) | 6.70 (2.76) | 7.13 (3.41) | 6.24 (3.64) |
| Age in years (M, SD) | | 45.37 (15.94) | 32.22 (5.05) | 43.31 (12.67) | 50.57 (16.19) | 33.42 (4.73) |
| Gender | Female | 48 | 37 | 31 | 34 | 44 |
| | Male | 7 | 12 | 20 | 22 | 6 |
| | Other | 0 | 1 | 0 | 1 | 0 |
| Studies | Compulsory | 0 | 1 | 3 | 1 | 0 |
| | Post-comp. | 1 | 8 | 8 | 9 | 0 |
| | BA ongoing | 2 | 4 | 1 | 9 | 1 |
| | BA finished | 24 | 16 | 17 | 17 | 24 |
| | MA or more | 28 | 21 | 22 | 21 | 25 |
| Linguist | No | 22 | 27 | 27 | 36 | 19 |
| | Yes | 33 | 23 | 24 | 21 | 31 |
| Global daily use of L1 (M, SD) | | 75.82 (23.97) | 68.06 (31.39) | 72.18 (26.30) | 83.34 (19.47) | 61.74 (33.28) |
Ethical approval
The experimental study was carried out following the regulations of the Ethics Committee on Animal and Human Experimentation of the Universitat Autònoma de Barcelona, under the approved experimental protocol CEEAH—4442.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study is supported by the Spanish MINECO (grant POID2020–112801GB–100), by the Generalitat de Catalunya (grant 2021SGR00787), and Margarita Salas (Next Generation—EU) grant awarded to the first author.
