Modals and Quasi-Modals in English World-Wide

Abstract

This study explores the distribution of modals and quasi-modals in the twenty English dialects represented in the Global Web-based English Corpus (GloWbE). Intervarietal trends are observed across and within the Englishes of the “Inner circle” and “Outer circle.” Ratios calculated for onomasiological pairings of modal expressions suggest that Inner circle varieties tend to be associated more closely than Outer circle varieties—and “epicentral” varieties more so than non-epicentral ones—with trends of frequency change that have been identified in previous diachronic studies of the reference varieties, British and American English. A further type of change is revealed by semantic analysis: Inner circle varieties tend to embrace epistemic modality more readily than Outer circle varieties. Possible explanations considered for intervarietal differences include areal proximity, epicentrality, evolutionary status, and colloquiality.

Keywords

modal, quasi-modal, World Englishes, corpus linguistics, colloquialization, nativization

1. Introduction

The aim of the present study is to provide a more comprehensive and explanatorily adequate account of the modals and quasi-modals in World Englishes (WEs) than has hitherto been published. The principal means by which modality, which embraces such notions as possibility, necessity, ability, obligation, and permission, is expressed in English is the class of modal auxiliaries (often referred to simply as “modals”), but it is increasingly commonly expressed by “quasi-modals” (periphrastic expressions of the type have to and be going to, also referred to as “semi-modals”). Salient characteristics of the quasi-modals are their semantic similarity to the modals and their idiomaticity; for example, the modal meaning of be going to, as in (1), extends the literal motional meaning associated with going, as in (2) (see further Westney 1995; Krug 2000; Collins 2009a).

(1) I’m going to leave the discussion here (GloWbE, NZ)

(2) Oh no, I’m going to Paris this year (GloWbE, AU)

English modals and quasi-modals have attracted a great deal of scholarly interest, with synchronic corpus-based and corpus-informed descriptive studies mostly targeting the “reference” varieties, British English (BrE) and American English (AmE) (e.g., Coates 1983; Westney 1995; Krug 2000; Leech, Hundt, Mair & Smith 2009; Collins 2009a). More recently, the regional/varietal scope has been expanded in studies both of sets of WEs (e.g., Collins 2009b; Collins & Yao 2012; Deuber, Biewer, Hackert & Hilbert 2012; van der Auwera, Noël & de Wit 2012; Loureiro-Porto 2019) and of specific WEs (e.g., Diaconu 2012; Collins 2014; Collins, Borlongan & Yao 2014; van Rooy & Wasserman 2014; Noël & van der Auwera 2015).

Most commonly used as a source of data in the WEs-focused studies are the parallel corpora of the International Corpus of English (ICE) collection. However, the ready availability online since 2014 of the Global Web-based English corpus (GloWbE; Davies 2013), with its twenty subcorpora each representing a different WE, has opened up new possibilities for the intervarietal study of modality in English world-wide, possibilities that are exploited in the present study. GloWbE unquestionably has the advantage of size over ICE: GloWbE’s subcorpora range in size from 35 to 387 million words, compared to the one million words of each ICE corpus. However, it is important to be aware of its limitations and deficiencies (discussed in section 3).

The structure of the paper is as follows. Section 2 locates the study in its scholarly context. Section 3 outlines the features and composition of the primary data-source, GloWbE, and introduces the methodology used in the study. Section 4 provides and discusses the findings for fourteen pairs of modal expressions. Finally, section 5 presents the conclusion.

2. Background

The scholarly context for the present study is the WEs paradigm. Accordingly, it appeals to influential models of WEs which legitimize the formulation of predictions to test against the study’s findings: Kachru’s (1985) “Concentric circles” model; Schneider’s (2003, 2007) “Dynamic” model; and Mair’s (2013) “World system of Englishes” model. Factors invoked in these models that are relevant to interpreting the findings of the study include “areal proximity,” “epicentrality,” and “evolutionary status.” In addition to these I outline three considerations which, despite not being concerned specifically with WEs, have the potential to provide insights into the study’s alternation-based findings: diachronic trends, genre distribution, and modal semantics.

Kachru’s (1985) Concentric circles model warrants the prediction that the first-language or “native” varieties of the Inner circle (IC) will share structural properties that may differ from those that are found in the institutionalized second-language varieties of the Outer circle (OC), where they are subject to factors such as second-language acquisition processes and differential norm orientations. Within Kachru’s (1985) IC and OC, there are subgroupings of varieties determined by their areal proximity (e.g., the Englishes of India, Pakistan, Sri Lanka, and Bangladesh in South Asia). The member varieties of these regions may exhibit similarities that differentiate them from varieties of other geographical regions, no doubt a by-product of the role of language contact in driving the spread of linguistic innovations and usages. Areal proximity is furthermore related to the phenomenon of epicentrality in language, that is, a variety’s attainment of demographic, historical, and sociolinguistic prominence, which enables it to serve as a normative model for speakers of other (typically neighboring) varieties (cf. Peters 2009; Hundt 2013; Gries & Bernaisch 2016). The notion of epicentrality underpins Mair’s (2013) World system model of WEs, in which he posits a hierarchy of relationships between WEs in a globalized world, a hierarchy in which varieties higher up are more likely to influence those lower down than vice versa. According to Mair (2013), AmE has become a “hypercentral” model in English world-wide today (ousting the merely “supercentral” BrE), and there are further cases of epicentrality lower in the hierarchy, in areal zones of the type mentioned above.

A further source of hypotheses is the evolutionary status of varieties, a concept famously associated with Schneider’s (2003, 2007) Dynamic model of postcolonial Englishes. Schneider (2003, 2007) posits five developmental phases (“foundation,” “exonormative stabilisation,” “nativisation,” “endonormative stabilisation,” and “differentiation”), with the evolutionary status of OC varieties being determined by the positions they occupy along the cycle of phases, or of IC varieties by the time when they completed their passage through the cycle. Differences in the evolutionary status of varieties may be reflected in their linguistic similarities to and differences from the parent variety.

Consider now the three more general sources of explanation for the study’s findings that were introduced above. The first consideration is diachronic variation with the modals and quasi-modals (see further Ziegeler 2016). Research by Leech, Hundt, Mair, and Smith (2009) based on the Brown family of corpora provided evidence of a declining tendency in the frequency of the modals and a concomitant rise in the frequency of quasi-modals, in BrE and AmE writing between the early 1960s and 1990s. These findings, presented in Table 1 (BrE and AmE data from Leech, Hundt, Mair & Smith 2009:74, 97), furthermore indicate a tendency for the higher frequency modals (notably will, would, can, and could) to have undergone smaller changes than lower frequency modals such as must, shall, ought, and need. Leech, Hundt, Mair, and Smith’s (2009) research has prompted quantitative synchronic investigations of the distribution of the modals in other English varieties, in which a relative paucity of modal tokens and a relative abundance of quasi-modal tokens are sometimes interpreted as conveying an apparent-time implication of change (Collins 2009b; Collins & Yao 2012). Accordingly, I explore the potential relevance of attested diachronic trends observed in the literature to the frequency findings of the present synchronic study.

Table 1.

Percentage Change of the Modals in Various Corpora

Modal	Pmw in GloWbE	LOB~FLOB	Brown~Frown	Quasi-modal	Pmw in GloWbE	LOB~FLOB	Brown~Frown
Will	2163	−4.0	−11.1	Have to	854	+9.0	+1.9
Would	2365	−11.5	−6.1	Have got to	14	−34.1	+15.6
Can	2509	+3.1	−1.5	Be going to	281	−1.2	+53.7
Could	1540	+1.5	−6.8	Want to	712	+18.5	+70.9
Should	927	−11.8	−13.5	Need to	660	+266.0	+123.2
May	763	−17.5	−32.4	Be able to	409	+0.8	+5.8
Might	525	−17.8	−4.5	Had better	5	−26.0	−17.1
Must	430	−29.0	−30.4	Be supposed to	51	+113.6	+14.6
Shall	189	−43.6	−43.3
Ought	42	−43.7	−29.0
Need	9	−42.1	−12.5

Note: The GloWbE frequency and percentages for modal need are based on negative uses alone (need not and needn’t). Reduced forms (raw GloWbE frequencies in brackets) were not included: gonna (56,117); wanna (27,220); gotta (20,839); hafta (190).

It is necessary to enter a caveat regarding the extrapolation of putatively diachronic generalizations from frequencies extracted from a synchronic corpus such as GloWbE: all such extrapolations must be considered provisional, ultimately requiring empirical validation in future research using real-time historical WEs corpora. As argued in section 3, comparisons of frequencies from the two text categories of “Blogs” and “General” in GloWbE offer some insights into apparent-time change, albeit less compelling than extrapolations made on the basis of comparisons of speech versus writing frequencies in the ICE corpora. Another caveat is that rates and directions of change in BrE and AmE, as identified by Leech, Hundt, Mair, and Smith (2009) and others, may differ from those under way in other varieties (see, e.g., Mukherjee & Schilk 2012).

The second general consideration concerns the potential impact on GloWbE frequencies of colloquialization, a powerful discourse-pragmatic agent of grammatical change in English that is characterized by Leech, Hundt, Mair, and Smith (2009) as a stylistic shift that has been operating to make written genres more like spoken ones since the mid-twentieth century. Collins and Yao (2013), who define colloquialization as the spreading of colloquial features from baseline casual face-to-face conversation to other—written and spoken—genres, show that grammatical developments in a number of WEs may be affected by differences in the degrees to which speakers are (in)tolerant of colloquialism and informality. Witness, for example, the contribution to the ongoing grammaticalization of the quasi-modals that is to be found in instances where the infinitival marker to is incorporated into the preceding verb. Such reductions are found not only in informal speech, but also in informal styles of writing (typically in representations of casual speech) where they are represented by non-standard spellings of the type gonna, gotta, wanna, and hafta (Huddleston & Pullum 2002:1616). In the present study, the source of quantitative information about the colloquiality of the modals and quasi-modals that is invoked in discussing putatively colloquialization-influenced variation is the generic division in GloWbE between General texts and Blogs.

The third general consideration is modal semantics. I operate here with a binary semantic distinction between “root” and “epistemic” modality, as used, among others, by Coates (1983, 1995) and Depraetere and Reed (2021). The distinction is exemplified by the different uses of must in (3) and (4), respectively root and epistemic.

(3) Imran Khan is the right choice and he must be given a chance. (GloWbE, PK)

(4) sometimes the things he says I think he must be crazy (GloWbE, PK)

Root modality deals with the necessity or possibility of the actualization of a situation. Two major subtypes are recognized by, for example, Palmer (1990), Huddleston and Pullum (2002), and Collins (2009a): (i) deontic modality (in which the factors impinging on the actualization involve some kind of authority, as when a person, rule, or convention is responsible for the imposition of an obligation or granting of permission); and (ii) dynamic modality (in which the factors are intrinsic to the subject-referent, such as ability or volition, or generally circumstantial). By contrast, epistemic modality deals with the speaker’s judgment that the proposition underlying the utterance is true, located on a scale ranging from weak possibility (“It may be so”) to strong necessity (“It must be so”). It is the distribution of epistemic modality that will be the focus of the meaning-based analysis in this study.

3. Data and Method

The data source for the present study is GloWbE, a web-based corpus comprising 1,885,632,973 words of both General texts (e.g., newspapers, magazines, company websites) and Blogs from 1.8 million web pages from twenty different countries (Davies & Fuchs 2015). The number of tokens of modals and quasi-modals for each variety far exceeds that available in studies based on the one-million-word ICE and Brown corpora (cf. Peters, Collins & Smith 2009; Leech, Hundt, Mair & Smith 2009); even relatively uncommon expressions such as modal need and quasi-modal had better yield a sufficient number of tokens in GloWbE to sustain viable analyses. It must be conceded that, limited as it is to web-based texts, GloWbE lacks the representativeness of ICE and the Brown family, whose designs incorporate a wide variety of written registers, and, in the case of ICE, spoken registers as well (Loureiro-Porto 2017). The distinction between General and Blog texts in GloWbE bears some similarities to that between spoken and written texts in ICE, with Davies and Fuchs (2015:3-4) claiming an ICE-like 60 percent versus 40 percent split in GloWbE between General texts from relatively formal genres such as newspapers, magazines, and company websites (corresponding to the more formal written texts of ICE) and (informal) Blogs (corresponding in several respects to transcriptions of spoken language in ICE). Their claim requires some qualification. First, according to information provided at english-corpora.org, the ratio of Blogs to General ranges from 0.52:1 in the United States subcorpus to 0.25:1 in the Ireland subcorpus, with an average of approximately 0.44:1, a figure supported by calculations made by Loureiro-Porto (2017:455). Secondly, there has been considerable debate over the extent to which the informality of blogs resembles that of the spoken word, with participants generally prepared to accept that while blogs may not be equivalent to speech they are nevertheless “speech-like” in certain respects (see Nelson 2015; Loureiro-Porto 2017; Mazzon 2019). In section 4, I appeal to frequency differences between GloWbE Blogs and General texts to test for the possible influence of colloquialization, and of “anti-colloquialization” (Collins & Yao 2018; Kruger & Smith 2018), on the distribution of the modals and quasi-modals.¹
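The qualification can be made concrete by converting the Blogs-to-General ratios quoted above into the share of each subcorpus that Blogs represent. The following is a minimal sketch; the function name `blogs_share` is illustrative, and the only inputs are the ratios cited in the text.

```python
def blogs_share(ratio_to_general: float) -> float:
    """Convert a Blogs:General ratio of the form r:1 into Blogs' percentage of the whole."""
    return 100 * ratio_to_general / (1 + ratio_to_general)

# Blogs:General ratios quoted in the text (each expressed as r:1).
quoted_ratios = {"US": 0.52, "IE": 0.25, "average": 0.44}
shares = {label: round(blogs_share(r), 1) for label, r in quoted_ratios.items()}

# For comparison: a 40/60 Blogs/General split expressed as a ratio r:1.
claimed_ratio = 40 / 60

print(shares)
print(round(claimed_ratio, 2))
```

On these figures Blogs make up between about 20 and 34 percent of a subcorpus, and about 31 percent on average, which is below the 40 percent implied by the claimed split.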

In the present study the distinction between Blogs and General texts in GloWbE has been exploited as a source of information about the colloquiality of the modals and quasi-modals and the potential influence of colloquialization. There is insufficient space in this paper for a comprehensive account of colloquialization effects for every alternation presented in section 4, so commentary will be limited to a selection of cases where there is a notable preference for Blogs over General texts.

Table 2 presents the macro-generic distribution of the modals and quasi-modals, with per-million-word (pmw) frequencies derived from the General and Blogs sections of GloWbE. What it shows is that the lower frequency modals tend to be favored more in General texts than Blogs, suggesting that their declining diachronic fortunes may be influenced by their “anti-colloquiality,” that is, their greater preference for features that are typical of writing than for those that are typical of speech (cf. Collins & Yao 2018; Kruger & Smith 2018). By contrast, the distribution of the higher frequency modals is skewed toward Blogs, their informality and colloquiality probably helping them to withstand the declining trend of their lower frequency counterparts. The distribution of the quasi-modals is also skewed more toward the Blogs, suggesting that colloquialization is an important factor in their rising fortunes.

Table 2.

Macro-Generic Distribution of the Modals and Quasi-modals in GloWbE (Frequencies pmw)

Modals	Blogs	General	Percent blogs (%)	Quasi-modals	Blogs	General	Percent blogs (%)
Will	3607	3421	51.2	Have to	1032	900	53.4
Would	2477	2328	51.5	Have got to	6.4	5.5	53.7
Can	3311	3094	51.7	Be going to	142	115	55.2
Could	1285	1189	51.9	Need to	745	637	53.9
Should	1205	1245	49.1	Be able to	380	343	52.5
May	892	1129	43.7	Had better	2.0	2.0	50.0
Might	520	468	52.5	Be supposed to	60	55	52.1
Must	475	609	43.8	Be about to	43	38	53.0
Shall	76	242	23.8
Ought	32	31	50.7
Need	14	17	45.1

It is important to keep in mind that GloWbE is not designed to be a carefully curated and generically representative corpus like the corpora of the ICE and Brown family collections. Despite inevitably being, as a web-based corpus, somewhat “quick and dirty” (Isingoma & Meierkord 2019:311), GloWbE is a highly attractive resource for studies of the present kind, with its massive size, its inclusion of a large number of WEs, the informality of its texts, and the user-friendliness of an online platform providing search tools that enable a wealth of quantitative information to be readily accessed. Most importantly, the relatively informal nature of the GloWbE texts is arguably conducive to the study of diachronically volatile categories such as the modals and quasi-modals, which, as we shall see, are prone to the influence of drivers of change that are particularly associated with more informal language, notably colloquialization.

The composition of GloWbE is presented in Table 3, with labels for the country of origin of each of the twenty subcorpora, the English varieties they represent, and the number of words each one contains. Also included are subclassifications used in the study: Kachru’s (1985) IC versus OC distinction, along with further primarily regionally-based subgroupings (of the IC into American, European, and Oceanic countries; and of the OC into South Asia [SA], South-East Asia [SEA], Africa [Afr], and the Caribbean [Carib]). The neatness of this picture is complicated to some extent by the OC untypicality of JamE, SingE, and SthAfrE, all of which enjoy a good representation of first-language English speakers. I henceforth use the GloWbE “country labels” when referring to the findings for the particular GloWbE subcorpora, as opposed to the abbreviated “variety labels,” which I use when extrapolating from the findings for subcorpora to varieties in general.

Table 3.

GloWbE Composition and Word-Count

WEs/Kachru classification		Country labels	English varieties	Variety labels	Word count
Inner circle	America	US	American	AmE	386,809,355	1,239,817,686
	America	CA	Canadian	CanE	134,765,381
	Europe	GB	British	BrE	387,615,074
	Europe	IE	Irish	IrE	101,029,231
	Oceania	AU	Australian	AusE	148,208,169
	Oceania	NZ	New Zealand	NZE	81,390,476
Outer circle	South Asia	IN	Indian	IndE	96,430,888	234,039,410	645,815,287
		LK	Sri Lankan	SLE	46,583,115
		PK	Pakistani	PakE	51,367,152
		BD	Bangladeshi	BDE	39,658,255
	South-East Asia	SG	Singapore	SingE	42,974,705	169,095,257
		MY	Malaysian	MalE	42,420,168
		PH	Philippine	PhilE	43,250,093
		HK	Hong Kong	HKE	40,450,291
	Africa	ZA	South African	SthAfrE	45,364,498	203,016,954
		NG	Nigerian	NigE	42,646,098
		GH	Ghanaian	GhanE	38,768,231
		KE	Kenyan	KenE	41,069,085
		TZ	Tanzanian	TanzE	35,169,042
	Caribbean	JM	Jamaican	JamE	39,663,666	39,663,666
	Total				1,885,632,973

Note: Frequencies are as in https://www.english-corpora.org/glowbe_corpus.asp.
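The regional and circle subtotals in Table 3 can be recomputed from the individual subcorpus word counts. The sketch below simply copies the counts from the table; the dictionary layout and variable names are illustrative choices.

```python
# Subcorpus word counts copied from Table 3, grouped by circle and region.
TABLE3 = {
    "Inner circle": {
        "America": {"US": 386_809_355, "CA": 134_765_381},
        "Europe": {"GB": 387_615_074, "IE": 101_029_231},
        "Oceania": {"AU": 148_208_169, "NZ": 81_390_476},
    },
    "Outer circle": {
        "South Asia": {"IN": 96_430_888, "LK": 46_583_115,
                       "PK": 51_367_152, "BD": 39_658_255},
        "South-East Asia": {"SG": 42_974_705, "MY": 42_420_168,
                            "PH": 43_250_093, "HK": 40_450_291},
        "Africa": {"ZA": 45_364_498, "NG": 42_646_098, "GH": 38_768_231,
                   "KE": 41_069_085, "TZ": 35_169_042},
        "Caribbean": {"JM": 39_663_666},
    },
}

# Sum word counts per region, per circle, and overall.
region_totals = {
    region: sum(countries.values())
    for regions in TABLE3.values()
    for region, countries in regions.items()
}
circle_totals = {
    circle: sum(sum(countries.values()) for countries in regions.values())
    for circle, regions in TABLE3.items()
}
grand_total = sum(circle_totals.values())
inner_circle_share = 100 * circle_totals["Inner circle"] / grand_total

print(region_totals)
print(circle_totals, grand_total, round(inner_circle_share, 1))
```

The recomputed subtotals agree with those printed in Table 3, and the Inner circle subcorpora account for roughly two-thirds of the corpus.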

The reference varieties, BrE and AmE, exert influence that extends well beyond their geographical neighbors, IrE and CanE respectively. The linguistic sway of BrE reflects its historical status as colonial “parent” in the evolution of postcolonial English varieties, while that of AmE reflects the status of the USA latterly as an international superpower. The remaining two IC varieties, AusE and NZE, have closely related histories and are well-established in the Southern Hemisphere. Each of the three multivariety OC subgroups—SA, SEA, and Afr—contains an extensively standardized, influential, and internationally well-known epicentral variety: IndE, SingE, and SthAfrE, respectively. Extended discussion of fourteen of the postcolonial Englishes represented in GloWbE is provided in Schneider (2007), and a subset of these in Schneider (2014).

Turning to methodology, the analytical approach adopted in the present study is premised on the concept of “alternation” between competing grammatical items and categories. Such alternates are understood to be semantically overlapping in the sense that they compete with, and can usually be substituted for, one another. This does not mean that they are semantically identical in every respect, a situation that—even if it were possible—would result in a level of redundancy that would be intolerable in any natural language. In cases of mutual substitutability, one generally finds a difference, even if subtle or elusive, in connotative and/or associative meaning, as in the almost identical contexts of (5) and (6) where may arguably has a slightly more formal overtone than the otherwise semantically equivalent might.

(5) In addition, they may possibly want to slow down some of the lead follicles. (GloWbE, IN)

(6) Not a route to everything they might possibly want to do that the device or software is capable of. (GloWbE, BD)

My alternation-based approach is thus congruent with research—mostly informed by the Labovian “language variation and change” and the “corpus-based variationist linguistics” models (Szmrecsanyi 2017)—which has been conducted on such phenomena as dative alternation and genitive alternation in language use and acquisition (e.g., Heller, Szmrecsanyi & Grafmiller 2017; Szmrecsanyi et al. 2017), and with research on recent diachronic variation involving “onomasiological” competition between alternating constructions (e.g., Aarts, Close & Wallis 2013; Mair & Leech 2021). Accordingly, in this study I eschew the approach customarily followed in corpus-based WEs studies of generalizing from normalized frequencies, in favor of one based on ratios representing the proportionalities for putatively competing modal expressions. Tables 4 to 17 display these ratios and, in keeping with the shading system used in the frequency tables generated by the GloWbE tools—where the cells for high frequency tokens are shaded—such frequencies are bolded in this and all subsequent tables.²

Table 4.

Have to versus Must in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Have to	988	924	981	928	902	903	1002	905	794	907	1029	1030	946	836	987	843	748	824	727	962
Must	488	599	487	634	563	606	571	666	632	600	527	671	633	603	802	838	689	620	532	645
Ratios	2.0	1.5	2.0	1.4	1.6	1.4	1.7	1.2	1.2	1.5	1.9	1.5	1.4	1.3	1.2	1.0	1.0	1.3	1.3	1.4
	1.6						1.4				1.5				1.1					1.4
	1.6						1.3
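As an illustration of how the ratio rows can be read, the sketch below recomputes the Inner circle figures of Table 4 from the pmw frequencies for have to and must. Two points are assumptions inferred from the published values rather than stated in the text: that ratios are truncated (not rounded) to one decimal place, and that the group figure is the mean of the individual country ratios. The helper name `truncate1` is illustrative.

```python
import math

def truncate1(x: float) -> float:
    """Truncate a positive value to one decimal place (assumed convention).

    The small epsilon guards against floating-point values that fall just
    below an exact tenth.
    """
    return math.floor(x * 10 + 1e-9) / 10

# Pmw frequencies for the Inner circle subcorpora, copied from Table 4.
have_to = {"US": 988, "CA": 924, "GB": 981, "IE": 928, "AU": 902, "NZ": 903}
must = {"US": 488, "CA": 599, "GB": 487, "IE": 634, "AU": 563, "NZ": 606}

# Ratio of the first member of the pair to the second, per country.
ratios = {country: truncate1(have_to[country] / must[country]) for country in have_to}

# Group figure taken as the mean of the country ratios (assumed).
inner_circle_mean = truncate1(sum(ratios.values()) / len(ratios))

print(ratios)
print(inner_circle_mean)
```

Under these assumptions the recomputed values match the Inner circle cells of Table 4. A ratio above 1.0 indicates that have to outnumbers must in the subcorpus concerned.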

Table 5.

Have to versus Have Got to in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Have to	988	924	981	928	902	903	1002	905	794	907	1029	1030	946	836	987	843	748	824	727	962
Have got to	23	17	26	15	23	21	11	9	8	9	14	13	16	12	16	13	8	9	10	11
Ratios	42.9	54.3	37.7	61.8	39.2	43.0	91.0	100.5	99.2	100.7	73.5	77.9	59.1	69.6	61.6	64.8	93.5	91.5	72.7	87.4
	46.4						97.8				70.3				76.8					87.4
	46.4						78.2

Table 6.

Should versus Ought to in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Should	1192	1092	1244	1231	1174	1127	1372	1431	1726	1213	1143	1259	1151	1262	1230	1609	1246	1201	1045	1199
Ought to	38	26	32	21	30	25	31	29	26	30	20	27	37	32	25	44	37	30	30	29
Ratios	31.3	42.0	38.8	58.6	39.1	45.0	44.2	49.3	66.3	40.4	57.1	46.6	31.1	39.4	49.2	36.5	33.6	40.0	34.8	41.3
	42.4						50.0				43.5				38.8					41.3
	42.4						43.5

Table 7.

Should versus Had Better in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Should	1192	1092	1244	1231	1174	1127	1372	1431	1726	1213	1143	1259	1151	1262	1230	1609	1246	1201	1045	1199
Had better	8.9	6.2	7.3	5.2	8.2	6.6	4.1	3.9	2.4	1.9	6.8	6.0	5.8	7.3	4.3	5.6	4.5	4.6	4.0	4.7
Ratios	133.9	176.1	170.4	236.7	143.1	170.7	334.6	366.9	719.1	638.4	168.0	209.8	198.4	172.8	286.0	287.3	276.8	261.0	261.2	238.0
	171.8						514.7				187.2				274.4					238.0
	171.8						315.5

Table 8.

Should versus Be Supposed to in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Should	1192	1092	1244	1231	1174	1127	1372	1431	1726	1213	1143	1259	1151	1262	1230	1609	1246	1201	1045	1199
Be supposed to	80	60	53	43	46	39	53	39	46	44	60	56	58	37	43	67	55	54	52	39
Ratios	14.9	18.2	23.4	28.6	25.5	28.8	25.8	36.6	37.5	27.5	19.0	22.4	19.8	34.1	28.6	24.0	22.6	22.2	20.0	28.6
	23.2						31.8				23.8				23.4					28.6
	23.2						26.3

Table 9.

Need to versus Need in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Need to	643	659	690	620	773	795	802	544	533	670	691	671	594	641	769	650	590	708	569	594
Need	15	13	14	13	14	14	29	22	16	18	19	20	19	16	17	19	13	16	14	10
Ratios	42.8	50.6	49.2	47.6	55.2	56.7	27.6	24.7	33.3	37.2	36.3	33.5	31.2	40.0	45.2	34.2	45.3	44.2	40.6	59.4
	50.3						30.7				35.2				41.9					59.4
	50.3						38.0

Table 10.

Need to versus Must in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Need to	643	659	690	620	773	795	802	544	533	670	691	671	594	641	769	650	590	708	569	594
Must	488	599	487	634	563	606	571	666	632	600	527	671	633	603	802	838	689	620	532	645
Ratios	1.3	1.1	1.4	0.9	1.3	1.3	1.4	0.8	0.8	1.1	1.2	1.0	0.9	1.0	0.9	0.7	0.8	1.1	1.0	0.9
	1.2						1.0				1.0				0.9					0.9
	1.2						0.9

Table 11.

May versus Might in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
May	958	1204	982	1188	1103	1196	1161	960	1142	1259	1046	1195	1181	1256	1062	920	929	966	899	937
Might	562	481	575	507	529	497	395	303	344	374	453	383	416	406	406	356	292	333	314	313
Ratios	1.7	2.5	1.7	2.3	2.0	2.4	2.9	3.1	3.3	3.3	2.3	3.1	2.8	3.0	2.6	2.5	3.1	2.9	2.8	2.9
	2.1						3.1				2.8				2.7					2.9
	2.1						2.9

Table 12.

Could versus Might in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Could	1335	1186	1418	1199	1217	1218	1034	1051	929	1003	1103	1062	1111	1025	1096	1060	1054	1037	945	1011
Might	562	481	575	507	529	497	395	303	344	374	453	383	416	406	406	356	292	333	314	313
Ratios	2.3	2.4	2.4	2.3	2.3	2.4	2.6	3.4	2.7	2.6	2.4	2.7	2.6	2.5	2.6	2.9	3.6	3.1	3.0	3.2
	2.3						2.8				2.5				3.0					3.2
	2.3						2.8

Table 13.

Can versus May in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Can	3020	3061	3102	2957	3205	3267	3745	2845	3016	3552	3686	3646	3594	3581	3228	3258	2808	3049	2940	2730
May	958	1204	982	1188	1103	1196	1161	960	1142	1259	1046	1195	1181	1256	1062	920	929	966	899	937
Ratios	3.1	2.5	3.1	2.4	2.9	2.7	3.2	2.9	2.6	2.8	3.5	3.0	3.0	2.8	3.0	3.5	3.0	3.1	3.2	2.9
	2.7						2.8				3.0				3.1					2.9
	2.7						3.0

Table 14.

Can versus Be Able to in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Can	3020	3061	3102	2957	3205	3267	3745	2845	3016	3552	3686	3646	3594	3581	3228	3258	2808	3049	2940	2730
Be able to	369	415	402	352	422	456	411	358	334	422	464	449	462	443	430	380	385	416	425	393
Ratios	8.1	7.3	7.7	8.4	7.5	7.1	9.1	7.9	9.0	8.4	7.9	8.1	7.7	8.0	7.5	8.5	7.2	7.3	6.9	6.9
	7.6						8.6				7.9				7.4					6.9
	7.6						7.8

Table 15.

Will versus Shall in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Will	3089	3361	3404	3654	3320	3565	3888	3293	3919	3863	3856	3919	3695	3619	3853	4037	3635	3878	3693	3494
Shall	166	198	95	264	107	128	257	167	326	405	151	192	499	313	181	325	410	269	234	182
Ratios	18.6	16.9	35.8	13.8	31.0	27.8	15.1	19.7	12.0	9.5	25.5	20.4	7.4	11.5	21.2	12.4	8.8	14.4	15.7	19.1
	23.9						14.0				16.2				14.5					19.1
	23.9						15.1

Table 16.

Will versus Be going to in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Will	3089	3361	3404	3654	3320	3565	3888	3293	3919	3863	3856	3919	3695	3619	3853	4037	3635	3878	3693	3494
Be going to	389	355	308	262	308	299	256	205	224	216	310	288	303	239	360	296	266	235	216	286
Ratios	7.9	9.4	11.0	13.9	19.7	11.9	15.1	16.0	17.4	17.8	12.4	13.6	12.1	15.1	10.7	13.6	13.6	16.5	17.0	12.2
	12.3						16.5				13.3				14.2					12.2
	12.3						14.5

Table 17.

Be going to versus Be about to in GloWbE

	US	CA	GB	IE	AU	NZ	IN	LK	PK	BD	SG	MY	PH	HK	ZA	NG	GH	KE	TZ	JM
	IC						OC
							SA				SEA				Afr					Carib
Be going to	389	355	308	262	308	299	256	205	224	216	310	288	303	239	360	296	266	235	216	286
Be about to	42	36	44	37	43	44	29	28	33	30	37	37	48	33	37	42	36	37	33	35
Ratios	9.2	9.8	7.0	7.0	7.1	6.7	8.8	7.3	6.7	7.2	8.3	7.7	6.3	7.2	9.7	7.0	7.3	6.3	6.5	8.1
	7.8						7.5				7.3				7.3					8.1
	7.8						7.4

The study includes both a (primary) form-based component and a (secondary) meaning-based component. For the former, search routines were formulated in accordance with the online BYU platform. For the modals, as single form categories, both raw and pmw frequencies were readily obtainable using the modal form in conjunction with the tag “.[vm*],” as for example in “will.[vm*].” However, for the quasi-modals, as multi-form lexeme-based categories, normalized frequencies had to be calculated from the raw frequencies provided. In cases where exhaustive searches were not possible for a category, frequencies based on a set of the most frequent tokens—typically the 1000 most common—of the category in question were obtained. For example, the search for be going to was limited to uncontracted tensed forms of be, to was required to be followed by a verb (in order to exclude irrelevant non-modal instances where to was followed by a noun, as in I am going to Paris), and frequencies were limited to the 1000 most common lexical verb forms. In some cases, it was inevitable that a small number of irrelevant tokens would slip through the net: for instance, in the search for have to there was no automatic way of excluding superficial instances where have is followed by a modifying to-infinitival clause, as in (7).

(7) [ . . .] and his organisation represent a totally failed political ideology. All they have to offer is a return to the gun and the bomb. (GloWbE, IE)
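The normalization step mentioned above, converting raw counts into per-million-word (pmw) frequencies, can be sketched as follows. The example uses the raw counts for the reduced forms given in the note to Table 1 and, as a simplifying assumption, divides by the size of the whole corpus; per-variety figures would instead use the relevant subcorpus word count from Table 3. The function name `per_million` is illustrative.

```python
GLOWBE_WORDS = 1_885_632_973  # total corpus size reported in section 3

def per_million(raw_count: int, corpus_words: int = GLOWBE_WORDS) -> float:
    """Normalize a raw token count to a frequency per million words."""
    return raw_count / corpus_words * 1_000_000

# Raw GloWbE counts for reduced forms, from the note to Table 1.
reduced_forms = {"gonna": 56_117, "wanna": 27_220, "gotta": 20_839, "hafta": 190}
pmw = {form: round(per_million(count), 2) for form, count in reduced_forms.items()}

print(pmw)
```

On this calculation gonna occurs a little under 30 times per million words across the corpus as a whole, while hafta occurs about once in every ten million words.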

For the meaning-based analysis I had to address the problem that the GloWbE platform provides only for form-based searches. This being so, an exhaustive semantic description of the modals and quasi-modals would have required manual inspection of the almost seventy million modal and quasi-modal tokens in the corpus. In view of the practical impossibility of such an undertaking, two alternative possibilities were considered. One was to manually process smaller sets of randomly sampled tokens. The other was to exploit the fact that—as recognized by, for example, Coates (1983) and Wärnsby (2006)—there are a number of contextual syntactic features that can be used to identify modal meanings, especially epistemic ones. I have pursued the latter alternative in this study, anticipating that it might shed further light on Collins’s (2022) finding that epistemic comment markers are more commonplace in IC than OC varieties (by a ratio of 1.53:1).

Selective use was made of the six identifying features claimed by Coates (1983:244-245) to be associated with epistemic modality: perfect aspect, progressive aspect, existential there subject, state verb, quasi-modal, and inanimate subject. Wärnsby (2006:49-51) not only quantifies and exemplifies Coates’s (1983) features, but also proposes a number of explanations for their applicability. For example, Wärnsby (2006) argues that the incompatibility of the perfect aspect with non-epistemic modality derives from the fact that directed or permitted actions can normally only be posterior, and that of the progressive aspect from the fact that one cannot permit something that is already happening and therefore is beyond the agent’s control. The high strength of the correlations reported by Coates (1983) and Wärnsby (2006) is undoubtedly a by-product of the size of their databases (3460 times smaller than GloWbE for Coates 1983, and 2670 times smaller for Wärnsby 2006). Infrequent as they may be, non-epistemic examples are not entirely excluded by the identifying features. For example, while must and may cannot be used subjectively to oblige or permit someone to do something in the past, anteriority with the perfect aspect is nevertheless possible if they are used objectively in a general requirement or granting of permission, as in (8) and (9).

(8) Excise Duty must have been paid before the goods are sent otherwise goods may be seized (GloWbE, GB)

(9) The world is increasingly becoming a small place. Today, job opportunities are not just limited to India alone, although you may have completed your education here. A whole lot of other countries have Indian workers employed in scores. (GloWbE, IN)

There is a second type of indicator of epistemic meaning in modal expressions to which appeal is made in the study, namely adverbials functioning either as “harmonic” expressions or as “hedges.” Harmonic adverbials are congruent with the type of epistemic modality expressed by the (quasi-)modal. For example, in must surely, the adverb surely is compatible with the speaker’s strong confidence in the logical necessity of the proposition, and in may perhaps, perhaps is compatible with the speaker’s inference that the proposition is logically possible. By contrast, hedges are semantically non-harmonic expressions: in must presumably, the adverb serves to pragmatically weaken the speaker’s confidence; and in surely may, the weak may and strong surely express independent modal meanings (‘surely it is the case that it is possible’).³
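The two kinds of contextual indicator described in this section can be approximated over untagged text as in the sketch below. It is an illustration only: the regular expressions, the function name, and the second and third test sentences are hypothetical, and the actual searches relied on the part-of-speech tagging of the corpus interface rather than on suffix matching of this crude kind.

```python
import re

# Crude surface patterns (assumptions for illustration, not the study's queries).
PERFECT = re.compile(r"\b(?:must|may|might)\s+have\s+\w+(?:ed|en)\b", re.IGNORECASE)
PROGRESSIVE = re.compile(r"\b(?:must|may|might)\s+be\s+\w+ing\b", re.IGNORECASE)
ADVERBIAL = re.compile(
    r"\b(?:must\s+(?:surely|presumably)|may\s+perhaps)\b", re.IGNORECASE
)


def epistemic_cues(sentence: str) -> list[str]:
    """Return the names of the surface cues found in a sentence."""
    cues = []
    if PERFECT.search(sentence):
        cues.append("perfect")
    if PROGRESSIVE.search(sentence):
        cues.append("progressive")
    if ADVERBIAL.search(sentence):
        cues.append("adverbial")
    return cues


# The first sentence is taken from example (8); the other two are invented.
print(epistemic_cues("Excise Duty must have been paid before the goods are sent"))
print(epistemic_cues("She must be joking."))
print(epistemic_cues("That must surely be wrong."))
```

The first sentence is flagged by the perfect-aspect pattern even though must is deontic there, which illustrates why such features serve as probabilistic indicators rather than as categorical tests.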

4. Results

Fourteen pairs of semantically similar modal expressions are identified, associated with three broad semantic groupings: necessity and obligation; possibility, permission, and ability; and prediction and volition.

4.1. Necessity and Obligation

4.1.1. Must versus Have to

Must and have to are semantically similar in expressing predominantly strong deontic necessity (Huddleston & Pullum 2002:177; Loureiro-Porto 2019:124), apart from a tendency for must to be skewed toward speaker-oriented subjectivity and have to toward speaker-external objectivity (Coates 1983; Perkins 1983; Palmer 1990; Westney 1995). So, why are their diachronic fortunes so dissimilar in the reference varieties, with the strong decline that must is undergoing contrasting with the modest increase shown by have to (see Table 1)? We may surmise that, inter alia, the phenomenon of colloquialization is playing a role here, given the contrast between the apparent colloquiality of have to (see Table 2) suggested by its preference for Blogs over General texts in GloWbE (1.13:1) and the apparent anti-colloquiality of must suggested by its preference for General over Blogs (1.27:1).

The pmw frequency-based ratios presented in Table 4 show the relative preference for the quasi-modal over the modal to be stronger in the IC than the OC, an unsurprising finding in view of the evidence that the IC varieties tend to be more advanced than the OC in current grammatical change, and especially in colloquialization-driven changes (Collins & Yao 2018; Collins 2023). It is notable that a mere comparison of IC versus OC average frequencies for have to (842:860) fails to reveal the relative popularity of this quasi-modal in the IC that is revealed by the present onomasiological alternation-based approach.
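The contrast between the two kinds of comparison can be made concrete with a small sketch. The average frequencies for have to (842 and 860 pmw) are those cited above; the frequencies for must are hypothetical placeholders, and the computation of a group-level ratio from group averages is an assumption made for the purposes of illustration.

```python
def alternation_ratio(quasi_pmw: float, modal_pmw: float) -> float:
    """Onomasiological ratio of a quasi-modal to its modal counterpart."""
    return quasi_pmw / modal_pmw


# "have to" averages are from the text; "must" values are hypothetical.
ic = {"have to": 842.0, "must": 400.0}
oc = {"have to": 860.0, "must": 500.0}

ic_ratio = alternation_ratio(ic["have to"], ic["must"])
oc_ratio = alternation_ratio(oc["have to"], oc["must"])

# Relative preference for the quasi-modal in the IC versus the OC.
relative_preference = ic_ratio / oc_ratio

# Plain comparison of average frequencies for "have to" alone.
plain_comparison = ic["have to"] / oc["have to"]

print(round(relative_preference, 2))  # 1.22
print(round(plain_comparison, 2))     # 0.98
```

With these placeholder figures the plain comparison falls just below parity, whereas the alternation-based ratio shows a clear IC preference for the quasi-modal, because it also registers the lower frequency of the competing modal.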

Further evidence of the role of colloquialization in the IC versus OC results can be found in the distribution of the informal reduced form hafta, which is precisely twice as popular in the IC (and particularly so in AmE) with 0.12 tokens pmw, as it is in the OC (where the AmE-influenced variety PhilE has the highest number of tokens) with 0.06 pmw. Another factor that appears to be exerting an influence in the ratio-based findings presented in Table 4 is epicentrality: hypercentral AmE and supercentral BrE have the strongest ratios overall; IndE has the strongest ratio in SA; and SingE in SEA. Another finding consistent with that of other studies is that the most IC-like of the OC subgroups is SEA, a finding supported by the relatively evolutionarily-advanced status of the SEA varieties (Collins 2022, 2023).

Finally, consider the expression of epistemic modality by must, as in (10). Application of the relevant tests discussed in section 3 revealed that epistemic must is more common in the IC than the OC. For must have past participle verb the ratio is 1.10:1, and for must be verb-ing—a superior test which yields fewer spurious non-epistemic tokens—the ratio is 1.05:1.

(10) He must be having a lot of new experience in the school on first day. (GloWbE, SG)

Have to is rarely epistemic, except when it collocates with would, as in (11) (Collins 2009a:60). Like other combinations that normally express epistemic meaning, would have to be is attested more frequently in the IC than the OC (1.75:1), and most commonly found in the Antipodean varieties AusE and NZE.

(11) Any actor who could accurately depict that backstabbing jerk would have to be good. (GloWbE, AU)

4.1.2. Have to versus Have got to

The quasi-modals have to and have got to both express mainly deontic necessity and are often treated as variants in the literature, despite the fact that have got to displays most of the formal properties of the modal auxiliaries (including absence of non-tensed forms and inability to cooccur with modals: *will have got to), and tends to be more informal than have to; in Collins (2009a:67,72), the speech versus writing ratio for have to is 2.5:1, but that for have got to, 12.0:1, indicating a far stronger preference for speech.

As Table 5 indicates, the preference for have got to is stronger in BrE (and in AusE, which is known for its informality; see Peters & Collins 2012) than in AmE. This transatlantic difference is noted also by Leech, Hundt, Mair, and Smith (2009:105), who attribute it to the greater prescriptive censure of got found in the USA. This censure is presumably attributable to the almost exclusive provenance of have got to in speech, which results in it being a target for prescriptivists who object to the overt informality of got in writing. Strunk and White (2000:46), for example, note that “[t]he colloquial have got for have should not be used in writing.” Table 5 also indicates that in GloWbE the OC displays a far stronger relative preference for have to and dispreference for have got to than the IC. A possible factor at work in the OC dispreference of have got to is its aforementioned association with informal styles (the latter also reflected, it may be noted, in the higher incidence of gotta tokens—1.4:1—in GloWbE in the IC than the OC).

Like have to, have got to is rarely epistemic. As with the predominantly-epistemic collocation would have to be, so it is with the collocation have got to be joking as exemplified in (12): more tokens are found in the IC (N = 29) than in the OC (N = 5).

(12) You have got to be joking when you say, “Islam discourages outsiders from enquiry.” (GloWbE, AU)

4.1.3. Should versus Ought to

Should and ought to are semantically close and often interchangeable, as suggested by mutually-reinforcing examples such as (13) and (14).

(13) Something ought to and should yield in the interest of a harmonious existence. (GloWbE, NG)

(14) In sum, the court should and ought to dismiss this petition for the foregoing reasons (GloWbE, KE)

According to Huddleston and Pullum (2002:186), “[i]n its most frequent use should expresses medium strength deontic or epistemic modality and is generally interchangeable with ought (+ to).” However, ought to is far less common than should, and, like other low frequency modals, is in rapid decline (see Table 1).

The ratios in Table 6 indicate a dispreference for ought to, relative to should, that is very similar in the IC and OC, and is stronger in BrE than AmE (as the contrasting Brown family percentages in Table 1 would lead us to expect). The relative tolerance of ought to in AmE is shared by the AmE-influenced OC variety, PhilE. Strong epicentrality is evidenced by SingE in SEA and by SthAfrE in Afr. Finally, it may be noted that our alternation-based account paints a different picture of the fortunes of should in the IC than an account based on (average) pmw frequencies alone. In the latter it is not only ought to that enjoys more support in the OC than the IC, but also should, with ratios of 1.06:1 and 1.09:1 respectively, calculated by comparing the average frequency of the six IC varieties with that of the fourteen OC varieties.

Another explanation for why should is holding its ground better than ought to, and more so in the IC, is that it is one of the few necessity/obligation modal expressions apart from must and be supposed to to express epistemic meaning to any appreciable degree; in Collins’s (2009a:45, 53) study, epistemic uses represent 11.8 percent of tokens of should, as against 3.0 percent for ought to. Cooccurrence with the adverb hopefully is a reliable test of epistemic meaning because it suggests that actualization is beyond the speaker’s control. There were 437 tokens of should hopefully in GloWbE, as in (15), with an IC:OC ratio of 1.93:1; and a disproportionate number of these were in BrE. Also more frequent in the IC than the OC were the 28 tokens of should presumably (3.50:1), illustrated in (16). There were no tokens of presumably with ought to in GloWbE and only one of hopefully.

(15) Playing with Moulson and Tavares should hopefully bring out Boyes old scoring touch from a few seasons ago. (GloWbE, CA)

(16) Spanish goalkeeper David De Gea should presumably also be in contention after missing out against Norwich (GloWbE, GB)

4.1.4. Should versus Had Better

Had better is similar to should and ought to in expressing medium strength modality. It differs from these modals in being essentially monosemous, with a deontic meaning fittingly dubbed “advisability” by Jacobsson (1980:52). Table 1 shows that in Leech, Hundt, Mair, and Smith (2009), had better is the only quasi-modal to be in decline in AmE, and one of only a few to be in decline in BrE (cf. van der Auwera, Noël & Linden 2013). The frequencies for had better in Table 7, which include those for contracted ’d better, indicate that this quasi-modal flouts the tendency for the OC to be more supportive than the IC of modal constructions that are in decline. In fact, the most frequent use of had better, relative to should, is found in AmE, which commonly plays a leading rather than conservative role in language change; there is furthermore epicentral support for had better from IndE in SA, SingE in SEA, and SEA in the OC.

What factors could outweigh the typical intervarietal pattern of IC-leadership noted elsewhere in this study to be associated with diachronically volatile modal expressions? One possibility is the comparative syntactic complexity of had better, and another may be that the grammaticalization of this quasi-modal has progressed less in the OC, as evidenced by cases such as (17), where it retains its original comparative sense.

(17) Clearly a Muslim had better be absent than to show up in school during mass and be playing hide-and-seek (GloWbE, GH)

4.1.5. Should versus Be Supposed to

Be supposed to is a further medium strength quasi-modal that has semantic affinities with should and ought to, and which carries the same “conversationally-derived implication of non-fulfilment” as the latter (Collins 2009a:81). Be supposed to arguably upsets the typical pattern in modal expressions, certainly the modals, of epistemic meanings deriving historically from root meanings: here the epistemic meaning is prior (Mair 2004; Berkenfield 2006; Moore 2007; Noël & van der Auwera 2009). The frequencies presented in Table 1 show that be supposed to is strongly on the rise diachronically, especially in BrE, while the Blogs versus General findings presented in Table 2 suggest the possibility that colloquialization is a factor in this development. Table 8 shows be supposed to to be marginally more frequent in the IC than in the OC (1.13:1), with the familiar pattern of AmE exhibiting the highest frequency in the IC, along with IndE in SA, and SingE in SEA. While a comparison of the average pmw frequencies for be supposed to also supports its greater use in the IC than in the OC (1.06:1), it does so less markedly than the alternation-based finding.

Another factor in the relative frequency of be supposed to in the IC is likely to be its greater predilection for epistemic meaning here than in the OC. The combinations of this quasi-modal with the perfect, as in (18), and with the non-agentive verb happen, as in (19), are both strong indicators of epistemicity, and both have strong IC:OC ratios (respectively 3.79:1 and 1.44:1), particularly in AmE and BrE.

(18) AT THE FEET OF THE MASTER, is supposed to have been written by him when he was thirteen years of age (GloWbE, IN)

(19) your prediction about what will happen this November will be as disastrously wrong as your prediction about what was supposed to happen in November of 2008 (GloWbE, US)

4.1.6. Need to versus Need

I have operated with the principle that modals have no non-tensed forms and no separate third person singular present tense form or regular past tense form with -ed as suffix (cf. Quirk, Greenbaum, Leech & Svartvik 1985:138-139; Huddleston & Pullum 2002:109; Collins 2009a:12-13). By these criteria, indeterminate cases such as (20) and (21) are understood to contain quasi-modals, the absence of the to-infinitive a matter of secondary importance (e.g., van der Auwera, Noël & de Wit 2012).

(20) But really, this needs not be our destiny; it need not be our collective fate (GloWbE, NG)

(21) she feels that she needed not give men ‘chance’ (GloWbE, GH)

Table 9 presents the results of the search for modal need (via the query “need.[vm*]”) and for the quasi-modal (need to). In the latter case, also included were the small number of instances (0.32 pmw) with forms other than the base form need (i.e., needs, needed, needing) in construction with a bare- rather than to-infinitive.

The relationship between the modal need and the quasi-modal need to is closer than any of the onomasiological pairs that we have analyzed thus far. Both express predominantly dynamic necessity, along with deontic and epistemic necessity, with the main difference being that, while epistemic necessity is more common than deontic necessity with need, the reverse is the case with need to (Collins 2009a:57, 73). The diachronic trajectories of the two items contrast markedly, the quasi-modal being strongly on the rise, the modal in decline (see Table 1), a pattern consistent with their generic distribution. Table 2 signals need to to be more Blogs-friendly (1.16:1) and need more General text-friendly (1.2:1). It would then be anticipated that the IC would show relatively more support for need to and less for need, than would the OC. This expectation is confirmed, as can be seen in Table 9, where the preference for need to over need is considerably stronger in the IC than in the OC (by a ratio of 1.45:1). Epicentrality is in evidence only in the European IC varieties (with BrE showing a higher relative preference for need to than IrE), and in Afr (where SthAfrE has the highest ratio). Once again, our onomasiological approach yields a clearer confirmation of the difference between the IC and the OC (1.45:1) than one based merely on a comparison of average pmw frequencies (1.08:1), calculated by comparing the average frequency of the six IC varieties with that of the fourteen OC varieties.

While its capacity to express epistemic modality is not sufficient to save need from decline, it is notable that—as is commonly the case with epistemic modality—epistemic need, as marked by its collocation with necessarily in (22), is more frequent in the IC than the OC, by a ratio of 1.59:1.

(22) Furthermore, money given to poor country governments needn’t necessarily end up going to infrastructure or healthcare. (GloWbE, GB)

4.1.7. Must versus Need to

As noted in sections 4.1.1 and 4.1.6, must predominantly expresses deontic necessity, typically with speaker-oriented subjectivity, whereas need to predominantly expresses dynamic necessity. The semantic domain where they are most clearly in competition with each other is deontic necessity, which in the case of need to derives from its primary dynamic meaning, as recognized by Smith (2003:260) in his observation that need to “can acquire the force of an imposed obligation, but [. . .] the speaker or writer can claim that the required action is merely being recommended for the doer’s own sake.” Deontic need to, like deontic have to, accordingly appeals to speakers seeking a more “democratic,” less authoritarian, tenor than that associated with deontic must. This contrast is evident in (23) and (24).

(23) Here in Australia you must wear a helmet when you ride on the road (GloWbE, CA)

(24) Anyways, you need to file for a court date if you plan on fighting this ticket. (GloWbE, CA)

The frequencies in Table 1 indicate the diachronic fortunes for must and need to to be strikingly divergent, the former undergoing a strong decline and the latter a strong increase. That colloquialization may be a factor here is suggested by the anti-colloquiality of must (more frequent in General texts than in Blogs) and the colloquiality of need to (more frequent in Blogs than in General texts).

As the ratios in Table 10 show, the (apparently rising) popularity of need to vis-à-vis must is stronger in the IC than the OC. It is, furthermore, slightly stronger in BrE than AmE, with epicentrality in evidence in the dominance of IndE in SA and of SingE in SEA.

4.2. Possibility, Permission, and Ability

4.2.1. May versus Might

The dominant meaning of both may and might in Contemporary English is epistemic possibility. Opinions differ as to the degrees of likelihood they express. Some claim that the degrees are the same, including Coates (1983:152) and Collins (2009a:111), and others argue that may expresses a greater degree of likelihood, including Hermerén (1978) and Palmer (1990). May has shown a greater declining tendency than might (see Table 1). One likely factor in this trend is the anti-colloquiality of may (whose higher frequency in General over Blogs [1.28:1] contrasts with might’s higher frequency in Blogs over General texts [1.10:1]; see Table 2).

Unsurprisingly, as Table 11 shows, it is the typically more advanced “hypercentral” AmE, along with “supercentral” BrE, which show the strongest relative preference for might and dispreference for may, in both the IC and overall. The same relative preference is evidenced by the IC over the OC, by IndE in SA, by SingE in SEA, and by SthAfrE in Afr.

Are there semantic factors influencing the findings presented in Table 11? It is arguable that might is becoming the primary exponent of epistemic possibility. One piece of evidence for this is that collocations of might with the progressive aspect represent 2.75 percent of all tokens of might in GloWbE, compared with only 1.59 percent for may. A further piece of evidence is that coordinative sequences in the GloWbE data where the speaker switches from epistemic may to epistemic might, as in (25), are more common than those from epistemic might to epistemic may (21 versus 7 tokens respectively).

(25) These footwear may or might possibly not have beads, gems etc. (GloWbE, CA)

The differences between may and might that we have observed, when combined with the further finding that the IC versus OC ratio for might be verb-ing (1.22:1) is greater than that for may be verb-ing (1.18:1), suggest that in terms of semantic developments the IC is ahead of the OC.

4.2.2. Might versus Could

The past tense modals might and could differ from their present tense counterparts, may and can, in having two broad uses: temporal and hypothetical. As Table 1 indicates, the more frequently occurring of the two, could, is also the more diachronically stable. The frequencies in Table 2 suggest that these two modals have comparable levels of colloquiality. The ratios in Table 12 present might as more frequent in the IC than the OC, relative to could, as it is in SEA within the OC. Possible explanations are that epistemic possibility is more commonly expressed by might than could, and epistemic might is more speech-friendly than is epistemic could (Collins 2009a:109, 176-177). In the present study, the use of both might and could in the existential-there construction, as in (26), where they predominantly express epistemic meaning, was found to be more frequent in the IC than in the OC (might 1.33:1 versus could 1.05:1).

(26) Given the results so far, there could be 20 to 50 tigers here. (GloWbE, GB)

4.2.3. Can versus May

Can is a high-frequency, diachronically stable modal whereas may is a lower-frequency modal that is undergoing a strong decline (see Table 1). One factor in their contrasting fortunes is that may, as a predominantly epistemic modal, is encountering competition from epistemic might and could, whereas can has little competition as an exponent of dynamic possibility (including ability). Another is that the colloquiality of can (whose distribution is skewed toward Blogs: see Table 2) contrasts with the anti-colloquiality of may (skewed as it is toward General texts).

Table 13 indicates that the frequency of can relative to that of may is marginally stronger in the OC than the IC, with relative ratios suggestive of epicentrality in the three IC subgroups as well as in SA and SEA. The relatively stronger support for may in the IC may be attributable to the greater predilection for epistemic may in the IC than the OC, as noted in section 4.2.1.

4.2.4. Can versus Be Able to

Be able to occupies a semantic niche in the modal system that guarantees its ongoing diachronic viability (see Table 1). While, like can and could, it commonly expresses ability, this quasi-modal more readily carries an implication of actuality. Thus, in (27), was able to conveys the subject referent’s successful throughput achievement, and substitution of could would sound somewhat unnatural.

(27) Running some unencrypted performance tests. I was able to achieve 11.9MB/s (95.2 Mbit/s) throughput across the firewall. (GloWbE, CA)

Be able to is intervarietally stable: as Table 14 indicates, there are only small variations across the pmw frequencies for the twenty varieties, and the can versus be able to ratios are similar in the IC and OC.

4.3. Prediction and Volition

4.3.1. Will versus Shall

The high frequency modal will and the low frequency shall contrast markedly in their diachronic trajectories, the former undergoing a modest decline and the latter a major decline in Leech, Hundt, Mair, and Smith (2009) (see Table 1). One factor in the different fortunes of the two modals here may be their strikingly different generic distribution, as presented in Table 2, where will displays a Blogs versus General text ratio of 1.05:1, and by contrast shall is much more frequent in the General texts, with a ratio of 3.18:1.

Another factor is semantic: will is the primary exponent of epistemic predictability and prediction, while shall is no longer a viable competitor for will in this semantic area, having become predominantly a marker of constitutive/regulative deontic modality (Collins 2009a:126, 135). Will tends strongly toward epistemic meaning when it collocates with hopefully, a combination that is more frequent in the IC (2.9 tokens pmw) than in the OC (1.9). This collocation contrasts with will gladly, which tends strongly to volitional meaning, and is less frequent in the IC (90.3 tokens pmw) than in the OC (101.4).

The pmw frequencies and ratios for uncontracted will and shall in Table 15 indicate that shall is less frequent in the IC than in the OC. Epicentrality appears to be a factor in the demise of shall, as reflected in the ratios for AmE over CanE, BrE over IrE, AusE over NZE, SingE in SEA, and SthAfrE in Afr.

4.3.2. Will versus Be going to

Will and be going to both express predominantly epistemic meanings, where there is arguably little difference between them other than the implicature of immediacy that is typically associated with the quasi-modal (Palmer 1990:144; Collins 2009a:144). Table 1 shows will to be in mild decline, and be going to to be on the rise, with AmE in the lead in both developments. Colloquialization is most likely a contributing factor, with be going to exhibiting a stronger Blogs versus General text ratio than will in Table 2. Further confirmation of this suggestion is provided by the distribution of the reduced form gonna, which is more frequent in the IC than in the OC by a ratio of 1.09:1.

The ratios in Table 16 suggest that be going to has made far greater inroads into will’s territory in the IC than in the OC, and the results are strongly suggestive of epicentral influence: AmE has a far greater proportion of be going to tokens than any other variety, while the ratio for its geographical neighbor CanE is the second strongest in the IC, and the two strongest ratios in the OC belong to AmE’s close neighbor JamE and the AmE-influenced SEA variety PhilE. IndE leads the way in SA, SthAfrE does so in Afr, as does BrE in the IC European subgroup.

That semantic factors may be playing a role in the IC versus OC results is suggested by the frequencies for some epistemically-oriented collocations. When combined with the harmonic adverb probably, the pmw frequency of be going to in the IC (0.81) surpasses that of the OC (0.46) by 1.76:1, a stronger ratio than that for will probably (1.16:1). Even more tellingly, when combined with the stative quasi-modal be able to, the pmw frequency of be going to in the IC (2.8) surpasses that of the OC (2.1) by 1.33:1, whereas will be able to is actually less common in the IC (31.7 pmw) than in the OC (43.3 pmw).
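The IC:OC ratios cited in this paragraph follow directly from the pmw frequencies given, as the short check below shows. The function name and dictionary labels are hypothetical; the frequencies are those reported above, and rounding to two decimal places is assumed.

```python
def ic_oc_ratio(ic_pmw: float, oc_pmw: float) -> float:
    """Ratio of IC to OC frequency, rounded to two decimal places."""
    return round(ic_pmw / oc_pmw, 2)


# (IC pmw, OC pmw) pairs as reported in the text.
reported = {
    "be going to + probably": (0.81, 0.46),
    "be going to + be able to": (2.8, 2.1),
    "will + be able to": (31.7, 43.3),
}

for label, (ic_pmw, oc_pmw) in reported.items():
    print(f"{label}: {ic_oc_ratio(ic_pmw, oc_pmw)}:1")
```

The first two pairs reproduce the ratios of 1.76:1 and 1.33:1, while the third yields a ratio below parity (0.73:1), in line with the observation that will be able to is less common in the IC than in the OC.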

4.3.3. Be About to versus Be Going to

Be about to and be going to both express futurity with an accompanying sense of immediacy, one stronger in the former than the latter. Compare the following examples, where the impending arrival is presented as being more imminent in (28), as suggested by the harmonious collocation with temporal just, than it is in (29).

(28) And Vonner...... if he was in any Premiership side as a regular- guess what? We wouldn’t get him!!...... or do you think Harry Rednapp and Joe Jordan are just about to arrive as well? (GloWbE, GB)

(29) Customers are more likely to ignore a deal if they know another one is going to arrive shortly. (GloWbE, ZA)

Be about to is a low frequency quasi-modal which is on the rise, though less rapidly so than be going to, at least in AmE.⁴ Colloquialization may be a factor, given that the frequency of be going to in Blogs over General texts (1.23:1) exceeds that of be about to (1.13:1) (see Table 2).

The ratios in Table 17 indicate that the frequency of be going to, relative to that of be about to, is greater not only in the IC than in the OC, but also in putatively epicentral varieties than in their neighbors (in AusE than in NZE, and in IndE within SA, SingE within SEA, and SthAfrE within Afr).

5. Conclusion

Let us review the study’s findings in light of the explanatory factors presented in section 2, beginning with Kachru’s (1985) Concentric circles typology of varieties. The IC varieties have been found here to typically have higher quasi-modal frequencies and lower modal frequencies than the OC varieties, suggesting a tendency for the IC to be more advanced than the OC in diachronic trends that have been observed in the literature (notably the declining trajectories of most modals and the rising trajectories of most quasi-modals). Clear cases of this pattern are the relative dominance of have to and need to over must and of need to over need. The apparently greater degree of advancement shown by the IC varieties no doubt reflects the fact that, inter alia, they are longer established and more extensively normativized than are the typically more conservative newer OC varieties (see Mesthrie & Bhatt 2008; Hundt 2009).

The influence of areal proximity is evident in many of the study’s findings. For example, the American, European, and Oceanian regional subgroups of the IC exhibit internal consistencies that are reflective of the historical and geographical ties between their constituent varieties. This tendency is particularly noticeable with the Oceanian varieties, AusE and NZE, whose similar and shared histories are reflected in their postcolonial evolutionary parallels (Schneider 2007:118-133). Co-patterning of AusE and NZE is found with have to versus must, should versus be supposed to, must versus need to, may versus might, might versus could, can versus may, and will versus shall. Of the areal groups in the OC, it is SEA that most often demonstrates a level of advancement approximating that of the IC, as seen in the majority of the alternations examined: have to versus must, have to versus have got to, should versus had better, may versus might, might versus could, can versus be able to, will versus shall, and will versus be going to. This finding no doubt reflects the fact that the SEA varieties have collectively moved further through Schneider’s (2007) five evolutionary phases than have the varieties of SA and Afr, with signs that SingE is entering phase 5 (cf. Percillier 2016) and that PhilE is entering phase 4 (cf. Borlongan 2016).

Linguistic epicentrality, the potential of a variety to influence neighboring varieties, is widely attested in the results. The epicentrality of IndE in SA, vis-à-vis SLE, PakE, and BDE, is suggested in the findings for have to versus must, should versus had better, should versus be supposed to, must versus need to, may versus might, might versus could, will versus be going to, and be about to versus be going to. SEA is a more geographically diverse subgrouping, but it is clear that SingE is the most advanced variety therein (in the ratios for have to versus must, should versus be supposed to, must versus need to, may versus might, will versus shall, will versus be going to, and be about to versus be going to) suggesting its potentially epicentral status. In Afr it is SthAfrE that commonly emerges as the most advanced variety, with respect to should versus ought to, need versus need to, might versus could, will versus shall, will versus be going to, and be about to versus be going to.

As we have seen, in Mair’s (2013) World system model a non-areally-driven concept of epicentrality is applied to the hierarchical interrelationships between WEs in a globalized world, with AmE ascribed “hypercentral” status and with “supercentral” BrE next in the pecking order of English world-wide. The findings of the present study strongly support the putative hypercentrality of AmE, which has the leading ratio overall with have to versus must, should versus had better, should versus be supposed to, may versus might, will versus be going to, and might versus could, and which is just off the lead with be about to versus be going to and must versus need to. Evidence of the supercentrality of BrE is less compelling, though available in the findings for must versus need to where it is ahead of AmE, and have to versus must, and may versus might, where it shares the lead with AmE.

Colloquialization has been postulated as a factor in many of the results, as reflected in the higher frequency of some expressions in “speechy” Blogs than in more formal General texts, and anti-colloquialization as a factor in several others, where the reverse situation is in evidence (see further Collins & Yao 2013; Collins 2015). It is accordingly plausible to assume that colloquialism and informality exert influence, even if only indirectly, on such findings as the tendency for the IC varieties to be more receptive than the OC varieties to the generally increasing use of speech-friendly quasi-modals and the decreasing use of the typically writing-friendly modals (Collins & Yao 2018). Some classic cases of findings influenced by (anti-)colloquialization are those for have to versus must, need versus need to, and will versus shall. I have also cited, as further evidence of the role of colloquialization, the higher incidence in the IC than the OC of the informal reduced forms hafta, gotta, and wanna.

Another finding of the study, that epistemic meanings are more frequent in the IC than the OC varieties, requires a different kind of explanation, one based in cognitive semantics. According to Sweetser (1990), the development of epistemic meanings in the English modal system occurs later than that of root meanings, via the process that she refers to as “subjectification.” There is furthermore some evidence that this pattern may be mirrored ontogenetically in the “history” of individual speakers, with epistemic uses of modals acquired later than root uses (Le Bonniec 1970; Kuczaj & Maratsos 1975; Cournane 2014). More speculatively, it may be suggested that such correspondences extend to dialect formation as well, thereby providing an explanation for why epistemic meanings tend to be more frequent in the longer established IC varieties than in the developing OC varieties. This suggestion is reinforced by Collins’s (2022) finding that epistemic comment markers such as possibly, maybe, probably, presumably, supposedly, and undoubtedly have a distribution similar to that of the epistemic (quasi-)modals studied in this paper.

A number of the study’s results indicate that the ratios-based onomasiological approach achieves a level of descriptive adequacy that surpasses one based solely on pmw frequencies. For example, as we have seen, the relative strength of the IC ratios for the quasi-modals have to, need to, and be supposed to, over those for the modals must, need, and should, respectively, emerges more clearly from alternation-based ratios than it does from their pmw frequencies alone, and thereby better highlights the contrast between the advancement of the IC and the conservatism of the OC.
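
The contrast between the two measures can be made concrete with a minimal sketch. The counts, corpus sizes, and variety labels below are invented for illustration and are not drawn from GloWbE, and the ratio is assumed here to be the frequency of the first member of a pairing divided by that of the second, which may differ from the exact calculation used in this study.

```python
# Hypothetical counts (not GloWbE data) for one onomasiological pairing,
# "have to" versus "must", in two invented varieties of equal corpus size.
CORPUS_WORDS = {"Variety A": 10_000_000, "Variety B": 10_000_000}
RAW_COUNTS = {
    "Variety A": {"have to": 12_000, "must": 6_000},
    "Variety B": {"have to": 11_000, "must": 11_000},
}


def pmw(count, corpus_words):
    """Normalise a raw count to a frequency per million words (pmw)."""
    return count * 1_000_000 / corpus_words


def alternation_ratio(variety, first, second):
    """Assumed ratio: pmw frequency of `first` divided by that of `second`."""
    size = CORPUS_WORDS[variety]
    counts = RAW_COUNTS[variety]
    return pmw(counts[first], size) / pmw(counts[second], size)


for variety in RAW_COUNTS:
    have_to_pmw = pmw(RAW_COUNTS[variety]["have to"], CORPUS_WORDS[variety])
    ratio = alternation_ratio(variety, "have to", "must")
    print(f"{variety}: have to = {have_to_pmw:.0f} pmw, have to/must = {ratio:.2f}")
```

On these invented figures the pmw frequencies of have to differ only modestly (1,200 versus 1,100), whereas the ratios (2.00 versus 1.00) separate the two varieties clearly, because the ratio also registers how far must has receded.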

In this study, historical modal and quasi-modal trajectories gleaned from corpus-based studies of BrE and AmE have been cited to support inferences of advanced and conservative modal trends drawn from synchronic multi-varietal GloWbE data. Because diachronic corpora are available for only a few of the GloWbE varieties, the status of such developmental inferences must ultimately be regarded as provisional, awaiting empirical substantiation via real-time diachronic data. I conclude by repeating my exhortation of 2015 to colleagues that they “address the ‘diachronic gap’ in the World Englishes paradigm” by the imaginative use of not only available corpora but also “newly-prepared purpose-built corpora” (Collins 2015:10).⁵

Footnotes

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Author Biography

Peter Collins is honorary professor in Linguistics at the University of New South Wales in Sydney, Australia. He has published widely in English grammar, corpus linguistics, and World Englishes.

References

GloWbE = Davies, Mark. 2013. Corpus of Global Web-Based English: 1.9 billion words from speakers in 20 countries. http://corpus.byu.edu/glowbe/.

Aarts, Bas, Joanne Close & Sean Wallis. 2013. Choices over time: Methodological issues in investigating current change. In Bas Aarts, Joanne Close, Geoffrey Leech & Sean Wallis (eds.), The verb phrase in English: Investigating recent language change with corpora, 14-45. Cambridge: Cambridge University Press.

Berkenfield, Catie. 2006. Pragmatic motivations for the development of evidential and modal meaning in the construction “be supposed to X.” Journal of Historical Pragmatics 7(1). 39-71.

Borlongan, Ariane. 2016. Relocating Philippine English in Schneider’s dynamic model. Asian Englishes 18(3). 232-241.

Coates, Jennifer. 1983. The semantics of the modal auxiliaries. London: Croom Helm.

Coates, Jennifer. 1995. The expression of root and epistemic possibility in English. In Joan Bybee & Susan Fleischman (eds.), Modality in grammar and discourse, 55-66. Amsterdam: John Benjamins.

Collins, Peter. 2009a. Modals and quasi-modals in English. Amsterdam: Rodopi.

Collins, Peter. 2009b. Modals and quasi-modals in World Englishes. World Englishes 28(3). 281-292.

Collins, Peter. 2014. Modal expressions in Malaysian English. In Shakila Abdul Manan & Hajar Abdul Rahim (eds.), English in Malaysia: Postcolonial and beyond, 127-160. Bern: Peter Lang.

Collins, Peter. 2015. Introduction. In Peter Collins (ed.), Grammatical change in English world-wide, 1-11. Amsterdam: John Benjamins.

Collins, Peter. 2022. Comment markers in World Englishes. World Englishes 41(2). 244-270.

Collins, Peter. 2023. Grammatical variation in World Englishes: An onomasiological study. English World-Wide 44(2). 184-218.

Collins, Peter, Ariane Borlongan & Xinyue Yao. 2014. Modality in Philippine English: A diachronic study. Journal of English Linguistics 42(1). 68-88.

Collins, Peter & Xinyue Yao. 2012. Modals and quasi-modals in New Englishes. In Marianne Hundt & Ulrike Gut (eds.), Mapping unity and diversity world-wide: Corpus-based studies of New Englishes, 35-53. Amsterdam: John Benjamins.

Collins, Peter & Xinyue Yao. 2013. Colloquial features in World Englishes. International Journal of Corpus Linguistics 18(4). 479-505.

Collins, Peter & Xinyue Yao. 2018. Colloquialisation and the evolution of Australian English: A cross-varietal and cross-generic study of Australian, British and American English from 1931 to 2006. English World-Wide 39(3). 253-277.

Cournane, Ailís. 2014. In search of L1 evidence for diachronic reanalysis: Mapping modal verbs. Language Acquisition 21(1). 103-117.

Davies, Mark & Robert Fuchs. 2015. Expanding horizons in the study of World Englishes with the 1.9 billion word Global Web-based English Corpus (GloWbE). English World-Wide 36(1). 1-28.

Depraetere, Ilse & Susan Reed. 2021. Mood and modality in English. In Bas Aarts, April McMahon & Lars Hinrichs (eds.), The handbook of English linguistics, 2nd edn, 207-227. Hoboken, NJ: Wiley Blackwell.

Deuber, Dagmar, Carolin Biewer, Stephanie Hackert & Michaela Hilbert. 2012. Will and would in selected New Englishes: General and variety-specific tendencies. In Marianne Hundt & Ulrike Gut (eds.), Mapping unity and diversity world-wide: Corpus-based studies of New Englishes, 77-102. Amsterdam: John Benjamins.

Diaconu, Gabriela Veronica. 2012. Modality in New Englishes: A corpus-based study of obligation and necessity. Freiburg im Breisgau, Germany: Freiburg University PhD dissertation.

Gries, Stefan Th. & Tobias Bernaisch. 2016. Exploring epicentres empirically: Focus on South Asian Englishes. English World-Wide 37(1). 1-25.

Heller, Benedikt, Benedikt Szmrecsanyi & Jason Grafmiller. 2017. Stability and fluidity in syntactic variation world-wide: The genitive alternation across varieties of English. Journal of English Linguistics 45(1). 3-27.

Hermerén, Lars. 1978. On modality in English: A study of the semantics of the modals. Lund: CWK Gleerup.

Huddleston, Rodney & Geoffrey Pullum. 2002. The Cambridge grammar of the English language. Cambridge: Cambridge University Press.

Hundt, Marianne. 2009. Colonial lag, colonial innovation or simply language change? In Günter Rohdenburg & Julia Schlüter (eds.), One language, two grammars: Differences between British and American English, 13-37. Cambridge: Cambridge University Press.

Hundt, Marianne. 2013. The diversification of English: Old, new and emerging epicentres. In Daniel Schreier & Marianne Hundt (eds.), English as a contact language, 183-203. Cambridge: Cambridge University Press.

Isingoma, Bebwa & Christiane Meierkord. 2019. Capturing the lexicon of Ugandan English: ICE-Uganda, its limitations, and effective complements. In Alexandra U. Esimaje, Ulrike Gut & Bassey E. Antia (eds.), Corpus linguistics and African Englishes, 294-328. Amsterdam: John Benjamins.

Jacobsson, Bengt. 1980. On the syntax and semantics of the modal auxiliary had better. Studia Neophilologica 52(1). 47-53.

Kachru, Braj. 1985. Standards, codification, and sociolinguistic realism: The English language in the Outer Circle. In Randolph Quirk & Henry G. Widdowson (eds.), English in the world: Teaching and learning the language and literatures, 11-30. Cambridge: Cambridge University Press.

Krug, Manfred G. 2000. Emerging English modals: A corpus-based study of grammaticalization. Berlin: Mouton de Gruyter.

Kruger, Haidee & Adam Smith. 2018. Colloquialisation versus densification in Australian English: A multidimensional analysis of the Australian Diachronic Hansard Corpus (ADHC). Australian Journal of Linguistics 38(3). 293-328.

Kuczaj, Stan & Michael Maratsos. 1975. What children can say before they will. Merrill-Palmer Quarterly of Behavior and Development 21(2). 89-111.

Laliberté, Catherine. 2022. A diachronic study of modals and semi-modals in Indian English newspapers. Journal of English Linguistics 50(2). 142-168.

Le Bonniec, Gilberte. 1970. Etude genetique des aspects du raisonnement. Paris: Laboratoire de Psychologie. Ecole Pratique des Hautes Etudes (Vie section).

Leech, Geoffrey, Marianne Hundt, Christian Mair & Nicholas Smith. 2009. Change in contemporary English: A grammatical study. Cambridge: Cambridge University Press.

Loureiro-Porto, Lucía. 2017. ICE versus GloWbE: Big data and corpus compilation. World Englishes 36(3). 448-470.

Loureiro-Porto, Lucía. 2019. Grammaticalization of semi-modals of necessity in Asian Englishes. English World-Wide 40(2). 115-142.

Mair, Christian. 2004. Corpus linguistics and grammaticalization theory. In Hans Lindquist & Christian Mair (eds.), Corpus approaches to grammaticalization in English, 121-150. Amsterdam: John Benjamins.

Mair, Christian. 2013. The world system of Englishes: Accounting for the transnational importance of mobile and mediated vernaculars. English World-Wide 34(3). 253-257.

Mair, Christian & Geoffrey N. Leech. 2021. Current changes in English syntax. In Bas Aarts, April McMahon & Lars Hinrichs (eds.), The handbook of English linguistics, 2nd edn, 249-276. Hoboken, NJ: Wiley Blackwell.

Mazzon, Gabriella. 2019. Variation in the expression of stance across varieties of English. World Englishes 38(4). 593-605.

Mesthrie, Rajend & Rakesh Bhatt. 2008. World Englishes: The study of new linguistic varieties. Cambridge: Cambridge University Press.

Moore, Colette. 2007. The spread of grammaticalized forms: The case of be supposed to. Journal of English Linguistics 35(2). 117-131.

Mukherjee, Joybrato & Marco Schilk. 2012. Exploring variation and change in New Englishes. Looking into the International Corpus of English (ICE) and beyond. In Terttu Nevalainen & Elizabeth Closs Traugott (eds.), The Oxford handbook of the history of English, 189-199. Oxford: Oxford University Press.

Nelson, Gerald. 2015. Response to Davies and Fuchs. English World-Wide 36(1). 38-40.

Noël, Dirk & Johan van der Auwera. 2009. Revisiting be supposed to from a diachronic constructionist perspective. English Studies 90(5). 599-623.

Noël, Dirk & Johan van der Auwera. 2015. Recent quantitative changes in the use of modals and quasi-modals in the Hong Kong, British and American Printed Press. Exploring the potential of Factiva® for the diachronic investigation of World Englishes. In Peter Collins (ed.), Grammatical change in English world-wide, 437-464. Amsterdam: John Benjamins.

Palmer, Frank. 1990. Modality and the English modals. 2nd edn. London: Longman.

Percillier, Michael. 2016. World Englishes and second language acquisition: Insights from Southeast Asian Englishes. Amsterdam: John Benjamins.

Perkins, Michael. 1983. Modal expressions in English. London: Frances Pinter.

Peters, Pam. 2009. Australian English as a regional epicentre. In Thomas Hoffmann & Lucia Siebers (eds.), World Englishes – Problems, properties and prospects, 107-124. Amsterdam: John Benjamins.

Peters, Pam & Peter Collins. 2012. Colloquial Australian English. In Bernd Kortmann & Kerstin Lunkenheimer (eds.), The Mouton world atlas of variation in English, 585-595. Berlin: Mouton de Gruyter.

Peters, Pam, Peter Collins & Adam Smith. 2009. Comparative studies in Australian and New Zealand English: Grammar and beyond. Amsterdam: John Benjamins.

Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A comprehensive grammar of the English language. London: Longman.

Schneider, Edgar. 2003. The dynamics of New Englishes: From identity construction to dialect birth. Language 79(2). 233-281.

Schneider, Edgar. 2007. Postcolonial English: Varieties around the world. Cambridge: Cambridge University Press.

Schneider, Edgar. 2014. English around the world: An introduction. Cambridge: Cambridge University Press.

Smith, Nicholas. 2003. Changes in the modals and semi-modals of strong obligation and epistemic necessity in recent British English. In Roberta Facchinetti, Manfred Krug & Frank Palmer (eds.), Modality in contemporary English, 241-266. Berlin: Mouton de Gruyter.

Strunk, William & E. B. White. 2000. The elements of style. 4th edn. New York, NY: Allyn & Bacon.

Sweetser, Eve. 1990. From etymology to pragmatics: Metaphorical and cultural aspects of semantic structure. Cambridge: Cambridge University Press.

Szmrecsanyi, Benedikt. 2017. Variationist sociolinguistics and corpus-based variationist linguistics: Overlap and crosspollination potential. Canadian Journal of Linguistics 62(4). 1-17.

Szmrecsanyi, Benedikt, Jason Grafmiller, Joan Bresnan, Anette Rosenbach, Sali Tagliamonte & Simon Todd. 2017. Spoken syntax in a comparative perspective: The dative and genitive alternation in varieties of English. Glossa 2(1). Art 86. 1-27.

van der Auwera, Johan, Dirk Noël & Astrid de Wit. 2012. The diverging need (to)’s of Asian Englishes. In Marianne Hundt & Ulrike Gut (eds.), Mapping unity and diversity world-wide: Corpus-based studies of New Englishes, 54-75. Amsterdam: John Benjamins.

van der Auwera, Johan, Dirk Noël & An Van linden. 2013. Had better, ‘d better and better: Diachronic and transatlantic variation. In Juana I. Marín-Arrese, Marta Carretero, Jorge Arús Hita & Johan van der Auwera (eds.), English modality. Core, periphery and evidentiality, 119-154. Berlin: Mouton de Gruyter.

van Rooy, Bertus & Ronel Wasserman. 2014. Do the modals of Black and White South African English converge? Journal of English Linguistics 42(1). 51-67.

Wärnsby, Anna. 2006. (De)coding modality: The case of must, may, måste, and kan. Lund: Lund University.

Westney, Paul. 1995. Modals and periphrastics in English: An investigation into the semantic correspondence between certain English modal verbs and their periphrastic equivalents. Tübingen: Max Niemeyer.

Ziegeler, Debra. 2016. The diachrony of modality and mood. In Jan Nuyts & Johan van der Auwera (eds.), The Oxford handbook of modality and mood, 387-405. Oxford: Oxford University Press.