Do Sitcom Conversations Fully Depict Those in Natural Settings: A Corpus-Based Lexical Analysis

Abstract

An increasing number of studies in pragmatics, second language acquisition, and related fields have opted to use sitcom conversations as a substitute for natural conversations in their analyses. However, few studies have critically examined the validity of this substitution. The present study therefore examines the lexical similarities and differences between sitcom and natural conversations, using the Friends Corpus and the Santa Barbara Corpus of Spoken American English as samples under a synthesized analytic framework of 10 lexical categories that have been frequently examined in previous research. The findings indicate that there are significant differences between sitcom and natural conversations at the lexical level, particularly in terms of word lengths, keywords, and the use of discourse markers, personal pronouns, vocatives, and religious words. While sitcom conversations tend to be more concise, interactive, evaluative, and involving, natural conversations provide more explanations for speech acts and refer more often to third parties who are not present in the conversations. Additionally, sitcom conversations use more intensifiers and vocatives while using fewer tentative modals, expletives, and religious words. Based on these results, it can be concluded that sitcom conversations do not fully depict conversations in natural settings, and thus substituting natural conversations with those in sitcoms should be approached with caution in language teaching and research. This study provides insight into the differences in lexical patterns between the two types of conversations and highlights the importance of using natural conversations as a basis for language teaching and research.

Keywords

sitcom conversations, natural conversations, lexical features, comparisons

Research Background

Corpus linguistics has emerged as a significant research domain in linguistics, and is now considered a mainstream methodology in linguistic analyses (Clancy, 2011; O’Keeffe & Walsh, 2012; Taljard, 2014; Upton & Cohen, 2009). In particular, studies in pragmatics have increasingly turned to corpus data as a means of exploring various linguistic phenomena such as discourse markers, stance markers, conversational patterns and structures, and (im)politeness. These studies are often based on data from a range of corpora, ready-made (e.g., BNC, COCA, NOW, MICASE) (e.g., C. Y. Lin, 2017; Skalicky et al., 2015), self-developed (e.g., Fuentes-Rodríguez et al., 2016; Polat, 2011), or both (e.g., Yeung, 2009). While most ready-made corpora are highly representative of the genres they aim to represent, some may suffer from a lack of representativeness, particularly in areas outside of corpus linguistics such as pragmatics. As a result, many researchers in this field have turned to conversations from popular media such as sitcoms (e.g., Forcadell, 2016; Kim, 2014; Su, 2017), movies (e.g., Ionescu, 2020), and television (e.g., Sinkeviciute & Rodriguez, 2021) as data sources, due to the convenience of data collection and processing, as well as their flexibility in meeting various research objectives. However, it is important to examine whether conversations in popular media can adequately substitute for natural conversations in linguistic analyses. Specifically, it is necessary to consider whether conversations in sitcoms and other media differ substantially from those that occur in natural settings. This is an important question that warrants further investigation, as the use of popular media as data sources in linguistic analyses may introduce biases or limitations in the findings.

In addition to their usefulness in linguistic analysis, data from popular media have also been widely utilized as a resource for language learning and teaching. Researchers have examined the efficacy of using films in language teaching (e.g., Allan, 1985; Rose, 2001; Stempleski & Tomalin, 1990), and explored how movies, television, and sitcoms can be used more effectively to teach a second language (e.g., Allan, 1985; Candlin et al., 1982; Eisenstein et al., 1987; Grant & Starks, 2001; P. M. S. Lin, 2014; Pattemore & Muñoz, 2020; Ruck, 2022). Studies have shown that popular media can help develop learners’ vocabulary competence (e.g., Bisson et al., 2013; Csomay & Petrović, 2012; MacFadden et al., 2009; Neuman & Koskinen, 1992; Rodgers & Webb, 2011; Webb, 2010, 2011), pragmatic competence (e.g., Alerwi & Alzahrani, 2020; Barón & Celaya, 2022; Derakhshan & Eslami, 2020; Omar & Razı, 2022), and discoursal and stylistic competence (e.g., Marshall & Werndly, 2002; Meinhoe, 1998), and can facilitate acculturation to foreign cultures (e.g., Meinhoe, 1998; Tolson, 2001). Several studies have found that learners with extensive exposure to popular media perform better in learning vocabulary (e.g., Bisson et al., 2013; Verspoor et al., 2011) and in developing writing competence (e.g., Verspoor et al., 2011).

The use of conversations in popular media (e.g., sitcoms, movies, television) as a substitute for natural conversations, either for research or pedagogical purposes, rests on the assumption that they are not significantly different from conversations in natural settings. Some empirical studies have shown that the language used in films is “most representative of naturally-occurring data” (Rose, 2001, p. 309), and similar findings have been reported for internet television (e.g., P. M. S. Lin, 2014) and other settings (e.g., Abrams, 2014; Dynel, 2011; Quaglio, 2009; Richardson, 2010). However, other studies have found that the language in popular media differs to varying degrees from that in natural settings (Dose, 2013; Kozloff, 2000; Marshall & Werndly, 2002; Rose, 2001; Rossi, 2011). For instance, P. M. S. Lin (2014) has shown that different television genres resemble natural conversations to various extents, with entertainment and music programs having the least resemblance, and religion, news, and comedy programs having the most. Consequently, some scholars argue that the language in television, sitcoms, and movies can never be a viable substitute for natural conversations (Emmison, 1993; Schegloff, 1988), and that it is preferable to adopt natural conversations in compiling teaching materials and conducting research (Abrams, 2014; Gilmore, 2004; Huth & Taleghani-Nikazm, 2006; Wong, 2002). The controversies over the authenticity of conversations in popular media have led to the present study, which aims to compare the lexical features of conversations in sitcoms and natural conversations, and to reveal to what extent they resemble each other. It should be noted that this study only focuses on the lexical differences between the two genres, while acknowledging that there are other perspectives from which their differences could be examined.

Analytic Framework

This study aims to analyze and contrast the lexical features of conversations in sitcoms and natural settings. Specifically, eight categories will be compared: discourse markers, modal verbs, intensifiers, downtoners, personal pronouns, vocatives, expletives, and religious words. These categories were selected because they are frequently discussed in the field of pragmatics. Furthermore, the study will also compare the average word lengths (i.e., the average number of letters in a word) and keywords (excluding proper nouns) used in sitcoms and natural conversations, which are two of the key analyses in corpus linguistics. In total, therefore, the study will compare 10 categorical features, as depicted in Figure 1.

Figure 1.

Lexical features to be compared in the study.

Word Length

Word length refers to the average number of letters comprising the words in a corpus. The inclusion of word length as a measure of lexical features is due to the findings from previous research that have demonstrated significant differences in word length between natural conversations and prepared speech, such as writing (Fan, 2011; Shi & Lei, 2021). Sitcoms employ a language that is “prepared” for speaking, as it is first scripted and then spoken by the actors or actresses. This differs from natural conversations, which are mostly impromptu and unprepared. Therefore, the observation of word length in sitcoms and natural conversations can serve as an indicator to determine the authenticity of sitcom conversations in portraying those in natural settings.
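The measure defined above can be illustrated with a minimal Python sketch. It is not the procedure used by the corpus software in this study: the function name is hypothetical, and the tokenization rule (runs of letters, with word-internal apostrophes kept in the token but not counted as letters) is an assumption made for illustration.

```python
import re

def average_word_length(text: str) -> float:
    """Mean number of letters per word token in a text."""
    # Assumption: a token is a run of letters, optionally with internal apostrophes.
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    if not tokens:
        return 0.0
    # Count alphabetic characters only, so the apostrophe in "don't" is ignored.
    letters = sum(sum(ch.isalpha() for ch in tok) for tok in tokens)
    return letters / len(tokens)

sample = "Well, I don't know what you mean."
print(round(average_word_length(sample), 2))  # 24 letters over 7 tokens
```

Different tokenization choices (for example, splitting contractions into two tokens) would shift the average slightly, so figures produced this way are only comparable when both corpora are processed identically.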

Keywords

Keywords are words that appear “in a text or corpus statistically significantly more frequent than would be expected by chance when compared to a corpus” (Baker et al., 2006, p. 97). The comparison of keywords between sitcoms and natural settings can provide valuable insights into whether these two genres utilize similar lexical choices. In order to accurately assess the lexical features of each genre, proper names such as character, place, and institution names should be excluded from analysis, as these are often topic-specific and may not accurately reflect the generic features of the genres being compared.
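To make the notion of keyness concrete, the sketch below scores each word with a signed log-likelihood statistic and keeps those above a significance threshold. This is an illustration under stated assumptions, not the keyword procedure of the software used in this study: the choice of log-likelihood as the statistic, the function names, and the threshold of 3.84 (the critical value for p < .05 with one degree of freedom) are all assumptions. Both token lists are assumed to be non-empty, and proper names are assumed to have been removed beforehand.

```python
import math
from collections import Counter

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Signed log-likelihood keyness; negative means underuse in the study corpus."""
    total = freq_study + freq_ref
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    g2 = 0.0
    if freq_study > 0:
        g2 += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    g2 *= 2
    overused = freq_study / size_study >= freq_ref / size_ref
    return g2 if overused else -g2

def keywords(study_tokens, ref_tokens, threshold=3.84):
    """Words whose absolute keyness reaches the threshold, most overused first."""
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    n_study, n_ref = len(study_tokens), len(ref_tokens)
    scored = []
    for word in set(study) | set(ref):
        score = log_likelihood(study[word], n_study, ref[word], n_ref)
        if abs(score) >= threshold:
            scored.append((word, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data only; these are not counts from either corpus.
study = ["oh"] * 30 + ["the"] * 70
reference = ["oh"] * 5 + ["the"] * 95
print(keywords(study, reference))
```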

Other Categories

The study will also evaluate the other lexical features indicated in Figure 1, such as discourse markers, modal verbs, intensifiers, and downtoners. The specific words under scrutiny are presented in Table 1, primarily based on previous research by Biber et al. (1999, pp. 554–556, 564–569, 1086–1088, 1110–1111), Traugot (2020), Halliday and Matthiessen (2004, p. 116), Richard and Schmidt (2010, p. 184), and Martínez (2018). Note that multiword items such as you know, you know what, you see, and I mean will be treated as lexical rather than phrasal features, since each conveys a coherent, indispensable meaning as a whole and is categorized as a typical discourse marker.

Table 1.

Words Selected to be Searched in Different Lexical Categories.

Lexical categories	Sample words
Discourse markers	ah, all right, by the way, I mean, I see, I think, now, oh, ok/okay, right, well, yeah, you know, you see.
Modals	can, may, could, might, will, would, should, shall, must, ought to, need, have/has/had to.
Intensifiers	absolutely, bloody, completely, damn, entirely, extremely, fully, highly, incredibly, more, real, really, so, surely, terribly, too, totally, very, certainly, obviously.
Downtoners	almost, fairly, hardly, less, likely, nearly, partially, pretty, probably, quite, rather, relatively, slightly, somewhat, kind of, sort of, something like, just, perhaps.
Personal pronouns	I, we, me, us, you, he, she, it, they, him, her, it, them, my, our, mine, ours, your, yours, his, her, their, hers, its, theirs.
Vocatives	baby, boy(s), bro(s)/brother(s), bud/buddy, darling, dear, dude, folks, gee, girl(s), guys, honey, lad, man, mate, pal, sister, sweetie.
Expletives	shit, fuck/fucking, devil, hell, damn, bloody, bullshit, asshole.
Religious words	Christ, (my) God, (my) goodness, (my) gosh, heavens, Lord, Jesus, my!

Methodology

Data Source

The data utilized in the study was derived from two primary sources. The first source was obtained from the classic sitcom Friends, which serves as a representative sample for sitcom conversations. The second source of data was drawn from the Santa Barbara Corpus of Spoken American English (SBC for short hereafter), which was used as a representative sample for natural conversations.

Friends Corpus

The Friends Corpus (FC for short hereafter) was obtained from the website Crazy for Friends (http://www.livesinabox.com/friends/scripts.shtml), a fan club that provides free transcripts of the sitcom for educational and entertainment purposes. According to Quaglio (2009, p. 30), the transcripts are “fairly accurate” and “extremely detailed,” which supports the reliability of these data as samples of sitcom conversations. The choice of Friends as a sample was mainly based on its widespread popularity in America (and many other English and non-English speaking countries), and its significant impact on American culture, including language use among speakers of diverse groups (Quaglio, 2009, p. 12).

The investigation of the linguistic features requires manual tagging or refinement because many words can be used in both pragmatic and non-pragmatic ways. For instance, the word well is a common discourse marker in native-speaker conversations, but this use cannot be reliably identified through automatic corpus tools alone, as the word can also function as a noun (e.g., He dug a well) and an adverb (e.g., Well done!). Therefore, a combination of automatic searching and manual refinement is required to distinguish its discourse marker use from its non-marker use. In many cases, such as when determining whether boy is used as a vocative or a noun, this manual refinement is exceedingly time-consuming, which has constrained the size of the corpora utilized in the analysis.
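The automatic half of this search-then-refine workflow can be sketched as a simple keyword-in-context (KWIC) listing, which surfaces every candidate hit for a human coder to classify. The Python below is a hypothetical illustration, not the concordancer used in the study; the function name, tokenization rule, and window size are assumptions.

```python
import re

def kwic(text, keyword, window=5):
    """Return keyword-in-context lines so each hit can be coded manually."""
    # Assumption: tokens are words (with apostrophes) or single punctuation marks.
    tokens = re.findall(r"[A-Za-z']+|[.,!?;]", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

sample = "Well, I think so. He dug a well. Well done!"
for line in kwic(sample, "well", window=3):
    print(line)
```

All three hits are returned; only a human reading of the context can decide that the first is a discourse marker while the second is a noun and the third an adverb.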

To ensure feasibility and comparability, the study analyzed the first three episodes of each of the 10 seasons, resulting in a total of 30 episodes. The FC consists of 77,199 word tokens, which fall into 4,649 word types, as indicated in Table 2.

Table 2.

Basic Information About the Two Corpora in the Study.

Corpus	Tokens	Types	TTR	STTR	Word length
FC	77,199	4,649	6.02	34.22	3.70
SBC	91,233	6,570	7.20	33.24	3.63
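The type-token ratio (TTR) reported in Table 2 is the number of types divided by the number of tokens, expressed as a percentage; the standardized ratio (STTR) averages the TTR over consecutive equal-sized chunks so that corpora of different sizes can be compared. The following Python sketch illustrates both. It is not the corpus software's implementation: the function names are hypothetical, and the 1,000-token chunk size is an assumption, since the study does not state the setting used.

```python
def ttr(tokens):
    """Type-token ratio as a percentage."""
    return 100 * len(set(tokens)) / len(tokens) if tokens else 0.0

def sttr(tokens, chunk_size=1000):
    """Mean TTR over consecutive full chunks; a trailing partial chunk is ignored."""
    chunks = [tokens[i:i + chunk_size]
              for i in range(0, len(tokens) - chunk_size + 1, chunk_size)]
    if not chunks:
        # Text shorter than one chunk: fall back to the plain TTR.
        return ttr(tokens)
    return sum(ttr(chunk) for chunk in chunks) / len(chunks)

# Recomputing the FC ratio directly from the counts reported in Table 2.
fc_ttr = 100 * 4649 / 77199
print(round(fc_ttr, 2))  # 6.02
```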

SBC Corpus

SBC is a representative sample of contemporary spoken American English, consisting of recordings of naturally occurring interactions among speakers of different ages, genders, occupations, and social backgrounds. It mainly includes face-to-face conversations, as well as other types of language use such as telephone communications, pub talks, classroom lectures, and meetings (Du Bois et al., 2000–2005). This makes it comparable to the language used in FC, in which face-to-face conversations are also the predominant form. SBC has 60 files, and this study chose the first 21 as a sample for natural conversations. The sample consists of 91,233 word tokens, which fall into 6,570 word types, as shown in Table 2. The selection of a reference corpus (i.e., SBC) that is larger than the observed corpus (i.e., FC) is in line with established practices in corpus linguistics (Liang et al., 2010, p. 86).

Instruments

The present study utilized two software programs as instruments, namely WordSmith Tools 6.0 and Loglikelihood and Chi-square Calculator 1.0. WordSmith Tools is a computer program designed to analyze how words behave in texts. It consists of three tools: WordList, Concord, and KeyWords. WordList provides users with the frequency of words or word clusters within a corpus, while Concord enables users to observe the co-occurrence of a word or word cluster in context. KeyWords facilitates the identification of keywords within a given corpus in comparison with another. Using WordSmith Tools, the present study compared the lexical similarities and differences between the two corpora.

Furthermore, the Loglikelihood and Chi-square Calculator 1.0, developed by Maocheng Liang at Beihang University, China, was used to calculate the degrees of difference between the two corpora on the 10 categories described in the Analytic Framework section. The chi-square test utilized in the program is the classic chi-square test with Yates’ correction for 2 × 2 tables (Liang, 2010).
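As a rough illustration of the statistic described above, the following Python sketch computes a Yates-corrected chi-square for a 2 × 2 table (occurrences of an item versus all other tokens in each corpus) and the corresponding p value for one degree of freedom. It is not the calculator's own code: the function names are hypothetical, and the sign convention (negative for underuse in the first corpus) is an assumption inferred from how values are reported in this study.

```python
import math

def yates_chi_square(freq_a, size_a, freq_b, size_b):
    """Signed Yates-corrected chi-square for item counts in two corpora."""
    a, b = freq_a, size_a - freq_a   # corpus A: item vs. all other tokens
    c, d = freq_b, size_b - freq_b   # corpus B: item vs. all other tokens
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    # Yates' continuity correction subtracts n/2 from |ad - bc|, floored at zero.
    diff = max(abs(a * d - b * c) - n / 2, 0)
    chi2 = n * diff ** 2 / denom
    # Assumed convention: negative values mark underuse in corpus A.
    return chi2 if a / size_a >= c / size_b else -chi2

def p_value(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(abs(chi2) / 2))

# Counts for "would" as reported in Table 7 (FC vs. SBC).
chi2 = yates_chi_square(161, 77199, 291, 91233)
print(round(chi2, 2), p_value(chi2) < 0.05)
```

With the counts for would from Table 7, the result should come out close to the reported value of −18.64, which suggests the sketch is at least consistent with the figures in the tables.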

Procedures of Data Collection and Analysis

The study employed several methodological procedures to analyze the lexical features of sitcom and natural conversations. First, WordSmith Tools 6.0 was utilized to obtain the general information of the two corpora, including types, tokens, TTR, STTR, and word length. Then, the software was utilized to extract the raw frequencies of the words listed in Table 1, and irrelevant instances, such as well used as a noun instead of a discourse marker, were eliminated. Finally, the raw frequencies were input into the Loglikelihood and Chi-square Calculator (Liang, 2010) to measure the degrees of difference in the lexical features between sitcom and natural conversations.

Research Findings

Word Lengths

In the field of corpus linguistics, word length refers to the average number of letters in a word. It has been observed that different genres of language use tend to have varying word lengths. For example, Shi and Lei (2021) report that the average word length in spoken language is approximately 3.7 letters, while that in written language is around 4.4 letters (Fan, 2011). Therefore, it is necessary to investigate and compare the word lengths in FC and SBC to determine whether there are significant differences in language use between the two corpora.

As presented in Table 2, this study found that the average word length in FC is approximately 3.70 letters, which is quite similar to that of SBC (3.63 letters). This suggests that the two corpora do not differ substantially in average word length. This finding is consistent with the observation by Shi and Lei (2021) that spoken language tends to employ shorter, simpler words, typically consisting of about four letters.

Moreover, both FC and SBC exhibit a similar pattern in the frequency rankings of words of different lengths. Specifically, 4-letter words occupy the top rank, followed by 3-letter and 2-letter words, as shown in Table 3.

Table 3.

Frequency of Words of Different Lengths in FC and SBC.

Word length	FC RF	FC SF	SBC RF	SBC SF	χ²	p
1-letter words	7,231	9,366.70	5,917	6,485.59	481.90	.00
2-letter words	13,851	17,941.94	16,713	18,319.03	−3.98	.04
3-letter words	17,580	22,772.32	20,370	22,327.45	4.72	.03
4-letter words	18,445	23,892.80	21,810	23,905.82	−0.00	.95
5-letter words	8,643	11,195.74	10,791	11,827.96	−16.31	.00
6-letter words	5,142	6,660.71	6,307	6,913.07	−4.16	.04
7-letter words	3,072	3,979.33	4,330	4,746.09	−58.33	.00
8-letter words	1,738	2,251.32	2,226	2,439.91	−6.39	.01
9-letter words	825	1,068.67	1,362	1,492.88	−58.39	.00
10-letter words	407	527.21	725	794.67	−44.41	.00
>10-letter words	265	343.27	682	747.54	−121.51	.00

Note. RF = raw frequency; SF = standardized frequency (per 100,000). A negative χ² value indicates underuse in FC relative to SBC.
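The raw and standardized frequencies in Table 3 can be derived as in the sketch below, which bins tokens by length (pooling everything above 10 letters) and scales each count to a rate per 100,000 tokens. The function name and the dictionary layout are hypothetical, and the sketch assumes the corpus has already been tokenized.

```python
from collections import Counter

def length_distribution(tokens, max_len=10, per=100_000):
    """Raw and standardized frequencies by word length, pooling lengths above max_len."""
    counts = Counter(min(len(tok), max_len + 1) for tok in tokens)
    total = len(tokens)
    table = {}
    for length in range(1, max_len + 2):
        raw = counts.get(length, 0)
        label = f">{max_len}" if length > max_len else str(length)
        # Standardized frequency = raw count scaled to a common corpus size.
        table[label] = (raw, per * raw / total if total else 0.0)
    return table

# Standardizing one reported raw count: 1-letter words in FC (7,231 of 77,199 tokens).
print(round(100_000 * 7231 / 77199, 2))  # 9366.7
```

Standardizing in this way is what allows rows to be compared across corpora of unequal size, such as the 77,199-token FC and the 91,233-token SBC sample.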

Regarding general frequencies, the analysis shows that FC significantly overuses 1- and 3-letter words, whereas it significantly underuses 2-, 5-, 6-, 7-, 8-, 9-, and 10-letter words and words of more than 10 letters. This suggests that the language used in sitcoms tends to rely on shorter words than that found in SBC, and thus may oversimplify the language used in conversations. Such findings may indicate that sitcom scriptwriters hold inaccurate assumptions regarding the language used in everyday conversations, as words used in natural conversations are likely not as short or as simple as those in sitcoms.

Keywords

Keywords in Friends

According to Chen (2012, p. 213), a keyword is defined as a word that occurs significantly more frequently in a given corpus than in a reference corpus. The Keyword method is a tool that can be used to measure whether there are statistically significant differences in word frequencies between two corpora. However, keywords are highly sensitive to topics, participants, and other contextual factors of the conversations. Therefore, proper nouns such as character names, place names, and institution names were excluded from the analysis to ensure that the identified keywords are relevant to the general language use in the corpora.

The analysis of keywords revealed that FC significantly overuses various lexical items related to dating and weddings (e.g., married, wedding, dating), contractions (e.g., ’t, ’s, don), deixis (e.g., I, you, my), discourse markers (e.g., oh, y’know, well), phatic expressions (e.g., hey, guys, dude), honorifics (e.g., honey, sweetie, please), evaluation markers (e.g., great, good, weird), negation terms (e.g., no, not), prototypical speech act markers (e.g., sorry, thank, thanks), wh-words (e.g., what, how, why), attention raisers (e.g., look, listen, wait), psychological words (e.g., believe, want, feel), modal verbs (e.g., can, should, maybe), time markers (e.g., now, minute, Monday), sex-related terms (e.g., sex, breast), food terms (e.g., coffee, chip), and religious words (i.e., God), as presented in Table 4.

Table 4.

Words That Are Significantly Overused in FC.

Word type	Specific words
Dating and wedding	married, wedding, ring, marry, propose, honeymoon, date, engaged, dating
Contractions	’t, ‘s, don, didn, ‘ll, ‘m, ‘ve, gonna, gotta, I’m, couldn, doesn, won, wouldn, wasn, haven
Deixis	I, you, my, me, your, y, her, someone, this, here, we
Discourse markers	oh, okay/ok, whoa, y’know, yes, well
Phatic expressions	hey, hi, guys, baby, bye, hello, fine, dude
Honorifics	honey, please, sweetie, Mr., sweet
Evaluation markers	just, great, wrong, good, cute, weird
Negation	no, not
Speech act markers	sorry, thank, thanks
Wh-words	what, how, why
Attention raisers	look, listen, wait
Psychological state	believe, want, feel, promise
Modal verbs	can, should, maybe
Place names	Tulsa, London
Time markers	now, minute, Mon(day), night
Sex-related words	sex, breast
Food terms	coffee, chip
Religious words	God

The excessive use of dating, wedding, and sex-related words in FC can be attributed to the show’s central focus on the lives of young adults. Nonetheless, the remaining differences between sitcom and natural conversations suggest that sitcom conversations tend to be more concise, as indicated by their frequent use of contractions. Contractions are strongly associated with conversations (Biber et al., 1999, p. 1129), which means that sitcom conversations are a reasonable reflection of natural conversations in this regard. However, the overuse of contractions implies that sitcom conversations may over-represent certain features of natural conversations.

Moreover, the overuse of discourse markers and deixis suggests that sitcom conversations are more interactive than those in natural settings, since one of the two major roles of discourse markers is to indicate interactive relationships between speakers, hearers, and messages (Biber et al., 1999, p. 1086), and deictic expressions in conversations are mainly used to build and maintain social relations (Kretzenbacher et al., 2020).

Furthermore, sitcom conversations tend to focus more on participants’ involvement, as shown by their frequent use of phatic expressions, honorifics, and attention raisers. These linguistic devices overtly invite the addressees’ participation. Additionally, sitcom conversations tend to be more negative, as evidenced by their excessive use of negation markers, and more evaluative, as indicated by their heavy use of evaluation markers.

Keywords in SBC

The study found that relatively few words were significantly overused in SBC. Specifically, the results indicated that SBC primarily overused 2 contractions (i.e., cause, ’re), 7 third-party deictic words (e.g., these, their, he), 1 time marker (i.e., then), and 1 religious word (i.e., Jesus), as shown in Table 5.

Table 5.

Words That Are Significantly Overused in SBC.

Word type	Specific words
Contractions	cause, ’re
Deixis	these, their, he, they, that, there, it
Time markers	then
Religious words	Jesus

It is interesting to note the difference in the use of religious words in natural conversations and in sitcoms. While both Farr and Murphy (2009) and the present study indicate that religious words are common in natural conversations, FC and SBC show different preferences for the use of specific religious words. The overuse of Jesus in SBC suggests that it may be more commonly used in everyday conversations compared to God (with a frequency of 205 over 1), while FC seems to prefer the use of God over Jesus.

The difference in the use of cause between FC and SBC also highlights a potential difference in the way interpersonal relations are handled in natural conversations versus in sitcoms. The frequent use of cause in SBC suggests a greater emphasis on providing reasons or explanations for speech acts, which can help to avoid misunderstandings and maintain positive social relations.

In general, compared with the keywords for FC in Table 4, it can be inferred that SBC pays more attention to explanations (e.g., cause) and to people who are not present in the current turn of talk or places not adjacent to the speakers (e.g., their, he, there). Another interesting finding is the overuse of Jesus in SBC, because under similar settings FC prefers God in conversations.

Discourse Markers

Discourse markers are words or expressions that facilitate ongoing interactions and are loosely attached to clauses (Biber et al., 1999, p. 140). They have been a central issue in pragmatics for decades and are typically associated with spoken language. This study analyzed the occurrence of discourse markers listed in Table 1 and found that there are significantly more discourse markers in FC than in SBC (χ² = 59.75, p = .00), as illustrated in Table 6. This suggests that sitcom conversations use more discourse markers than natural conversations.

Table 6.

Comparisons of Discourse Markers Between Sitcom and Natural Conversations.

Discourse marker	FC RF	FC SF	SBC RF	SBC SF	χ²	p
Ah	56	72.54	25	27.40	16.80	.00
All right	166	215.03	3	3.29	184.93	.00
By the way	17	22.02	4	4.38	5.08	.02
I mean	193	250.00	238	260.87	−0.15	.70
I see	3	3.89	10	10.96	−1.87	.17
I think	134	173.58	195	213.74	−3.26	.07
(Right) now	238	308.29	159	174.28	31.37	.00
Oh	871	1,128.25	500	548.05	173.64	.00
Ok/okay	750	971.52	395	432.96	212.71	.00
Right	75	97.15	300	328.83	−99.99	.00
Well	483	625.66	496	543.66	4.72	.03
You know	137	177.46	745	816.59	−326.68	.00
You know what	31	40.16	7	7.67	18.15	.00
You see	6	7.77	12	13.15	−0.69	.41
Yeah	548	709.85	880	964.56	−31.97	.00
Yep	12	15.54	19	20.83	−0.38	.54
Total	3,720	4,818.72	3,688	4,042.40	59.75	.00

Table 6 reveals that FC employs significantly more discourse markers such as ah, all right, by the way, (right) now, oh, ok/okay, well, and you know what, while significantly underusing the markers right, you know, and yeah. These results suggest that sitcoms tend to overuse discourse markers in general, and that more individual markers are overused in sitcoms than in natural conversations. Moreover, natural conversations tend to be more affirmative, overusing markers such as right and yeah, which mainly indicate positive evaluation of previous utterances. This finding is consistent with Table 4, where negative evaluations are more prevalent in sitcom conversations. Therefore, the present study suggests that natural conversations use fewer negations or negative evaluations than those in sitcoms.

Modal Verbs

The results from the corpus (Table 7) suggest that FC significantly overuses modal verbs in general (χ² = 9.30, p = .00) and the modals can, could, and should in particular. Additionally, FC significantly underuses the modal would in comparison with SBC. The findings demonstrate that sitcom and natural conversations differ on only a small proportion of modal verbs (4 out of 12), though the overall frequency differs significantly. It is noteworthy that the modals overused by FC are primarily of low (i.e., can, could) and median (i.e., should) values (Halliday & Matthiessen, 2004, p. 116), and there is no discernible difference in the use of high-valued modals (e.g., must, have to) between FC and SBC. Both FC and SBC employ high-valued modals at a low frequency, which may result in less compelling or offensive conversations. The significant differences thus lie in FC’s overuse of can, could, and should and its underuse of would. According to Biber et al.’s (1999, pp. 491–496) classifications, can and could are markers of permission, possibility, or ability, while should is a marker of obligation and necessity; conversely, would is a marker of volition or prediction. Thus, it could be posited that natural conversations, as exemplified by SBC, are more tentative or less compelling than those in sitcoms.

Table 7.

Comparisons of Modal Verbs Between Sitcom and Natural Conversations.

Modal verb	FC RF	FC SF	SBC RF	SBC SF	χ²	p
Can	395	511.66	278	304.71	44.48	.00
May	26	33.68	41	44.94	−1.07	.30
Could	143	185.24	124	135.92	6.12	.01
Might	32	41.45	38	41.65	0.01	.92
Will	97	125.65	114	124.95	0.00	.98
Would	161	208.55	291	318.96	−18.64	.00
Should	106	137.31	61	66.86	20.24	.00
Shall	2	2.59	5	5.48	−0.29	.59
Must	21	27.20	30	32.88	−0.28	.60
Ought to	0	0.00	4	4.38	−1.79	.18
Need	0	0.00	0	0.00	0.00	1.00
Have/has/had to	172	222.80	218	238.95	−0.40	.52
Total	1,155	1,496.13	1,204	1,319.70	9.30	.00

Intensifiers

Intensifiers are a subset of words, typically adverbs, that modify gradable adjectives, adverbs, and verbs to augment the degree or intensity (Biber et al., 1999, pp. 554–555; Richard & Schmidt, 2010, p. 184). Notably, the use of intensifiers varies across different genres of English; for instance, intensifiers are rarely employed in academic prose (Biber et al., 1999, p. 564), and different varieties of English exhibit a preference for different types of intensifiers (Biber et al., 1999, p. 564).

The results of our study show that although the frequency of intensifiers in FC is significantly higher in total than in SBC (χ² = 17.75, p = .00), this difference is primarily attributed to the significant overuse of so (χ² = 42.52, p = .00) and significant underuse of very (χ² = −5.05, p = .02) (Table 8). Notably, only 2 out of 18 intensifiers exhibited statistically significant differences between the corpora of FC and SBC. It is therefore inferred that while the choice of specific intensifiers may not differ drastically between the two corpora, sitcom conversations tend to incorporate more intensifiers, which could be attributed to their aim to create dramatic effects in the plot.

Table 8.

Comparisons of Intensifiers Between Sitcom and Natural Conversations.

Intensifier	FC RF	FC SF	SBC RF	SBC SF	χ²	p
Absolutely	10	12.95	7	7.67	0.69	.41
Bloody	2	2.59	0	0.00	0.69	.41
Completely	3	3.89	6	6.58	−0.17	.68
Damn	5	6.48	3	3.29	0.35	.55
Entirely	1	1.30	1	1.10	0.35	.55
Extremely	2	2.59	1	1.10	0.02	.88
Fully	2	2.59	2	2.19	0.11	.74
Highly	0	0.00	0	0.00	0.00	1.00
Incredibly	4	5.18	3	3.29	0.05	.82
Really	310	401.56	318	348.56	3.02	.08
So	214	277.21	122	133.72	42.52	.00
Surely	1	1.30	1	1.10	0.35	.55
Terribly	0	0.00	1	1.10	−0.00	.93
Too	34	44.04	43	47.13	−0.03	.86
Totally	24	31.09	15	16.44	3.27	.07
Very	62	80.31	106	116.19	−5.05	.02
Certainly	2	2.59	6	6.58	−0.69	.41
Obviously	5	6.48	3	3.29	0.35	.55
Total	681	882.14	638	699.31	17.75	.00

Regarding the significant differences in intensifiers between FC and SBC, it is noteworthy that the frequency distributions of the intensifiers so and very in SBC bear a closer resemblance to the findings in Biber et al. (1999, p. 565), which report that the frequencies of these two intensifiers are not substantially different. This observation underscores the fact that the lexical patterns of conversations in sitcoms such as FC diverge, to varying degrees, from those in natural settings.

Downtoners

Downtoners are a lexical category of words that indicate a reduction in the degree or intensity of a particular aspect of meaning (Richard & Schmidt, 2010, p. 184). These words, such as fairly, almost, somewhat, and partially, serve to mitigate the impact of the modified item (Biber et al., 1999, pp. 555–556). The data presented in Table 9 reveal that the sitcom corpus (FC) exhibits a statistically significant overuse of the downtoner just (χ² = 31.34, p = .00), while significantly underusing other downtoners, namely less, probably, kind of, and sort of. These findings suggest that there are notable differences between the use of downtoners in sitcom conversations and natural conversations, with a fair proportion (5 out of 18) of the specific downtoners analyzed differing significantly. However, it should be noted that the overall frequency difference between the two corpora did not reach statistical significance.

Table 9.

Comparisons of Downtoners Between Sitcom and Natural Conversations.

Downtoner	FC RF	FC SF	SBC RF	SBC SF	χ²	p
Almost	9	11.66	11	12.06	0.02	.88
Fairly	0	0.00	5	5.48	−2.59	.11
Hardly	1	1.30	6	6.58	−1.68	.19
Less	1	1.30	34	37.27	−24.34	.00
Likely	0	0.00	1	1.10	−0.01	.93
Nearly	0	0.00	1	1.10	−0.01	.93
Partially	0	0.00	0	0.00	0.00	1.00
Pretty	34	44.04	58	63.57	−2.58	.11
Probably	30	38.86	70	76.73	−9.48	.00
Quite	11	14.25	13	14.25	−0.04	.84
Rather	0	0.00	6	6.58	−3.40	.07
Relatively	0	0.00	2	2.19	−0.35	.55
Slightly	1	1.30	1	1.10	0.35	.55
Somewhat	0	0.00	5	5.48	−2.59	.11
Kind of	22	28.50	82	89.88	−24.55	.00
Sort of	2	2.59	25	27.40	−14.55	.00
Just	676	875.66	583	639.02	31.34	.00
Perhaps	2	2.59	3	3.29	−0.03	.85
Total	789	1,022.03	906	993.06	0.32	.57
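
The signed χ² values reported in the tables of this section can be approximated with a short sketch. It rests on assumptions inferred from the tables rather than on a documented description of the calculator used: that each value is a 2 × 2 chi-square with Yates' continuity correction, that it is negated when FC underuses an item relative to SBC, and that SF is a rate per 100,000 words. The corpus sizes `FC_SIZE` and `SBC_SIZE` are hypothetical figures back-calculated from the RF-to-SF ratios in the tables and are not stated in this section; the function names are likewise introduced here for illustration.

```python
# Hypothetical corpus sizes (in words), back-calculated from the RF/SF ratios
# in the tables under the assumption that SF is a rate per 100,000 words.
FC_SIZE = 77_199
SBC_SIZE = 91_233


def standardized_frequency(raw_freq, corpus_size, base=100_000):
    """Convert a raw frequency (RF) into a rate per `base` words (SF)."""
    return raw_freq / corpus_size * base


def signed_chi_square(freq_a, size_a, freq_b, size_b):
    """Yates-corrected 2x2 chi-square, negative when corpus A underuses the item.

    The correction is applied in its unclamped textbook form,
    N * (|ad - bc| - N/2)^2 / (row and column totals multiplied together).
    """
    a, b = freq_a, freq_b
    c, d = size_a - freq_a, size_b - freq_b  # all remaining tokens
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:  # item absent from both corpora
        return 0.0
    chi2 = n * (abs(a * d - b * c) - n / 2) ** 2 / denominator
    underused = freq_a / size_a < freq_b / size_b
    return -chi2 if underused else chi2


if __name__ == "__main__":
    # Rows from Table 9: "kind of" (22 vs. 82) and "less" (1 vs. 34).
    print(round(signed_chi_square(22, FC_SIZE, 82, SBC_SIZE), 2))
    print(round(signed_chi_square(1, FC_SIZE, 34, SBC_SIZE), 2))
    print(round(standardized_frequency(22, FC_SIZE), 2))
```

Under these assumptions the sketch comes out close to the values reported for kind of and less in Table 9, although signs for near-zero differences may not always match the table.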

When considered in conjunction with the findings regarding the usage of intensifiers as discussed in the preceding section, it can be inferred that, on the whole, the conversations depicted in sitcoms (FC) have a tendency to amplify the degree of emphasis being conveyed (e.g., you’ve done really amazing stuff.), while those observed in natural settings (SBC) tend to mitigate or reduce the degree of emphasis (e.g., I did something kind of crazy tonight.).

Personal Pronouns

Personal pronouns are a key aspect of deixis and are central to pragmatics research. As shown in Table 10, the sitcom corpus (FC) employs significantly more personal pronouns in total than the natural conversation corpus (SBC) (χ² = 384.84, p = .00), with a notable overuse of I, we, me, you, her, my, mine, and your, and a significant underuse of he, it, they, them, his, their, and its in comparison to SBC. These results suggest that sitcom conversations favor personal pronouns that refer directly to the participants in the immediate conversation (e.g., I, we, you), while making fewer references to third parties (e.g., he, they, them) who are not physically present during the exchange.

Table 10.

Comparisons of Personal Pronouns Between Sitcom and Natural Conversations.

Personal Pronoun		FC RF	FC SF	SBC RF	SBC SF	χ²	p
First-person pronouns	I	3,730	4,831.67	2,825	3,096.47	336.15	.00
	We	672	870.48	674	738.77	8.98	.00
	Me	626	810.89	342	374.86	138.37	.00
	Us	100	129.54	137	150.17	−1.12	.29
	My	632	818.66	267	292.66	222.72	.00
	Our	97	125.65	134	146.88	−1.23	.27
	Mine	21	27.20	4	4.38	13.17	.00
	Ours	1	1.30	1	1.10	0.35	.55
Second-person pronouns	You	3,271	4,237.10	2,450	2,685.43	306.36	.00
	Your	387	501.30	239	261.97	64.04	.00
	Yours	6	7.77	7	7.67	0.07	.80
Third-person pronouns	He	321	415.81	754	826.46	−110.55	.00
	She	312	404.15	391	428.57	−0.54	.46
	It	1,305	1,690.44	1,672	1,832.67	−4.79	.03
	They	191	247.41	832	911.95	−304.86	.00
	Him	178	230.57	214	234.56	−0.01	.91
	Her	298	386.02	218	238.95	29.13	.00
	Them	98	126.94	164	179.76	−7.17	.01
	His	94	121.76	187	204.97	−16.89	.00
	Her	298	386.02	218	238.95	29.13	.00
	Their	26	33.68	133	145.78	−54.54	.00
	Hers	3	3.89	0	0.00	1.70	.19
	Its	8	10.36	24	26.31	−4.79	.03
	Theirs	0	0.00	2	2.19	−0.35	.55
Total		12,675	16,418.61	11,889	13,031.47	384.84	.00

Regarding the frequency distribution of pronouns, Table 10 demonstrates that both FC and SBC exhibit a preference for first- and second-person pronouns, as the frequencies of pronouns in these categories are generally higher than those of third-person pronouns. This finding suggests that both sitcom and natural conversations tend to prioritize reference to individuals who are “in immediate contact” (Biber et al., 1999, p. 333) in the ongoing exchange, as opposed to those who are not present during the conversation.

Nonetheless, it is worth noting that there are observable differences between FC and SBC with regard to pronoun usage. Specifically, FC displays a tendency to overuse first and second person pronouns in comparison to SBC, while also underutilizing third person pronouns. This finding highlights the importance of making reference to third parties during natural conversations, which appears to be a less prominent feature in sitcom discourse.
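
The contrast between participant-oriented and third-party pronouns can be made concrete by summing the standardized frequencies in Table 10 by grammatical person. The sketch below is only illustrative arithmetic on the reported SF values; the dictionary layout, the function name, and the labels distinguishing the two her rows are introduced here for convenience and are not part of the original analysis.

```python
# SF values copied from Table 10 as (FC, SBC) pairs.
# The two "her" rows carry identical figures in the table; the
# "(object)" / "(possessive)" labels are added here only to keep keys unique.
TABLE_10_SF = {
    "first": {
        "I": (4831.67, 3096.47), "we": (870.48, 738.77),
        "me": (810.89, 374.86), "us": (129.54, 150.17),
        "my": (818.66, 292.66), "our": (125.65, 146.88),
        "mine": (27.20, 4.38), "ours": (1.30, 1.10),
    },
    "second": {
        "you": (4237.10, 2685.43), "your": (501.30, 261.97),
        "yours": (7.77, 7.67),
    },
    "third": {
        "he": (415.81, 826.46), "she": (404.15, 428.57),
        "it": (1690.44, 1832.67), "they": (247.41, 911.95),
        "him": (230.57, 234.56), "her (object)": (386.02, 238.95),
        "them": (126.94, 179.76), "his": (121.76, 204.97),
        "her (possessive)": (386.02, 238.95), "their": (33.68, 145.78),
        "hers": (3.89, 0.00), "its": (10.36, 26.31),
        "theirs": (0.00, 2.19),
    },
}


def totals_by_person(table):
    """Sum FC and SBC standardized frequencies within each person category."""
    return {
        person: (
            round(sum(fc for fc, _ in rows.values()), 2),
            round(sum(sbc for _, sbc in rows.values()), 2),
        )
        for person, rows in table.items()
    }


if __name__ == "__main__":
    for person, (fc, sbc) in totals_by_person(TABLE_10_SF).items():
        print(f"{person}-person: FC {fc:.2f} vs. SBC {sbc:.2f}")
```

On these figures, FC exceeds SBC for first-person forms (roughly 7,615 vs. 4,805) and second-person forms (roughly 4,746 vs. 2,955), and falls below it for third-person forms (roughly 4,057 vs. 5,271), in line with the pattern described above.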

Vocatives

Vocatives are a class of words used to draw the addressee’s attention and/or maintain and reinforce social relationships in conversation (Biber et al., 1999, p. 1112). Examples include boys, buddy, and sweetie. Analysis of the data (Table 11) reveals that FC significantly overuses vocatives in total compared to SBC (χ² = 106.98, p = .00). Further examination shows that FC contains significantly more instances of vocatives such as baby, dude, (you) guys, honey, man/men, and sweetie, while exhibiting fewer occurrences of dear than SBC. These findings suggest that sitcom conversations tend to employ vocatives more frequently to attract the attention of the individuals being addressed.

Table 11.

Comparisons of Vocatives Between Sitcom and Natural Conversations.

Vocative	FC RF	FC SF	SBC RF	SBC SF	χ²	p
Baby	10	12.95	1	1.10	7.28	.01
Babies	2	2.59	0	0.00	0.69	.41
Boy	8	10.36	16	17.54	−1.05	.31
Boys	2	2.59	0	0.00	0.69	.41
Bro(s)	0	0.00	1(0)	1.10	−0.01	.93
Brother(s)	0	0.00	0	0.00	0.00	1.00
Bud	0	0.00	1	1.10	−0.01	.93
Buddy	7	9.07	3	3.29	1.48	.22
Buddies	0	0.00	0	0.00	0.00	1.00
Darling	2	2.59	0	0.00	0.69	.41
Dear	0	0.00	12	13.15	−8.39	.00
Dude	19	24.61	1	1.10	17.55	.00
Dudes	0	0.00	0	0.00	0.00	1.00
Folks	1	1.30	2	2.19	0.02	.88
(You) guys	39	50.52	3	3.29	35.55	.00
Honey	72	93.27	8	8.77	61.12	.00
Lad	0	0.00	0	0.00	0.00	1.00
Man/men	27/0	34.97	4/0	4.38	19.63	.00
Mate/mates	0	0.00	0	0.00	0.00	1.00
Pal	1	1.30	0	0.00	0.01	.93
Pals	0	0.00	0	0.00	0.00	1.00
Sweetie	23	29.79	0	0.00	25.05	.00
Total	215	278.50	65	71.25	106.98	.00

It is worth considering that this difference in vocative usage may also be influenced by the specific registers employed in sitcom discourse. For example, Friends, the sitcom analyzed in this study, predominantly focuses on the lives of young individuals living in close proximity to one another, and as such, the characters in the show often address one another using terms such as dude, guys, and sweetie in order to capture their attention or elicit a response. Moreover, vocatives can serve to emphasize the interpersonal relationships between conversationalists, as demonstrated in the following example: “Well, you got here just in time. I really have to go, buddy.” This function of vocatives is especially relevant in the context of sitcoms, where the plots are typically more intense and character-driven.

Expletives

Expletives, which are considered taboo expressions or swearwords not typically employed or encouraged in everyday conversations (Biber et al., 1999, pp. 1094–1095), are also examined in this study. The results presented in Table 12 suggest that there is no significant difference in the overall frequency of expletive usage between FC and SBC (χ² = −3.03, p = .08). At the level of individual items, the exceptions are shit and hell: FC employs hell significantly more frequently than SBC (χ² = 7.45, p = .01), whereas it uses shit significantly less frequently (χ² = −22.13, p = .00).

Table 12.

Comparisons of Expletives Between Sitcom and Natural Conversations.

Expletive	FC RF	FC SF	SBC RF	SBC SF	χ²	p
Shit	0	0.00	31	30.90	−22.13	.00
Fuck	0	0.00	2	2.19	−0.35	.55
Fucking	0	0.00	5	5.48	−2.59	.11
Hell	28	36.27	13	14.25	7.45	.01
Damn	5	6.48	4	4.38	0.06	.80
Bloody	2	2.59	0	0.00	0.69	.41
Bullshit	0	0.00	5	5.48	−2.59	.11
Asshole	0	0.00	1	1.10	−0.01	.93
Bastard	0	0.00	0	0.00	0.00	1.00
Total	35	45.34	61	66.86	−3.03	.08

These results suggest that the use of expletives is generally avoided in both sitcom and natural conversations, as reflected by their comparatively low overall frequency in relation to other linguistic features such as discourse markers, intensifiers, downtoners, and vocatives in similar conversational settings. However, the data also reveal that expletives are more diverse and, although the overall difference does not reach significance, more frequent in natural conversations than in sitcoms. This suggests that natural conversations, as illustrated by SBC, tend to exhibit a higher degree of “vulgarity” than sitcoms, as these words have the potential to offend to varying degrees (Biber et al., 1999, pp. 1094–1095), although, in rare or highly contextualized instances, they may also function as a marker of solidarity or politeness (Daly et al., 2004).

Religious Words

Religious words are frequently employed in conversations to convey intense emotions, particularly in response to highly negative experiences (Biber et al., 1999, p. 1094). Table 13 demonstrates that FC employs a greater number of religious words in total than SBC, a difference at the margin of significance (χ² = 3.72, p = .05), and that this is driven by the phrase (my) God (χ² = 59.36, p = .00). However, FC underuses certain religious words such as Christ, heaven(s), Lord, and Jesus in comparison to SBC. This finding is partially consistent with Farr and Murphy (2009), who found that native English speakers commonly use God, Jesus, and Christ in natural conversations.

Table 13.

Comparisons of Religious Words Between Sitcom and Natural Conversations.

Religious word	FC RF	FC SF	SBC RF	SBC SF	χ²	p
Christ	0	0.00	25	27.40	−19.35	.00
(My) God	205	265.55	96	105.22	59.36	.00
Goddess	0	0.00	0	0.00	0.00	1.00
Goodness	1	1.30	2	2.19	−0.02	.88
(My) gosh	7	9.07	5	5.48	0.34	.56
Heaven(s)	1	1.30	12	13.15	−6.16	.01
Lord	2	2.59	25	27.40	−14.55	.00
Jesus	1	1.30	47	51.52	−35.28	.00
Total	217	281.09	212	232.37	3.72	.05

The disparity in the usage of religious words between sitcom conversations and natural conversations is evident, with the former exhibiting a heavier reliance on the word God and a lesser usage of other religious words such as Christ, heaven(s), Lord, and Jesus than the latter. This finding suggests that sitcoms tend to restrict the range of religious words used in conversations, possibly due to linguistic and cultural constraints. In contrast, the more diverse usage of religious words in natural conversations implies that they serve as a means to convey intense negative emotions. Hence, it can be inferred that the difference in the employment of religious words between sitcoms and natural conversations could be attributed to the restrictions and norms surrounding language use in sitcoms.

Conclusion

The present study set out to investigate the lexical differences between conversations in sitcoms and natural settings. The findings suggest that sitcoms tend to employ shorter words, particularly those with three letters, although the average word length in the two settings remains similar, at around 3.7 letters per word. This preference for shorter words in sitcom conversations contrasts with the comparatively longer words that are used in natural conversations.

An analysis of keywords further reveals that sitcom conversations, as exemplified by FC, exhibit more economy, interaction, involvement, and evaluation, whereas natural conversations tend to provide more explanations and refer to third parties. Additionally, the study identifies specific types of words that are overused in each setting, with discourse markers, modal verbs, intensifiers, personal pronouns, and vocatives being more prevalent in sitcom conversations, and most downtoners, expletives, and religious words other than God being more frequently employed in natural conversations.

The findings of this study suggest that significant lexical differences exist between conversations in sitcoms and natural settings. Thus, it is recommended that language researchers exercise great caution when using sitcom conversations as data sources, given their limited ability to fully depict the lexical features of natural conversations. Although sitcom conversations offer a more convenient and efficient means of collecting and analyzing data, language teachers and teaching material developers should be aware of these differences and adjust their use of sitcom conversations accordingly to reflect natural conversations more closely. However, it may not be feasible to adopt natural conversations directly for teaching purposes owing to imperfections such as false starts, overlaps, and repairs; hence, modifications are necessary even when natural conversations are used for educational purposes. It should be noted that this study examines only lexical differences between sitcom and natural conversations, and further research is needed to investigate other linguistic features (e.g., sentence types, syntactic features, speech acts) or to present a more thorough picture of a given phenomenon (e.g., expletives, the discourse marker well, tag questions).

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Social Science Foundation (grant no. 19BYY225), Humanities and Social Science Fund of MOE (grant no. 22JJD740011), and Beijing Foreign Studies University (grant no. 2020SYLZDXM011; ZGWYJYJJ11Z002).

Ethics Statement

Not applicable.

ORCID iD

Min Li

Data Availability Statement

Data supporting the findings of this study are available from the corresponding author upon reasonable request.

References

Abrams, Z. I. (2014). Using film to provide a context for teaching L2 pragmatics. System, 46, 55–64. https://doi.org/10.1016/j.system.2014.06.005

Alerwi, A. A., & Alzahrani (2020). Using sitcoms to improve the acquisition of speech acts by EFL students: Focusing on request, refusal, apology, and compliment response. Journal of Applied Linguistics and Language Research, 7(1), 63–79.

Allan (1985). Teaching English with video. Longman.

Baker, Hardie, & McEnery (2006). A glossary of corpus linguistics. Edinburgh University Press.

Barón & Celaya, M. L. (2022). ‘May I do something for you?’: The effects of audio-visual material (captioned and non-captioned) on EFL pragmatic learning. Language Teaching Research, 26(2), 238–255. https://doi.org/10.1177/13621688211067000

Biber, Johansson, Leech, Conrad, & Finegan (1999). Longman grammar of spoken and written English. Pearson Education Limited.

Bisson, M. J., van Heuven, W. J., Conklin, & Tunney, R. J. (2013). Incidental acquisition of foreign language vocabulary through brief multi-modal exposure. PLoS One, 8(4), e60912–e60917. https://doi.org/10.1371/journal.pone.0060912

Candlin, Charles, & Willis (1982). Video in English language teaching: An inquiry into the potential uses of video recordings in the teaching of English as a foreign language. University of Aston, Language Studies Unit.

Chen (2012). Exploring corpus linguistics: Language in action. Routledge.

Clancy (2011). Complementary perspectives on hedging behavior in family discourse: The analytical synergy of variational pragmatics and corpus linguistics. International Journal of Corpus Linguistics, 16(3), 371–390. https://doi.org/10.1075/ijcl.16.3.05cla

Csomay & Petrović (2012). ‘Yes, your honor!’: A corpus-based study of technical vocabulary in discipline-related movies and TV shows. System, 40(2), 305–315. https://doi.org/10.1016/j.system.2012.05.004

Daly, Holmes, Newton, & Stubbe (2004). Expletives as solidarity signals in FTAs on the factory floor. Journal of Pragmatics, 36(5), 945–964. https://doi.org/10.1016/j.pragma.2003.12.004

Derakhshan & Eslami, Z. R. (2020). The effect of metapragmatic awareness, interactive translation, and discussion through video-enhanced input on EFL learners’ comprehension of implicature. Applied Research on English Language, 9(1), 25–52. https://doi.org/10.22108/are.2019.118062.1476

Dose (2013). Flipping the script: A corpus of American Television Series (CATS) for corpus-based language learning and teaching. Studies in Variation, Contacts and Change in English. https://varieng.helsinki.fi/series/volumes/13/dose/

Du Bois, Chafe, Meyer, Thompson, Englebretson, & Martey (2000–2005). Santa Barbara Corpus of Spoken American English (Parts 1–4). Linguistic Data Consortium.

Dynel (2011). Stranger than fiction? A few methodological notes on linguistic research in film discourse. Brno Studies in English, 37(1), 41–61. https://doi.org/10.5817/bse2011-1-3

Eisenstein, Shuller, & Bodman (1987). Learning English with an invisible teacher: An experimental video approach. System, 15(2), 209–216. https://doi.org/10.1016/0346-251x(87)90069-8

Emmison (1993). On the analyzability of conversational fabrication: A conceptual inquiry and single case example. Australian Review of Applied Linguistics, 16(1), 83–108. https://doi.org/10.1075/aral.16.1.06emm

Fan (2011). A corpus based quantitative study on the change of TTR, word length and sentence length of the English language. In Grzybek & Köhler (Eds.), Exact methods in the study of language and text: Dedicated to Gabriel Altmann on the occasion of his 75th birthday (pp. 123–130). Mouton de Gruyter.

Farr & Murphy (2009). Religious references in contemporary Irish English: ‘For the love of God almighty. . . . I’m a holy terror for turf’. Intercultural Pragmatics, 6(4), 535–555. https://doi.org/10.1515/iprg.2009.027

Forcadell (2016). New prosodic patterns in Catalan: Information status and (de)accentability. Journal of Pragmatics, 97, 1–20. https://doi.org/10.1016/j.pragma.2016.03.007

Fuentes-Rodríguez, Placencia, M. E., & Palma-Fahey (2016). Regional pragmatic variation in the use of the discourse marker pues in informal talk among university students in Quito (Ecuador), Santiago (Chile) and Seville (Spain). Journal of Pragmatics, 97, 74–92. https://doi.org/10.1016/j.pragma.2016.03.006

Gilmore (2004). A comparison of textbook and authentic interactions. ELT Journal, 58(4), 363–374. https://doi.org/10.1093/elt/58.4.363

Grant & Starks (2001). Screening appropriate teaching materials: Closings from textbooks and television soap operas. International Review of Applied Linguistics in Language Teaching, 39(1), 39–50. https://doi.org/10.1515/iral.39.1.39

Halliday, M. A. K., & Matthiessen (Eds.). (2004). An introduction to functional grammar (3rd ed.). Arnold.

Huth & Taleghani-Nikazm (2006). How can insights from conversation analysis be directly applied to teaching L2 pragmatics? Language Teaching Research, 10(1), 53–79. https://doi.org/10.1191/1362168806lr184oa

Ionescu (2020). Topic shifters in Romanian: A contrastive analysis. Journal of Pragmatics, 156, 110–120. https://doi.org/10.1016/j.pragma.2019.02.003

Kim (2014). How Korean EFL learners understand sarcasm in L2 English. Journal of Pragmatics, 60, 193–206. https://doi.org/10.1016/j.pragma.2013.08.016

Kozloff (2000). Overhearing film dialogue. University of California Press.

Kretzenbacher, H. L., Hajek, Norrby, & Schüpbach (2020). Social deixis at international conferences: Austrian German speakers’ introduction and address behaviour in German and English. Journal of Pragmatics, 169, 100–119. https://doi.org/10.1016/j.pragma.2020.08.007

Liang (2010). Loglikelihood and chi-square calculator 1.0. Beihang University.

Liang (2010). Using corpora: A practical coursebook. Foreign Language Teaching and Research Press.

Lin, C. Y. (2017). “I see absolutely nothing wrong with that in fact I think …”: Functions of modifiers in shaping dynamic relationships in dissertation defenses. Journal of English for Academic Purposes, 28, 14–24. https://doi.org/10.1016/j.jeap.2017.05.001

Lin, P. M. S. (2014). Investigating the validity of internet television as a resource for acquiring L2 formulaic sequences. System, 42, 164–176. https://doi.org/10.1016/j.system.2013.11.010

MacFadden, Barrett, & Horst (2009). What’s in a television word list? A corpus-informed investigation. Concordia Working Papers in Applied Linguistics, 2, 78–98.

Marshall & Werndly (2002). The language of television. Routledge.

Martínez, I. M. P. (2018). “Help me move to that, blood”. A corpus-based study of the syntax and pragmatics of vocatives in the language of British teenagers. Journal of Pragmatics, 130, 33–50. https://doi.org/10.1016/j.pragma.2018.04.001

Meinhoe (1998). Language learning in the age of satellite television. Oxford University Press.

Neuman, S. B., & Koskinen (1992). Captioned television as comprehensible input: Effects of incidental word learning from context for language minority students. Reading Research Quarterly, 27(1), 94–109. https://doi.org/10.2307/747835

Omar, F. R., & Razı (2022). Impact of instruction based on movie and TV series clips on EFL learners’ pragmatic competence: Speech acts in focus. Frontiers in Psychology, 13, 1–14. https://doi.org/10.3389/fpsyg.2022.974757

O’Keeffe & Walsh (2012). Applying corpus linguistics and conversation analysis in the investigation of small group teaching in higher education. Corpus Linguistics and Linguistic Theory, 8(1), 159–181. https://doi.org/10.1515/cllt-2012-0007

Pattemore & Muñoz (2020). Learning L2 constructions from captioned audio-visual exposure: The effect of learner-related factors. System, 93, 1–13. https://doi.org/10.1016/j.system.2020.102303

Polat (2011). Investigating acquisition of discourse markers through a developmental learner corpus. Journal of Pragmatics, 43(15), 3745–3756. https://doi.org/10.1016/j.pragma.2011.09.009

Quaglio (2009). Television dialogue: The sitcom Friends vs. natural conversation. John Benjamins.

Richard, J. C., & Schmidt (Eds.). (2010). Longman dictionary of language teaching and applied linguistics (4th ed.). Pearson Education Limited.

Richardson (2010). Television dramatic dialogue: A sociolinguistic study. Oxford University Press.

Rodgers & Webb (2011). Narrow viewing: The vocabulary in related television programs. TESOL Quarterly, 45(4), 689–717. https://doi.org/10.5054/tq.2011.268062

Rose, K. R. (2001). Compliments and compliment responses in film: Implications for pragmatics research and language teaching. International Review of Applied Linguistics in Language Teaching, 39(4), 309–326. https://doi.org/10.1515/iral.2001.007

Rossi (2011). Discourse analysis of film dialogues: Italian comedy between linguistic realism and pragmatic non-realism. In Piazza, Bednarek, & Rossi (Eds.), Telecinematic discourse: Approaches to the language of films and television series (pp. 21–46). John Benjamins.

Ruck (2022). Elementary-level learners’ engagement with multimodal resources in two audio-visual genres. Language Learning Journal, 50(3), 328–343. https://doi.org/10.1080/09571736.2020.1752291

Schegloff (1988). Goffman and the analysis of conversation. In Drew & Wotton (Eds.), Erving Goffman: Exploring the interaction order (pp. 89–135). Northeastern University Press.

Shi & Lei (2021). Lexical use and social class: A study on lexical richness, word length, and word class in spoken English. Lingua, 262, 1–14. https://doi.org/10.1016/j.lingua.2021.103155

Sinkeviciute & Rodriguez (2021). “So… introductions”: Conversational openings in getting acquainted interactions. Journal of Pragmatics, 179, 44–53. https://doi.org/10.1016/j.pragma.2021.04.024

Skalicky, Berger, C. M., & Bell, N. D. (2015). The functions of “just kidding” in American English. Journal of Pragmatics, 85, 18–31. https://doi.org/10.1016/j.pragma.2015.05.024

Stempleski & Tomalin (1990). Video in action: Recipes for using video in language learning. Prentice Hall.

Su (2017). Local grammars of speech acts: An exploratory study. Journal of Pragmatics, 111, 72–83. https://doi.org/10.1016/j.pragma.2017.02.008

Taljard (2014). In search of larger units of meaning: A foray into Northern Sotho data. Language Matters, 45, 91–109. https://doi.org/10.1080/10228195.2013.790916

Tolson (2001). Television talk shows: Discourse, performance, spectacle. Lawrence Erlbaum Associates.

Traugot (2020). Expressions of stance-to-text: Discourse management markers as stance markers. Language Sciences, 82, 1–14. https://doi.org/10.1016/j.langsci.2020.101329

Upton, T. A., & Cohen, M. A. (2009). An approach to corpus-based discourse analysis: The move analysis as example. Discourse Studies, 11(5), 585–605. https://doi.org/10.1177/1461445609341006

Verspoor, M. H., De Bot, & van Rein (2011). English as a foreign language: The role of out-of-school language input. In de Houwer & Wilton (Eds.), English in Europe Today: Sociocultural and Educational Perspectives (pp. 147–166). John Benjamins.

Webb (2010). A corpus driven study of the potential for vocabulary learning through watching movies. International Journal of Corpus Linguistics, 15(4), 497–519. https://doi.org/10.1075/ijcl.15.4.03web

Webb (2011). Selecting television programs for language learning: Investigating television programs from the same genre. International Journal of English Studies, 11(1), 117–136. https://doi.org/10.6018/ijes/2011/1/137131

Wong (2002). ‘Applying’ conversation analysis in applied linguistics: Evaluating dialogue in English as a second language textbooks. International Review of Applied Linguistics in Language Teaching, 40(1), 37–60. https://doi.org/10.1515/iral.2002.003

Yeung (2009). Use and misuse of ‘besides’: A corpus study comparing native speakers’ and learners’ English. System, 37(2), 330–342. https://doi.org/10.1016/j.system.2008.11.007