Sage Journals: Discover world-class research

Abstract

Epistemic stance is adopted as an approach to convey speakers’ certainty about knowledge and beliefs. This research aims to make a comprehensive comparison between professional live streamers and Chinese college students in terms of epistemic stance expression in their live streaming. Focusing on three distinct product categories (fashion, food, and home and electronics), the research collected texts from Amazon live streamers’ online broadcasts and Chinese college students’ in-class live streaming training. The investigation employed epistemic stance features derived from Hyland’s model, utilizing 76 stance features categorized as boosters or hedges for analysis. The results revealed that both professional live streamers and college students used boosters significantly more than hedges when expressing epistemic certainty, along with a distinct co-occurrence pattern of self-referential expressions (e.g., I, we) combined with boosters. However, there were statistically significant differences between the two groups in the use of epistemic stance expressions. Professional streamers demonstrated higher frequencies of epistemic devices overall, which was observed in three parts of speech (adverbs, modals, and verbs) and across three product categories. These findings have theoretical significance for advancing the study of stance expression and pedagogical implications for business English in live streaming training contexts.

Plain Language Summary

Expressing Certainty in Live Streaming: How Professional Streamers and College Students Differ in Their Expressions

This study compares how professional live streamers and Chinese college students express certainty (epistemic stance) in live streams. It looked at three product categories: fashion, food, and home & electronics. Data came from Amazon streamers’ broadcasts and students’ in-class live stream training. Using Hyland’s model, researchers analyzed 76 features of certainty expression, split into “boosters” (which strengthen certainty) and “hedges” (which weaken it). Both groups used boosters much more than hedges. They also often combined self-referential phrases with boosters. But there were big differences: professionals used more certainty-related language overall. This was true for adverbs, modals, and verbs, and across all three product categories. The findings help advance research on how people express certainty and offer tips for teaching business English in live stream training.

Keywords

epistemic stance, boosters, hedges, live streaming, in-class live streaming training

Introduction

With the continuous development of cross-border e-commerce (CBEC), multiple platforms, such as AliExpress, Amazon, and Shopee, have launched cross-border live streaming functions. At the same time, short video platforms and social media (e.g., TikTok) are constantly expanding their markets in fields such as fashion, food, electronics, and home furnishings through live streaming. Live streaming has emerged as a major contributor to the growth of CBEC. The issue of live streaming training has received considerable attention in the field of higher education in China (Chen, 2022; Feng & Tan, 2025; Jiang, 2021; Liu, 2018), although there have been few attempts to investigate how college students learn these skills from the perspective of linguistics. There is an urgent need to train qualified live streamers, both in applied skill-based classes and for the CBEC industry. The development of CBEC live streaming requires talents with general English proficiency. At present, whether in China or other countries, Multi-Channel Network (MCN) institutions and foreign trade or manufacturing companies that also engage in live streaming all face a significant shortage of CBEC talents who can speak foreign languages, select products, edit product information, and appear on camera (Chen, 2022).

Some prior research has dealt with emotional vocabulary, but rarely with stance expression. For three popular categories (fashion, food, and home and electronics), previous scholars (Lin, 2017; Wang & Pan, 2022) found some differences in the expression of objectivity and emotionality. For non-standardized products, live streaming relies on “trust-based” communication to build consumer trust. For example, streamers act as “intermediaries” to recommend products, emphasizing “personal trial use” and “quality assurance” instead of relying on parameter lists. This approach contrasts significantly with the communication techniques used for standardized products (such as 3C products). When introducing fashion products (such as clothing and cosmetics), streamers frequently employ adverbs of degree, such as “super beautiful,” “very good,” and “absolutely beautiful.” By using exaggerated and emotional descriptions, they reinforce subjective aesthetic guidance and compensate for the uncertainty of product attributes. For instance, they use abstract terms like “high-end” and “elegant” in place of objective parameters.

This study focuses on how Amazon streamers conduct live streaming in a professional way and from the perspective of epistemic stance, which holds pedagogical implications for college students to improve their capability of engaging in the CBEC industry.

Literature Review

Epistemic Stance

Epistemic stance categorized by Biber et al. (1999) constitutes one of the three functional domains of stance, specifically denoting the degree of certainty associated with knowledge claims. Authors leverage a range of stance devices to encode epistemic stance, including certainty adverbs (e.g., actually, absolutely), verb/noun/adjective constructions with to/that-clauses (e.g., conclude, confirm), likelihood-related verb/noun/adjective phrases (e.g., presume, probable), and likelihood adverbs (e.g., presumably, probably). These linguistic markers serve a dual purpose: communicating the author’s commitment to propositional validity and establishing rhetorical credibility (Biber et al., 1999).

Numerous scholars (Alghazo et al., 2021; Chen & Jiang, 2025; Chen & Zhou, 2025; Hyland, 2005a, 2005b; Millar et al., 2023; Mohammed & Tom, 2018) have employed stance markers to measure speakers’ or writers’ attitudes and epistemic stance. This study adopts the definitional framework proposed by Hyland (2005a), who conceptualized epistemic stance as comprising two primary categories: hedges and boosters. Boosters, such as “sure,” “always,” “really,” and “think,” function to convey unwavering conviction in assertions, while simultaneously signaling the speaker’s engagement with the topic and alignment with the audience. These devices serve to emphasize shared epistemological ground, reinforce in-group identity, and foster interactive engagement (Hyland, 1999, 2005b). In contrast, hedges like “possible,” “maybe,” and “feel” act as linguistic mitigators, allowing writers to refrain from absolute commitment to a statement and present information as subjective opinion rather than objective fact (Hyland, 2005b).

In the context of social commerce, epistemic devices embedded in Amazon Live comments have been shown to exert significant influence on consumer trust and shopping intention, providing both emotional and informational support that shapes decision-making processes (Chen & Shen, 2015). Additionally, epistemic markers play a pivotal role in constructing the ethos of corporate leaders: CEOs often balance boosters and hedges to project an image of authoritative expertise coupled with sincerity (Hyland, 2005a).

Cross-cultural research has revealed notable disparities in epistemic stance usage. For instance, Crismore et al. (1993) found that Finnish students employed significantly more meta-discourse and hedges in their writing compared to their U.S. counterparts, a difference attributed to cultural communicative norms. Similarly, Hyland and Milton (1997) documented that L1 writers used approximately twice as many hedges as Hong Kong L2 students, with the latter relying more heavily on boosters in their essays. In a separate line of inquiry, Millar et al. (2023) analyzed NIH-funded project abstracts (1985–2020) and observed a shift from cautious, tentative language to more confident, optimistic, and promissory tones, marked by declines in weak modality markers and increases in certainty expressions. These shifts may be linked to evolving research ecosystem dynamics, including the heightened emphasis on grant proposal “salesmanship” and structural-cultural changes. Alghazo et al. (2021) reported that Arabic research-article abstracts contained more boosters and fewer hedges than English ones, reflecting a preference for assertive expressions of certainty cultivated by collective traditions. Cultural proximity may also influence speakers’ encoding of certainty. Wei and Wang (2025) found that Chinese college students used significantly fewer epistemic stance markers in speech than native speakers in the context of traditional medicine, owing to the cultural salience of the topic.

Translation studies have highlighted translators’ strategic preservation of authorial epistemic stance. Huang et al. (2025) noted that a significant proportion of stance expressions are directly transferred in translated research abstracts, particularly in clauses led by certainty stance verbs, thus maintaining the integrity of the original author’s evidentiality. This practice reflects a delicate balance between epistemic fidelity to the source text and the need to ensure accessibility and engagement for the target audience, as evidenced by a moderately significant association between epistemic stance and rhetorical moves in translated abstracts.

Recent scholarship by Stein (2025) has illuminated the role of epistemic stance in academic identity construction, demonstrating that master’s and PhD students leverage stance-taking to position themselves as “potential members” of the scientific community.

The choice to adopt Hyland’s (2005a) epistemic model in this study is grounded in three key considerations: (i) its widespread acceptance and adoption in stance research over the past decades (Hyland & Jiang, 2022); (ii) its explicit focus on the interactional functions of stance markers, which aligns with the study’s central concern with interactivity in live streaming contexts; and (iii) the model’s focus on stance as a social-contextual practice, which supports comparative analysis. Hyland (2005a) argues that stance reflects speaker identity and communicative purpose, which is critical for exploring why professional streamers and college students may differ in certainty use, beyond just describing differences. In essence, professionals stream for commercial goals, whereas college students complete assignments for educational objectives. As Hou et al. (2019) have shown, interactivity in live streaming environments plays a critical role in shaping viewers’ purchase intentions. This theoretical orientation gives rise to a key research implication: the establishment of live streaming learning corpora centered on epistemic stance analysis, diverging from the semantic-focused approaches typically employed in large language models.

In-Class Live Streaming

Studies of live streaming in class have traditionally focused on the role of live streaming as an advanced approach of interaction and communication between teachers and students (Abdous & Yoshimura, 2010; Abdous & He, 2011; Cacault et al., 2021; Foertsch et al., 2002; Pan & Shen, 2022; Ubaedillah et al., 2021), emphasizing its capacity to enable real-time engagement through features like instant feedback and collaborative tasks. This body of research often treats live streaming as a tool to replicate face-to-face classroom dynamics in virtual settings. Yet, it tends to overlook its role as a specialized skill in CBEC training, a dimension addressed by Feng and Tan (2025), who highlight its distinct requirements beyond traditional educational interaction.

Some scholars (Dewanta, 2020; Taubah, 2020) have explored the functions of social media platforms such as TikTok in foreign language learning, focusing on how short-form live streaming content facilitates informal language acquisition. Their work underscores the potential of interactive features like comments and real-time polls to motivate learner engagement, yet these studies remain primarily in the realm of general language learning and have not explicitly connected live streaming to professional CBEC contexts where English serves as a lingua franca for business transactions.

In terms of talent development and cross-cultural training, Feng and Tan (2025) identify critical limitations in current educational models. They argue that inadequate language proficiency in business-specific contexts and insufficient cross-cultural competence hinder students’ practical operations in CBEC live streaming. The authors note that curricula often overemphasize language skills like translation techniques and literary theories at the expense of useful modules, such as multi-platform operations, big data analysis, and cultural adaptability training. As a result, students struggle to comprehend English business documents, engage in customer service communication, and design cross-cultural marketing scripts in real-world CBEC scenarios.

While there are a few studies on live streaming applied in English teaching, very little is known about how college students use English as an instrument in live streaming training. Most existing research focuses on pedagogical design or technological features rather than the nuanced linguistic strategies students employ to communicate in CBEC-related contexts. This gap is further compounded by the lack of studies comparing how Chinese college students and professional live streamers express epistemic stance—the linguistic encoding of knowledge certainty—in live streaming settings, and how these two groups differ in their communicative approaches. To address this gap, the present study investigates how professional streamers use epistemic devices to convey certainty about knowledge and to provide insights to enhance college students’ business English skills in CBEC live streaming.

Two research questions were explored in this study:

(i) How do Chinese college students differ from professional live streamers in their use of epistemic devices of different parts of speech (POS)?

(ii) How do Chinese college students differ from professional live streamers in their expression of epistemic stance in terms of different product categories?

Methodology

Theoretical Framework

Given the strong compatibility of Hyland’s model with this study, a comparative analysis was employed to gain insights into the gaps in the use of epistemic stance features between professional live streamers and college students. First, owing to the model’s wide acceptance in stance studies (Hyland & Jiang, 2022), its framework has been validated in various genres. The analysis of epistemic stance markers in live streaming can connect with previous work (e.g., Alghazo et al., 2021; Chen & Jiang, 2025; Millar et al., 2023) that successfully employed stance markers (including the core categories of hedges and boosters) to observe speakers’/writers’ epistemic stance, affect, and even relationship with listeners/readers. Second, the model’s interactional perspective, of which epistemic stance is one branch, suits the live-stream context: boosters signal substantial certainty and maintain viewers’ engagement, and hedges lessen the extent of absolute commitment. The two categories are vital to live product presentations. Hyland’s (2005a, 2005b) earlier studies demonstrate how such markers balance authority and credibility to construct discursive identity, providing an efficient framework for analyzing how streamers use language to draw and maintain their viewers’ interest. Third, because it treats stance as a social-contextual practice, this framework can support the comparative analysis of certainty expression by different groups. Professional live streamers and college students play distinct social roles. Hyland’s argument contributes to understanding why each group uses these markers differently, revealing the deeper motives behind epistemic stance choices in live streaming.

Furthermore, this study combined corpus-based linguistic analysis with non-parametric independent samples testing, a design chosen to suit the characteristics of live streaming discourse and the need to compare professional streamers with college students.

First, corpus-based analysis was chosen because research on epistemic stance needs authentic and rich linguistic data, which the natural language of live streaming can supply. Corpus-based approaches allow researchers to systematically investigate a large number of real streaming transcripts and extract sets of epistemic stance markers (e.g., never, of course) (Hyland, 2005a). In other domains, such as media, education, and sociology, this approach can likewise serve as a tool for both qualitative and quantitative analysis of natural language use.

Second, non-parametric independent samples testing (the Mann–Whitney U test in this article) was applied to compare certainty expression through stance markers between the two live streaming groups, primarily in view of the statistical characteristics of the data. Initial testing showed that the frequency per thousand tokens of epistemic stance markers did not follow a normal distribution (Shapiro–Wilk test: W = .88, p < .05). This is common in corpus linguistics, because situational factors (e.g., differences in cultural contexts and communities) often drive variation in language use (Alghazo et al., 2021; Crismore et al., 1993; Fabíola, 2025; Wei & Wang, 2025). Parametric tests (e.g., t-tests) assume normally distributed data to ensure reliable results, so they were not applied in this paper. Non-parametric tests such as the Mann–Whitney U test, by contrast, are robust to non-normal data and remain suitable even for sample sizes below 30 (Fagerland & Sandvik, 2009; Montgomery, 2017; Zimmerman, 1998); each group in this study contains over 90 texts.

This methodological combination is neither arbitrary nor uncommon: previous studies (Hunston, 2022; Siegel & Castellan, 1988) that addressed similar between-group comparisons have successfully combined corpus analysis with non-parametric testing, confirming its validity for exploring natural language differences. The following data sources (section “Data”) and analysis steps (section “Analysis”) will elaborate on this framework in detail.

The core formula for calculating the Mann–Whitney U test is:

$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$$

$$U_2 = n_1 n_2 - U_1$$

Where:

$n_1$: Sample size of professional streamers ($n_1 = 98$);

$n_2$: Sample size of student streamers ($n_2 = 92$);

$R_1$: Total rank sum of professional streamers’ epistemic stance marker frequency;

$U_1$/$U_2$: U statistics for each group (the smaller U value is used for hypothesis testing).
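
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure run in SPSS for this study: the function names are hypothetical, ties are assumed to receive the average of the ranks they span, and the input numbers are invented for demonstration only.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U1, U2, R1) using the rank-sum formulas given in the text."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])  # rank sum of the first group
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return u1, u2, r1


# Hypothetical per-text frequencies, NOT values from the study.
professional = [31.2, 25.4, 40.1, 28.8]
student = [12.5, 25.4, 9.7]
u1, u2, r1 = mann_whitney_u(professional, student)
print(u1, u2, min(u1, u2))
```

With the definition used here, a small U₁ means the first group holds most of the high ranks; the smaller of the two U values is the one compared against the critical value.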

Data

A random collection of texts was gathered from Amazon.com and from in-class assignments separately. The item popularity on Amazon was also considered as a balancing factor, weighted according to user ratings (30% rated 4–5 stars, 40% 3 stars, and 30% 0–2 stars) to guarantee representativeness and avoid bias toward a single product. The research data sourced from the classrooms of a vocational-oriented university was highly representative, as there are numerous universities and colleges of this type in China. The students who participated in the live streaming training were all juniors with intermediate to high-level English application skills, enabling them to complete script writing independently. It is important to note that the texts in this database were collected from 2022 to the first half of 2024, when domestic AI tools such as DeepSeek, Doubao, and Kimi had not yet been released or widely promoted in China. In addition, the author confirmed this matter with the students and, based on their language expression habits and some grammatical errors in the scripts, essentially eliminated the possibility that the students used AI tools to complete the assignment. Therefore, the influence of AI technology on the students’ expression of epistemic stance in live streaming was excluded, and the texts were independently completed by the students.

All the live streaming videos were transcribed into written text through Tongyi Tingwu¹ and then reviewed, with an accuracy rate of over 98%. The scripts of the professional online videos constituted the Amazon Live Streaming Corpus (ALSC) with 98 texts, and those of the in-class assignments made up the Student Live Streaming Corpus (SLSC) with 92 texts. The length of each script in ALSC varied from 200 to 700 tokens, while that in SLSC was consistently maintained at around 400 tokens due to a 3-min limit for these in-class assignments, each completed by a team of no more than three students.

As Table 1 shows, the categories in each corpus were classified based on the standard classification rules on Amazon, including food, fashion, and home and electronics (the third type combined two popular categories due to the limited quantity of scripts from SLSC). Considering the structure of product categories, the composition of the two corpora is listed below.

Table 1

The Composition of ALSC and SLSC.

	ALSC		SLSC
Category	Text	Token	Text	Token
Fashion	35	11,729	31	12,545
Food	30	12,623	31	13,264
Home and electronics	33	12,831	30	11,189
Total	98	37,183	92	36,998

The total number of epistemic stance features was 76, including 29 boosters and 47 hedges listed in the Appendix; their frequencies were calculated with LancsBox (5.0.3). For the research purpose, the scope of hedges and boosters was defined manually by the author based on Hyland’s (2005a) model. Therefore, the final frequencies per thousand tokens include author-adjusted estimates (Source: Author Estimates).

Analysis

Because the data in this study were not normally distributed, the Mann–Whitney U test, a typical and effective non-parametric independent samples test, was employed to conduct the contrastive analysis. The mean rank obtained from the test was used to compare the two independent samples, together with the frequency per thousand tokens. The mean rank is computed by ranking all combined observations and averaging the ranks within each group. This metric helps determine whether one group tends to have higher values than the other, indicating potential differences in the frequency per thousand tokens between the two corpora. It is crucial for assessing statistical significance when the assumptions for parametric tests are not met, offering a robust method to compare central tendencies. This article used SPSS 29 to run the non-parametric independent samples tests for the two groups. The significance level for the test (α = .01) was set by the author according to standard practice in the field (the default is .05). As a result, any claim that a difference is “significant” reflects this author-adjusted criterion (Source: Author Estimates). The data analysis consisted of four steps.

First, for each of the three product categories, we calculated the frequency per thousand tokens of boosters and hedges as epistemic stance features with the corpus tool LancsBox (5.0.3), based on five types of POS (i.e., adjectives, adverbs, modals, phrases, and verbs). Classifying by POS allows a more fine-grained categorization of stance features and supports systematic statistical comparison between the two groups investigated in this study.
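
The normalization in this first step can be illustrated with a short Python sketch. It does not reproduce LancsBox: the tokenizer is a naive regular expression, the two marker sets are small hypothetical samples rather than the study's 76-item list in the Appendix, and multi-word markers such as "of course" are not handled.

```python
import re

# Hypothetical mini-lists for illustration only; the full list is in the Appendix.
BOOSTERS = {"definitely", "really", "always", "sure"}
HEDGES = {"maybe", "possible", "probably"}


def per_thousand(text, markers):
    """Frequency of single-word markers per 1,000 tokens of text."""
    tokens = re.findall(r"[a-z']+", text.lower())  # naive tokenizer (assumption)
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in markers)
    return hits / len(tokens) * 1000


# Sentence taken from Example 2 in the Discussion.
sample = "It turns my normal TV into a smart TV, so that's really awesome."
print(round(per_thousand(sample, BOOSTERS), 2))
print(per_thousand(sample, HEDGES))
```

Normalizing by text length in this way is what makes the 200 to 700 token ALSC scripts comparable with the roughly 400-token SLSC scripts.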

Second, a general descriptive analysis for each corpus in terms of the uses of epistemic stance features was made by observing the frequency per thousand tokens and mean ranks via non-parametric independent sample tests.

Third, non-parametric independent samples tests compared the specific differences in the expression of epistemic stance between the two corpora from the perspective of POS.

Fourth, along the dimension of product categories, non-parametric independent samples tests were conducted between the two corpora for boosters and hedges separately.

Results

All figures and tables in this section were generated by the authors using LancsBox (5.0.3)/SPSS 29 (Source: Author Estimates).

Overall Frequency of the Epistemic Stance Features

Table 2 shows significant differences between the two corpora in the total use of epistemic stance features, in boosters, and in hedges (p < .001 in each case). The frequency per 1,000 tokens of epistemic stance features in ALSC is more than twice that in SLSC (Figure 1).

Table 2.

The General Comparison Between the Two Corpora.

Feature (n)	ALSC		SLSC		p
Feature (n)	% frequency	Mean rank	% frequency	Mean rank	p
Boosters (29)	1,631.81	127.29	586.49	61.64	<.001
Hedges (47)	725.66	108.33	406.21	81.84	<.001
Total: Epistemic stance features (76)	2,357.47	124.62	992.70	64.48	<.001
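
As a rough consistency check on the last row of Table 2, the rank sums can be recovered from the reported mean ranks. The sketch assumes that each text is one observation, so that the group sizes are 98 (ALSC) and 92 (SLSC) and N = 190. The U values below are back-of-envelope figures derived from rounded mean ranks; they are not statistics reported by the study.

```latex
\begin{align*}
R_1 &\approx 124.62 \times 98 = 12212.76, \qquad
R_2 \approx 64.48 \times 92 = 5932.16 \\
R_1 + R_2 &\approx 18144.92 \approx \frac{N(N+1)}{2} = \frac{190 \times 191}{2} = 18145 \\
U_1 &= 98 \times 92 + \frac{98 \times 99}{2} - R_1 \approx 9016 + 4851 - 12212.76 = 1654.24 \\
U_2 &= 98 \times 92 - U_1 \approx 9016 - 1654.24 = 7361.76
\end{align*}
```

The two rank sums add up to the expected total within rounding error, which suggests the reported mean ranks are internally consistent under this assumption.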

Figure 1.

Non-parametric independent sample tests for epistemic stance feature between ALSC and SLSC.

In terms of frequency per 1,000 tokens, boosters in ALSC are more than twice as frequent as hedges, whereas in SLSC the gap between the two types of stance markers is much narrower (Figure 2).

Figure 2.

Non-parametric independent sample tests for boosters and hedges separately between ALSC and SLSC.

Comparison in Terms of POS

Table 3 presents a further analysis of the expression of epistemic stance by the two groups in terms of POS. There are statistically significant differences between the two corpora in adverbs, modals, and verbs. The frequencies of these stance markers are higher in ALSC than in SLSC. Although there is no statistically significant difference in the use of adjectives and phrases, the mean ranks of these two POS types are slightly higher in SLSC than in ALSC.

Table 3.

The Comparison in Terms of POS Categories of Epistemic Stance Features.

POS	ALSC		SLSC		p
POS	% frequency	Mean rank	% frequency	Mean rank	p
Adj	120.06	94.55	82.42	96.51	.759
Adv	1,160.63	125.69	301.10	63.34	<.001
Modal	357.81	104.58	194.03	85.83	.014
Phrase	56.17	94.37	39.12	96.70	.636
V	662.80	107.63	376.04	82.58	.002

For adjectives and phrases, ALSC thus exhibits a slightly lower mean rank despite a higher frequency per 1,000 tokens (Figure 3).

Figure 3.

Non-parametric independent sample tests across POS separately between ALSC and SLSC.

Comparison in Terms of Product Categories

Furthermore, the professional live streamers and college students expressed their epistemic stance in significantly different ways across the fashion, food, and home and electronics categories (Table 4). The frequency per thousand tokens of epistemic stance features in each product category was higher in ALSC than in SLSC. The hedges and boosters from the fashion category accounted for about 44% of the total stance markers in ALSC and about 39% in SLSC. Meanwhile, the food category yielded fewer stance markers than home and electronics in ALSC, but more in SLSC (Figure 4).

Table 4.

The Comparison in Terms of Product Categories.

Product category	ALSC		SLSC		p
Product category	% frequency	Mean rank	% frequency	Mean rank	p
Fashion	1,043.53	44.53	387.52	21.05	<.001
Food	610.44	37.27	331.35	24.94	.007
Home and electronics	703.50	43.09	273.84	19.08	<.001

Figure 4.

Non-parametric independent samples tests for epistemic devices used in fashion, food, and home and electronics live streaming.

Among the three different product categories, the order from largest to smallest is fashion, home and electronics, and food based on the frequency per 1,000 tokens of epistemic stance features in ALSC. But the order in SLSC is fashion, food, and home and electronics.

In each of the three product categories, Table 5 shows a significant difference in the use of boosters or hedges between professional streamers and college students, except for hedges in food live streaming (p = .988). All of the epistemic stance features are considerably more frequent in ALSC than in SLSC (Figure 5), in line with the results in Tables 2 to 4. Although boosters have a higher frequency per thousand tokens than hedges in every product category in both ALSC and SLSC, the two groups employed the two types of stance markers in different proportions. That is, boosters are more than twice as frequent as hedges in ALSC, but less than twice as frequent in SLSC.

Table 5.

The Comparison in Terms of Epistemic Stance Features in Each Product Category.

Category	ALSC		SLSC		p
Category	% frequency	Mean rank	% frequency	Mean rank	p
Fashion
Boosters	730.90	45.56	227.49	19.89	<.001
Hedges	312.63	38.94	160.03	27.35	.014
Food
Boosters	413.76	40.30	179.01	22.00	<.001
Hedges	196.67	31.03	152.34	30.97	.988
Home and electronics
Boosters	487.15	42.88	179.99	20.03	<.001
Hedges	216.35	39.21	93.85	24.07	<.001

Figure 5.

Non-parametric independent samples tests for boosters and hedges separately used in fashion, food, and home and electronics live streaming.

As Table 5 shows, the order of boosters from largest to smallest frequency per 1,000 tokens is fashion, home and electronics, and food in both corpora. For hedges, the same order holds in ALSC, but in SLSC the order is fashion, food, and home and electronics.
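
The booster-to-hedge proportions discussed for Table 5 can be checked with simple arithmetic. The sketch below copies the frequencies per 1,000 tokens from Table 5 and divides boosters by hedges in each cell; the dictionary layout and function name are illustrative choices, not part of the study's workflow.

```python
# Frequencies per 1,000 tokens copied from Table 5: (boosters, hedges).
TABLE_5 = {
    "ALSC": {
        "fashion": (730.90, 312.63),
        "food": (413.76, 196.67),
        "home and electronics": (487.15, 216.35),
    },
    "SLSC": {
        "fashion": (227.49, 160.03),
        "food": (179.01, 152.34),
        "home and electronics": (179.99, 93.85),
    },
}


def booster_hedge_ratios(table):
    """Return the booster/hedge ratio for every corpus and product category."""
    return {
        corpus: {cat: boosters / hedges for cat, (boosters, hedges) in cats.items()}
        for corpus, cats in table.items()
    }


ratios = booster_hedge_ratios(TABLE_5)
for corpus, cats in ratios.items():
    for cat, ratio in cats.items():
        print(f"{corpus:5s} {cat:22s} {ratio:.2f}")
```

Every ALSC ratio comes out above 2 and every SLSC ratio falls between 1 and 2, which matches the description of the two groups' different proportions.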

Discussion

Overall, the results above indicate that college students’ expression of epistemic stance differs significantly from that of professional streamers in live streaming, whether compared by POS or by product category. These findings are consistent with previous research conclusions that there are significant differences between English learners and native speakers when expressing stance (Crismore et al., 1993; Hyland & Milton, 1997). This section elaborates on the specific differences between the two groups, with real cases, and on their probable causes.

First, epistemic devices are crucial to negotiating knowledge claims with a potentially skeptical audience (Hyland, 2005a). They play a crucial role in convincing customers of the authenticity of the live streaming content. Statistically, boosters were used much more than hedges, which suggests a strong demand for reliable information about the quality of the products that streamers recommend or share, and a tendency to assert statements in a definitive tone. After all, product information from reliable channels, like Amazon Live, is considered more useful and offers crucial insights into the quality of the product (Chen & Shen, 2015).

Example 1: Purchasing a full size is a great value. As you can see, definitely something that works. (ALSC-home and electronics).

Example 2: It turns my normal TV into a smart TV, so that’s really awesome. (ALSC-home and electronics).

During live streaming, students tended to lack awareness of making commitment-based recommendations and merely provided routine introductions and promotions. The underlying reason may be that strong cultural proximity leads Chinese students to use epistemic stance markers less frequently than native English speakers when orally introducing familiar topics or items, reflecting a cross-cultural stance strategy in which emotional engagement and persuasion take priority over emphasis on knowledge and the credibility of information (Wei & Wang, 2025).

Moreover, not every member of the live streaming team had used the products in person, which may have unconsciously weakened the degree of commitment in their statements.

Meanwhile, by manually checking the concordance line displayed by LancsBox (5.0.3), there was a co-occurrence pattern of self-referential expression with boosters (e.g., “I know that…,”“I think,” and “I believe”). Similarly, professional streamers more frequently adopted this pattern than college students.

The stance markers are always explored in the context of serving an interactional function (Hyland, 2005a), which mainly aims at the establishment and maintenance of a relationship, like the reliable relationship between streamers and viewers. In the context of live streaming, the expectation from viewers or potential consumers drove speakers to use the “I + boosters” pattern to match their need for certainty or reliability in the proper functions and high quality of the promoted products. The use of self-referential expressions in conjunction with epistemic verbs of judgment implies an explicit acknowledgment of personal responsibility (Hyland, 2005a). Except for “I believe” that had a slightly higher %frequency in SLSC, the other two co-occurrence patterns in ALSC were both used more often.

However, college students didn’t encounter the real potential consumers in the whole live steaming training so there was a lack of driving power from that kind of expectation in a real live streaming video.

Example 3: This is not a giant machine. I know that we’ve all seen those coffee makers that like the size of a Buick on your car (ALSC-food).

Example 4: It comes uploaded with all of the apps that you would need for things like Hulu, Disney Plus, Netflix, even peacocks on here, which I thought was really convenient. (ALSC-home and electronics).

Example 5: I believe there is always one that can touch your heartstrings (SLSC-fashion).
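The "I + booster" co-occurrence pattern illustrated in Examples 3 to 5 can be approximated with a simple window search. The following is a minimal sketch only, not the LancsBox procedure used in the study: the function name `find_self_booster_pairs`, the two-token window, and the reduced marker set are all illustrative assumptions.

```python
import re

# Assumption: a small illustrative subset of the boosters listed in the Appendix.
BOOSTER_VERBS = {"know", "think", "thought", "believe"}
SELF_REFERENCES = {"i", "we"}


def find_self_booster_pairs(text, window=2):
    """Return (pronoun, booster) pairs where a booster follows I/we within `window` tokens."""
    # Normalize curly apostrophes so contractions such as "we've" stay one token.
    tokens = re.findall(r"[a-z']+", text.lower().replace("’", "'"))
    pairs = []
    for i, token in enumerate(tokens):
        if token in SELF_REFERENCES:
            # Look only at the next `window` tokens after the pronoun.
            for following in tokens[i + 1 : i + 1 + window]:
                if following in BOOSTER_VERBS:
                    pairs.append((token, following))
                    break
    return pairs


sample = (
    "I know that we've all seen those coffee makers. "
    "I believe there is always one that can touch your heartstrings."
)
print(find_self_booster_pairs(sample))
```

A window search of this kind over-generates (it ignores clause boundaries and negation), so hits would still need the manual concordance check described above.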

Second, the modal verb “must” stood out: its frequency per thousand tokens was only 8.91 in ALSC but 46.73 in SLSC. Chinese students tend to use “must” more often to imply a strong likelihood or expectation that a particular action will be taken or a situation will occur. “Must” seems least likely to cause grammatical errors (Long, 2016), and learners tend to have a slight over-reliance on it, frequently using personal pronouns followed by modal verbs (Biber et al., 1999; Liang, 2008).

Besides, the pattern “…must know that…” was sometimes found in SLSC. “Must” and “know” are emphatic devices that reinforce truth and value and make the speaker’s perspective prominent (Hyland, 2005a). The two boosters co-occurred here to double the emphatic tone. The students stressed that the information that followed was assumed to be already known to the viewers.
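For readers unfamiliar with normalized frequencies, the per-thousand-token figure is the raw count divided by corpus size and multiplied by 1,000. The sketch below is illustrative only: the tokenizer is a simplification, the transcript is a toy sentence rather than corpus data, and the count is purely form-based, whereas the study restricts “must” to its possibility sense.

```python
import re


def tokenize(text):
    """Lowercase word tokenizer; a simplification of what a concordancer does."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower().replace("’", "'"))


def per_thousand(marker, text):
    """Occurrences of `marker` per 1,000 tokens of `text` (form-based, no sense check)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return tokens.count(marker) / len(tokens) * 1000


# Toy transcript (hypothetical, not taken from SLSC or ALSC).
toy_transcript = "You guys must know that the festival is coming, so you must act today."
print(round(per_thousand("must", toy_transcript), 2))
```

Normalizing in this way is what allows corpora of different sizes, such as ALSC and SLSC, to be compared directly.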

Example 6: And you guys must know that the “mid-year” shopping festival on 1st, June is coming, right? (SLSC-home and electronics).

Example 7: I guess you must know that the price of our watch offline is expensive, but today, this product in my live streaming will bring you guys a particularly nice price and we have many free gifts (SLSC-home and electronics).

“Absolutely,” “always,” and “really” are amplifying adverbs that strengthen verbs and adverbs (Hyland, 2005a). The boosters “absolutely” and “really” in the examples below are employed by professional streamers to highlight the advantages of products, such as a beautiful color or good looks. “Always” in Example 9 implies that the speaker consumes the recommended brand very frequently. All of these devices serve to intensify the certainty of the information provided by the speakers.

Example 8: This one is a true wrap dress and I love the rough hand. Color is absolutely gorgeous (ALSC-fashion).

Example 9: I always have a box of Waka with me in the department here or in the car (ALSC-food).

Example 10: I think those roses are going to look really nice once they set on top of the chocolate (ALSC-home and electronics).

Third, Table 4 shows that, although the two groups differed significantly in their expression of epistemic stance in each of the three categories, fashion ranked first in the use of epistemic devices for both groups. Nearly half (44.26%) of the epistemic stance features in ALSC fell in the fashion category, in which professional streamers tend to exercise a greater degree of subjective aesthetic judgment and emotional description (Lin, 2017; Wang & Pan, 2022) than in food or home and electronics when recommending and sharing products. Apart from objective information such as size, color, and material, the depiction of application scenarios and expected performance is relatively subjective and abstract. Nevertheless, viewers frequently rely on streamers’ comments on product performance as significant references for their purchasing decisions. They place high trust (Zhang et al., 2022) in popular streamers with an established reputation, and even relinquish their own aesthetic judgment to varying degrees (Yan & Li, 2020). This phenomenon may be explained by the positive relationship between evoked emotions and individuals’ subjective judgments of persuasion effectiveness (Marian & Peter, 1994). It suggests that streamers’ evaluative narratives not only serve as decision-making cues but may also influence viewers’ emotional states, which in turn shape how valid they perceive the persuasive attempt to be. Fashion streamers therefore need to rely on markers of epistemic stance to convey their certainty about knowledge and beliefs (Millar et al., 2023). Similarly, the fashion category accounted for 39.04% of the epistemic devices in SLSC, the highest proportion among the three categories. The college students may consciously or unconsciously use boosters and hedges most frequently in the fashion category, but a gap remained compared with professional streamers.

The probable causes of the significant differences between the two groups across the categories are essentially the same as those discussed above and are not reiterated here. The sole exception is the finding that there is no significant difference in the use of hedges in food live streaming (Table 5).

Example 11: That’s my personal experience, obviously, but it is a painless, quick job to brush up my hair even when it’s wet right out of the shower (ALSC-fashion).

Example 12: Actually, the discount on lipstick today is very strong (SLSC-fashion).

Example 13: I want to share with you why you should take Waka on your next camping trip (ALSC-food).

Example 14: The corn is naturally grown so the size and shape seems vary (SLSC-food).

Finally, while the Mann–Whitney U tests consistently yielded p-values < .001, the gaps in frequency per 1,000 tokens also imply practical significance for education. For example (Table 2), professional streamers used boosters at a rate of 1,632%, whereas university streamers produced 586%, a mean difference of 1,046 per 1,000 tokens. In a 400-token live streaming session, this translates into ≈420 extra booster acts in the professionals’ discourse, or roughly one marker every second sentence. Such density is unlikely to escape viewers’ notice and could immediately affect the perceived interactivity and credibility of the streamer. Effect-size indices also support this interpretation. Across all product categories, the rank-biserial correlations for all annotation items with p < .001 in Tables 2 to 5 ranged from .86 to .97, exceeding Cohen’s convention for “large” effects, which indicates that the differences are robust. From a pedagogical perspective, these magnitudes show that the gaps are not only statistically significant but also large enough to be worth targeting in live streaming assignments rather than being dismissed as trivial.
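To make the effect-size index concrete, the rank-biserial correlation can be derived from the Mann–Whitney U statistic as r = 2U / (n₁n₂) − 1, where U counts the pairs in which a value from the first group exceeds a value from the second. The sketch below is an assumption-laden illustration: the paper does not state which formula or software produced its values, and the per-text frequencies used here are invented.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: each pair with a > b counts 1, each tie counts 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def rank_biserial(group_a, group_b):
    """Rank-biserial correlation r = 2U / (n_a * n_b) - 1, ranging from -1 to 1."""
    u = mann_whitney_u(group_a, group_b)
    return 2 * u / (len(group_a) * len(group_b)) - 1


# Hypothetical per-text booster frequencies (NOT the study's data).
professional = [16.1, 14.8, 17.3, 15.5]
student = [5.9, 6.4, 15.0, 5.2]
print(rank_biserial(professional, student))
```

A value near 1 means that almost every professional text outranks almost every student text, which is the sense in which coefficients of .86 to .97 point to a near-complete separation of the two groups.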

Conclusion

This study set out to investigate how professional live streamers show their certainty and confidence through epistemic devices, and how college students differ from them. The analysis of 76 epistemic stance devices supports the comparisons between the two groups. Overall, boosters were used much more than hedges by both groups. Professional streamers adopted epistemic devices, as well as the co-occurrence pattern of self-referential expressions with boosters, more frequently than college students, owing to cultural (Crismore et al., 1993) and identity differences. With respect to the two research questions in section “Literature Review,” there were statistically significant differences between the two groups’ expression of epistemic stance in each of the three parts of speech (adverbs, modals, and verbs) and in each of the three product categories.

This study has significant academic and practical implications for corpus linguistics and language pedagogy. The findings show that the differences between the two communities are not only statistically robust but also of practical importance. The large effect sizes and the substantial frequency discrepancies indicate that college students’ expression of certainty could improve considerably through training modeled on the way professional streamers present epistemic stance. The findings underscore the necessity of reforming classroom instruction by focusing on authentic language application and context-driven communication. Specifically, the research demonstrates that tailoring instructional content to specialized scenarios, such as CBEC live streaming, requires the targeted development of epistemic stance expression. A more appropriate use of boosters and hedges, informed by corpus-based analyses of professional discourse, enables English learners (e.g., majors in business English, international trade, marketing, and e-commerce) to meet advanced communication requirements in business contexts, such as introducing product details, managing audience interactions, and projecting professional authority. By integrating such language applications into curricula, language education can more effectively bridge the gap between academic learning and professional training. Thus, this study not only enriches the theoretical framework of corpus-based instructional design but also provides applicable pedagogical tools to enhance students’ pragmatic competence, prepare them to tackle the communicative complexities of specialized domains, and raise their skill level in CBEC business.

The research findings imply that the college students did not express certainty in a professional way in their live streaming. In particular, the students employed too few epistemic stance markers overall, especially boosters, in all product categories. The observed gap should encourage universities to develop application-oriented textbooks and training sessions for live streaming, in partnership with relevant enterprises serving as teaching-practice bases where students can learn from professional streamers. In classroom exercises and teaching strategies, epistemic stance markers, as linguistic features that establish authority and enhance persuasion, should be explicitly highlighted. For example, students can be encouraged to increase marker frequency in live-streaming assignments. One possible classroom interaction, live-stream comment practice (an author-designed task), could be applied as short-term (i.e., 2–3 week) training. Before class, the teacher provides a list of stance markers so that students can prepare for in-class live streaming. In class, a Tencent Meeting room is opened in which some students deliver their live-stream assignment while the rest act as viewers, raising real-time bullet-screen questions. When a student answers a question, he or she is required to use as many markers as possible; viewers attempt to write down every marker they hear. At the end, the student who has used the most markers and the viewer who has recorded the most markers both receive a reward.

Because the students completed the in-class live streaming assignment as teamwork, some of the transcripts in this study represent the joint output of male and female students. The analysis therefore excluded gender as a variable. To overcome this limitation, future research will be extended to individual live streaming. By observing single-streamer transcripts, the impact of gender on the use of epistemic devices may be isolated appropriately. Additionally, the study is limited by its cross-sectional design: the texts were generated by different students, and each student took only one live streaming assignment each semester. Consequently, a longitudinal study tracking the same group of students’ epistemic stance expressions over time could not yet be carried out. To address this limitation, a logistical plan will be developed in the future: adjust the frequency of live streaming arrangements within the curriculum framework, and establish a standardized tracking mechanism, including a fixed data collection template and assessment schedule.

Another potential problem is that the scope of this investigation does not cover other best-selling categories (e.g., pet supplies and outdoors). Future research should consider more dimensions for observing the expression of streamers’ stance in live streaming, for example, attitude markers and more product categories. More broadly, research is also needed on the multi-modal analysis of live streaming. The current study analyzed only linguistic texts, but a further study could assess the roles of body language, visual signals (e.g., the color and style of the background plate designed for different live-stream rooms), and cultural factors in real-time stance expression, as well as how the combination of verbal and non-verbal communication delivers epistemic stance, affect, and relation. As more AI technologies are applied in and after class (Li et al., 2025), it will be necessary to compare stance expression between professional streamers and AI tools, or to examine the changes after students adopt these tools to edit their live streaming training scripts. This “AI-clean” dataset will serve as the control condition for extending the present work further.

Footnotes

Appendix

The Epistemic Markers in this Paper.

Boosters (29)

Actually, always, believe, certain, clearly, definitely, absolutely, find, found, in fact, know, known, must (possibility), never, obvious, obviously, of course, prove, proved, realize, realized, really, show, sure, think, thinks, thought, true, truly

Hedges (47)

About, almost, appear, appeared, appears, approximately, around, assume, assumed, could, couldn’t, feel, feels, felt, generally, guess, indicate, indicated, indicates, in my opinion, likely, may, maybe, might, mostly, often, perhaps, possible, probably, quite, relatively, seems, should, sometimes, suggest, suggested, suggests, suppose, supposed, supposes, tend to, tended to, tends to, typically, usually, would, wouldn’t
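As an illustration of how the two lists above could drive an automatic first-pass tally, the sketch below counts whole-word matches for each marker, including multiword items such as “in fact” and “tend to.” It is a hedged example rather than the study's procedure: `count_markers` is a hypothetical helper, matching is purely form-based (so “about” as a preposition and non-epistemic “must” are also counted), and hits would still require manual checking against concordance lines.

```python
import re

# Marker lists copied from the Appendix; "must" is form-matched without a sense check.
BOOSTERS = [
    "actually", "always", "believe", "certain", "clearly", "definitely",
    "absolutely", "find", "found", "in fact", "know", "known", "must", "never",
    "obvious", "obviously", "of course", "prove", "proved", "realize",
    "realized", "really", "show", "sure", "think", "thinks", "thought", "true",
    "truly",
]
HEDGES = [
    "about", "almost", "appear", "appeared", "appears", "approximately",
    "around", "assume", "assumed", "could", "couldn't", "feel", "feels", "felt",
    "generally", "guess", "indicate", "indicated", "indicates", "in my opinion",
    "likely", "may", "maybe", "might", "mostly", "often", "perhaps", "possible",
    "probably", "quite", "relatively", "seems", "should", "sometimes",
    "suggest", "suggested", "suggests", "suppose", "supposed", "supposes",
    "tend to", "tended to", "tends to", "typically", "usually", "would",
    "wouldn't",
]


def count_markers(text, markers):
    """Count whole-word, case-insensitive matches for each marker that occurs in `text`."""
    normalized = text.lower().replace("’", "'")
    counts = {}
    for marker in markers:
        # Lookarounds stop "may" matching inside "maybe" or "could" inside "couldn't".
        pattern = r"(?<![a-z'])" + re.escape(marker) + r"(?![a-z'])"
        hits = len(re.findall(pattern, normalized))
        if hits:
            counts[marker] = hits
    return counts


# Sample built from Examples 12 and 14.
sample = (
    "Actually, the discount on lipstick today is very strong. "
    "The corn is naturally grown so the size and shape seems vary."
)
print(count_markers(sample, BOOSTERS))
print(count_markers(sample, HEDGES))
```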

ORCID iD

Shuang Wang

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

Data will be made available on request.

Notes

References

Abdous (2011). Using text mining to uncover students’ technology-related problems in live video streaming. British Journal of Educational Technology, 42(1), 40–49.

Abdous, Yoshimura (2010). Learner outcomes and satisfaction: A comparison of live video-streamed instruction, satellite broadcast instruction, and face-to-face instruction. Computers & Education, 55(3), 733–741.

Alghazo, Salem, M. N. A., Alrashdan (2021). Stance and engagement in English and Arabic research article abstracts. System, 103, Article 102681.

Biber, Johansson, Leech, Conrad, Finegan (1999). Longman grammar of spoken and written English. Longman.

Cacault, M. P., Laurent-Lucchetti, Hildebrand, Pellizzari (2021). Distance learning in higher education: Evidence from a randomized experiment. Journal of the European Economic Association, 19(4), 2322–2372.

Chen, Jiang (2025). Understanding Chinese MA students’ interpersonal stance of anticipatory “it” patterns: Using corpus results to guide questionnaire and discourse-based interview. SAGE Open, 15(1).

Chen, Shen, X. L. (2015). Consumers’ decisions in social commerce context: An empirical investigation. Decision Support Systems, 79(11), 55–64.

Chen, Zhou (2025). Review of methods in historical corpus pragmatics: Innovative methodologies and insights into epistemic stance. Corpus Pragmatics, 9(2), 219–224.

Chen (2022). Cultivating cross-border e-commerce live-streaming talents based on RCEP. China Business Review, (20), 160–165.

Crismore, Markkanen, Steffensen (1993). Metadiscourse in persuasive writing: A study of texts written by American and Finnish university students. Written Communication, 10(1), 39–71.

Dewanta (2020). The utilization of the TikTok social media application in Indonesian language learning. Journal of Indonesian Language Education and Learning, 9(2), 79–85.

Fagerland, M. W., Sandvik (2009). Performance of five two-sample location tests for skewed distributions with unequal variances. Contemporary Clinical Trials, 30(6), 512–520.

Feng, Tan (2025). Research on the talent development mechanism and pathway for business English professionals in the context of cross-border e-commerce boom: A comprehensive talent cultivation framework based on dynamic feedback and cultural adaptation. Creative Education Studies, 13(5), 104–110.

Foertsch, Moses, Strikwerda, Litzkow (2002). Reversing the lecture/homework paradigm using eTEACH web-based streaming video software. Journal of Engineering Education, 91(3), 267–272.

Hou, Guanm, Chong, A. Y. L. (2019). Factors influencing people’s continuous watching intention and consumption intention in live streaming: Evidence from China. Internet Research, 30(1), 141–163.

Huang, Jia (2025). Transvocal stance in academic translation: A rhetorical analysis of grammatical stance in translated applied linguistics English research article abstracts. Journal of English for Academic Purposes, 74, Article 101472. https://doi.org/10.1016/j.jeap.2025.101472

Hunston (2022). Corpora in applied linguistics (2nd ed.). Cambridge University Press.

Hyland (1999). Disciplinary discourses: Writer stance in research articles. In Candlin, Hyland (Eds.), Writing: Texts, processes and practices (pp. 99–121). Longman.

Hyland (2005a). Metadiscourse: Exploring interaction in writing. Continuum.

Hyland (2005b). Stance and engagement: A model of interaction in academic discourse. Discourse Studies, 7(2), 173–192.

Hyland, Jiang, F. K. (2022). Metadiscourse: The evolution of an approach to texts. Text & Talk, 44(3), 411–433.

Hyland, Milton (1997). Hedging in L1 and L2 student writing. Journal of Second Language Writing, 6(2), 183–206.

Jiang (2021). Reflections on the construction of business English majors in vocational colleges under the trend of live-streaming. China & Foreign, 28(1), 1299–1300.

Qiao, W. F. (2025). Can large language model tools promote the development of university students’ higher-order thinking skills? An empirical analysis based on a questionnaire survey of students from 12 Double First-Class Universities. Modern Educational Technology, 35(1), 34–43. https://doi.org/10.3969/j.issn.1009-8097.2025.01.004

Liang (2008). A study of modal sequences in Chinese college students’ written English. Foreign Language Teaching and Research, 40(1), 51–59.

Lin (2017). A study on communication strategy and communication effect of fashion live streaming [Master’s thesis]. Wuhan University.

Liu (2018). Practice and reflections on the construction of a blended cross-border e-commerce institute for business English majors in vocational colleges. Journal of Wuhan Institute of Technology, 20(1), Article 5.

Long, S. Y., H. B., Chen, T. Z., Wang (2016). The use of modal verbs in argumentative essays written by professional students. Foreign Languages Journal, 188(1), 124–131.

Marian, Peter (1994). The persuasion knowledge model: How people cope with persuasion attempts. Journal of Consumer Research, 21(1), 1–31.

Millar, Mathis, Batalo, Budgell (2023). Trends in the expression of epistemic stance in NIH research funding applications: 1985–2020. Applied Linguistics, 44(1), 1–18. https://doi.org/10.1093/applin/amad050

Mohammed, A. M., Tom (2018). Linguistic markers of moderate and absolute natural language. Personality and Individual Differences, 134, 119–124.

Montgomery, D. C. (2017). Design and analysis of experiments (9th ed.). John Wiley & Sons.

Pan, Shen (2022). Project-based teaching exploration based on e-commerce live broadcast events. Education Research, 5(3), 34–36.

Siegel, Castellan, N. J. (1988). Non-parametric statistics for the behavioral sciences (2nd ed.). McGraw-Hill.

Stein (2025). Recruiting help in everyday research work: Epistemic stance taking and accountability in interaction. Learning, Culture and Social Interaction, 51, Article 100890. https://doi.org/10.1016/j.lcsi.2025.100890

Taubah (2020). The TikTok application as a medium for learning the skill of spoken expression. Mu’allim Journal of Islamic Education, 2(1), 57–66.

Ubaedillah, Pratiwi, D. I., Huda, S. T., Kurniawan, D. A. (2021). An exploratory study of English teachers: The use of social media for teaching English on distance learning. Indonesian Journal of English Language Teaching and Applied Linguistics, 5(2), 361–372.

Wang, Y. B., Pan, D. T. (2022). Research on multimodal interaction in Taobao live streaming. Chinese Journal of Language Policy and Planning, 7(3), 34–46.

Wei, Wang (2025). Expressing stance: A cross-linguistic study of effective and epistemic stance marking in Chinese and English opinion reports. Journal of Pragmatics, 208, 106–120.

Yan, D. C. (2020). Scene, symbol, power: Visual landscape and value reflection of E-commerce live streaming. Modern Communication, 287(6), 124–129.

Zhang, Liu, Wang, Zhao (2022). How to retain customers: Understanding the role of trust in live streaming commerce with a socio-technical perspective. Computers in Human Behavior, 127, Article 107052. https://doi.org/10.1016/j.chb.2021.107052

Zimmerman, D. W. (1998). Invalidation of parametric and non-parametric statistical tests by concurrent violation of two assumptions. Journal of Experimental Education, 67(1), 55–68.

The Expression of Certainty in Live Streaming: A Comparative Analysis of Epistemic Stance Between Professional Streamers and College Students
