Abstract
As an emerging social technology, live streaming has facilitated a synchronous and interactive selling setting for e-commerce sellers and consumers. Despite growing operations management (OM) research on e-commerce live streaming (ELS), the prior literature has largely neglected the directive social operations of ELS broadcasters, which are crucial to a company's business operations and marketing strategies. To address this gap, we investigate the effects of social call-to-actions (SCTAs), an ELS-specific directive social operation, on consumer purchase outcomes in ELS. To explicate the mechanisms driving the purchase effect of SCTAs, we draw upon the notions of cognitive and affective marketing appeals to categorize SCTAs into two types: cognitive and affective SCTAs. We measure broadcasters’ SCTAs from ELS speech-to-text data using text-mining techniques and estimate econometric models with an instrumental variables identification approach to quantify the relationships between ELS broadcasters’ SCTAs and consumer purchases. Our results uncover a significant positive purchase effect of affective SCTAs but an insignificant effect of cognitive SCTAs on product-level purchase orders in ELS. Specifically, a one-unit increase in affective SCTA-related words per minute is associated with a 0.43% increase in a product's purchase orders, i.e., an average growth in demand of 1.10% per minute. Additionally, we find that the effects of cognitive and affective SCTAs are contingent on product and broadcaster types. Our session-level mediation effect analysis verifies the underlying mechanisms that drive the purchase effects of broadcasters’ SCTAs. Specifically, we find that cognitive SCTAs impede social engagements and thereby purchases, whereas affective SCTAs can boost social engagements, leading to increased session traffic and ultimately product sales in ELS. Our study provides novel empirical findings and important practical implications for ELS platforms and broadcasters.
Introduction
Live streaming has burgeoned in recent years in various contexts, from social media and e-sports to e-commerce. Many e-commerce platforms have provided live streaming services for brands, sellers, and influencers to promote and sell products. Notable examples include Taobao Live, launched by Alibaba in 2016; Amazon Live, introduced for U.S. vendors in 2019; and Shopee Live, created for the Southeast Asian market. The phenomenon of e-commerce live streaming (ELS), which combines elements of consumer engagement and seller retailing, has surged in China in the past few years. Specifically, ELS platforms in China such as Taobao Live had attracted more than 515 million customers by 2022, accounting for 60.9% of total online shopping customers (CNNIC, 2023). Furthermore, there is a rising demand for ELS broadcasters who serve as store/brand influencers and sales representatives. On Taobao, for instance, the number of ELS broadcasters soared by 661% in 2020. Apart from China, other countries have also witnessed a booming ELS market for customers and sellers. For example, U.S. live streaming commerce reached $11 billion in sales by 2021 (Robertson, 2022).
Recent research has started to recognize the significant economic impact of ELS on customer purchase intentions (Zhang et al., 2020) and product sales (Chen et al., 2020). The success of ELS in the increasingly competitive e-commerce market can be mainly attributed to its interactive personal selling process (Wongkitrungrueng et al., 2020). Unlike web pages using texts and pictures for product presentation, ELS allows broadcasters to engage or interact with the viewers via social technology features (e.g., live chats) and sell products in a vivid conversational style coupled with active interactions and detailed product demonstrations. With such interactions and demonstrations, broadcasters can instantaneously resolve consumers’ queries about products on sale. Such vividness and interactivity of ELS can thus largely mitigate consumer uncertainty and the impersonal nature of conventional e-commerce (Chen et al., 2020).
However, despite these opportunities, achieving good sales in ELS is still challenging for practitioners involved in ELS operations management (OM). Specifically, ELS's unique one-to-many and multitasking setting means that ELS broadcasters often need to engage with viewers synchronously in a series of cognitively demanding social and selling tasks, such as building rapport and affinity, requesting social engagements, describing products, offering promotions, and answering queries in real time (He et al., 2021). Thus, the execution of ELS operations, which involves both social and selling operations, is notably more complex than that in TV shopping contexts (Fritchie and Johnson, 2003), where sales hosts serve an untargeted mass of passive TV audiences. While ELS and TV shopping have attracted much research attention that examines the effects of broadcasters’ or hosts’ selling operations on sales performance (Fritchie and Johnson, 2003; Wu et al., 2023), research on how broadcasters’ social operations affect consumer purchases in ELS still lags. Below, we highlight three research gaps in the related literature that motivate this paper.
First, no prior research has studied the purchase effects of social operations in terms of social call-to-actions (SCTAs). Although prior TV shopping research has attempted to examine the roles of social operations via hosts’ relationship-building actions (i.e., non-directive or relational-oriented actions, such as welcoming new audiences, expressing gratitude, and direct address) (Stephens et al., 1996), the literature has paid little attention to directive social operations, i.e., SCTAs, which are not quite relevant or practicable in TV shopping. Specifically, we define SCTAs as the broadcasters’ actions (intensity) in ELS to exhort the audience (without an overt selling intent) to leverage social technology features (i.e., chat, share, like, follow) for the purpose of real-time social engagement and communication (Eisenberg et al., 2006). Critically, while call-to-actions have been found to be an effective communication strategy in other non-real-time marketing contexts (Jung et al., 2020; Vafainia et al., 2019), research on SCTAs’ effect on consumer purchases in a synchronous or real-time selling setting like ELS is still lacking. In practice, broadcasters can incorporate SCTAs in ELS script planning and show scheduling to improve sales in ELS. Companies are also increasingly encouraged to integrate operational decisions in ELS (e.g., negotiations with broadcasters) into their business operation plans (Lin et al., 2024). Despite its theoretical and practical importance, the relationship between SCTAs, an ELS-specific directive social operation, and purchase outcomes in ELS remains largely unknown. Further, the potentially heterogeneous nature of this relationship along the dimensions of hedonic-utilitarian product types and influencer-store broadcaster types (both attributes that can provide key implications for more refined ELS resource allocations and operational strategies) is also unexplored in the ELS literature.
Second, past ELS research has examined various product-level contingency factors, such as price (Yu et al., 2022) and search-experience product attributes (Liu et al., 2023). However, researchers have neither considered the moderating roles of hedonic-utilitarian product attributes (Babin et al., 1994) nor studied how this factor could moderate the relationship between consumers’ purchases and broadcasters’ SCTAs in ELS. The previous marketing literature has emphasized the importance of aligning product types with marketing appeal messages (Klein and Melnyk, 2016). Consumers often employ different information processing routes (e.g., affective vs. cognitive) when evaluating hedonic and utilitarian product attributes (Schulze et al., 2014). Therefore, the purchase effects of SCTAs, conditional on consumers’ perceptions of broadcasters’ SCTAs, are likely moderated by the information processing routes associated with hedonic and utilitarian products in ELS. Thus, it is vital to study how the purchase effects of SCTAs could be contingent on hedonic-utilitarian product attributes, in order to provide ELS broadcasters with nuanced SCTA strategies for promoting different types of products.
Third, although the OM literature has examined how manufacturers and brands can make optimal decisions when collaborating with influencers via live streaming (Lin et al., 2024), no prior research has examined the influencer broadcaster type as a contingency factor. To be specific, there exist two types of broadcasters in ELS—store broadcasters who promote and market products for their own stores, and influencer broadcasters who do not own online stores but sell products for these stores (Chen et al., 2023). Notably, prior marketing research has uncovered that the marketing content producer type (e.g., brands vs. influencers) is a key factor influencing consumer engagement (Lou et al., 2019). Moreover, in the context of ELS, consumers often perceive their relationships with influencer broadcasters as akin to those of friendships or parasocial relations (Kim and Kim, 2020). They are thus more likely to be persuaded when influencers encourage immediate engagement actions. Since SCTAs are expected to directly influence consumers’ social engagements and subsequent purchase behaviors, we conjecture that the purchase effects of SCTAs may be contingent on broadcaster types in ELS, which has not been studied before. Understanding the moderating role of broadcaster-type attributes can help online stores and influencers strategically manage SCTAs to drive consumer purchases in ELS.
Motivated by these three research gaps, we examine ELS as a form of social technology, specifically in terms of enabling SCTAs to drive consumer purchases. Figure A1-1 in the Online Appendix illustrates our key positioning for this study and its differentiation from prior research. Our research questions (RQs) are:
RQ1: How and to what extent are broadcasters’ social call-to-actions (SCTAs) related to consumer purchase outcomes in ELS?
RQ2: How are the purchase effects of SCTAs in ELS moderated by product-type attributes (i.e., hedonic vs. utilitarian products)?
RQ3: How are the purchase effects of SCTAs in ELS moderated by broadcaster-type attributes (i.e., influencer vs. store broadcasters)?
To explicate the mechanisms behind the purchase effects of SCTAs, we draw upon the notions of cognitive and affective marketing appeals (MacInnis and Jaworski, 1989; Tu et al., 2022) to categorize SCTAs into two types: cognitively-oriented (i.e., cognitive SCTAs) and affectively-oriented (i.e., affective SCTAs). In particular, cognitive SCTAs are oriented to calling for consumers’ actions that require controlled, slow, and deliberate cognitive processing (e.g., actions that exhort consumers to send live chats to interact with the broadcasters) while affective SCTAs are oriented to eliciting consumers’ actions associated with automatic, rapid and crude affective processing (e.g., actions soliciting consumers’ tapping on the “like” button) (Shiv and Fedorikhin, 1999). Drawing on the stimulus-organism-response (S-O-R) model (Mehrabian and Russell, 1974) and prior marketing literature, we conjecture that the purchase effect of cognitive SCTAs is theoretically ambiguous, which may be positive, negative or null, but affective SCTAs are likely to have a positive association with purchases in ELS.
To empirically examine the proposed relationships, we collect a large-scale dataset of product-level purchases and ELS videos from Taobao Live, the leading ELS platform in China. Utilizing speech-to-text and text-mining techniques, we identify and measure the cognitive and affective SCTAs from the corresponding ELS speech-to-text data. Using econometric model estimations with an instrumental variables identification approach, we quantify the relationship between ELS broadcasters’ SCTAs and consumer purchases. Our empirical results uncover a significant positive purchase effect of affective SCTAs but an insignificant effect of cognitive SCTAs on product-level purchase orders in ELS. Our results suggest that a one-unit increase in the weighted count of affective SCTA-related words conveyed by broadcasters per minute is associated with a 0.43% increase in a product's purchase orders, implying an average increase of 1.10% in purchase orders per minute. Moreover, the effects of cognitive and affective SCTAs are contingent on product and broadcaster types. Specifically, cognitive SCTAs are found to inhibit purchases of hedonic products and to curtail sales of products sold by influencer broadcasters, whereas the positive purchase effect of affective SCTAs is stronger for hedonic (relative to utilitarian) products. We also conduct a mediation effect analysis to empirically evaluate the underlying mechanisms that drive the main effects of cognitive and affective SCTAs. Our results show that cognitive SCTAs impede social engagements and thereby purchases, whereas affective SCTAs can boost social engagements, leading to increased session traffic and ultimately product sales in ELS. Last, we employ alternative identification approaches to address remaining endogeneity concerns and conduct various robustness checks to ascertain the robustness of our findings.
Our research provides several contributions. First, our paper serves as one of the first attempts to investigate the economic outcomes of broadcasters’ social operations in terms of SCTAs in the synchronous and interactive selling environment of ELS, which have been largely overlooked in prior OM research on ELS. Second, we explicate and quantify the distinct purchase effects of cognitive and affective SCTAs in ELS, which contributes to extant research on online sales drivers (Sun et al., 2021), television shopping (Stephens et al., 1996) and call-to-actions (Huang et al., 2021; Jung et al., 2020). Third, we extend the marketing communications literature that identifies the boundary conditions for the effectiveness of ELS marketing communications (Gopinath et al., 2014) by investigating how hedonic-utilitarian product attributes moderate the demand effects of broadcasters’ SCTAs in the ELS context (Schulze et al., 2014). Fourth, we advance the theoretical dialogue on influencer marketing (Leung et al., 2022) by exploring the moderating role of broadcaster-type attributes (i.e., influencers vs. store broadcasters) on the purchase effects of SCTAs. Overall, by identifying the unique roles of cognitive and affective SCTAs in affecting the purchase demand of products and their heterogeneous effects in ELS, we offer an integrative view of how broadcasters can account for product-level and broadcaster-level factors in their ELS operations management, thus contributing to the emerging ELS literature in which these findings are not yet documented (Lin et al., 2024; Pan et al., 2022). Notably, our findings provide actionable guidance to broadcasters, retailers, and other ELS practitioners to better manage the ELS selling process.
Literature Review and Theoretical Background
Live Streaming Commerce
Live streaming commerce can be operated by brands, individual sellers, or online influencers on either social media platforms (e.g., Facebook, TikTok) or e-commerce platforms (e.g., Taobao, Amazon). The former is recognized as “social commerce live streaming” and the latter is known as “e-commerce live streaming”, which is the focus of the current paper. Scholars from the fields of OM, information systems (IS), and marketing have recently paid considerable attention to the live streaming commerce phenomenon in the aspects below.
First, OM researchers have adopted game-theoretic and empirical approaches to investigate the decision strategies or operations of platforms, manufacturers, and retailers. From the standpoint of platforms, some studies have modeled the platforms’ commission fee strategies (Qi et al., 2020). From the seller side, some studies have investigated the optimal strategies that focal retailers, firms, or manufacturers could adopt to maximize revenue and profit in the live streaming selling channel (Lin et al., 2024; Pan et al., 2022). Moreover, empirical work has examined how retailers’ adoption of a popular operational lever in live streaming, i.e., lucky draws, affects product sales (Zhang et al., 2024). Despite these studies, the social operations of ELS broadcasters engaged in live stream selling have been largely neglected in the OM literature. Thus, by utilizing a large-scale observational dataset with econometric modeling and text-mining methods, we contribute here by examining the social operations of broadcasters who play a key role in driving purchases during ELS sessions.
Second, IS and marketing researchers have empirically studied various topics on live streaming commerce. One line of behavioral research has studied various antecedents of customer engagement and purchase intentions in live streaming commerce, such as consumer value and attitude (Wongkitrungrueng and Assarut, 2018), technological affordances (Sun et al., 2019), herding messages and interaction texts (Fei et al., 2021). Another line of empirical work has evaluated the causal effect of sellers’ adoption of live streaming on product sales (Chen et al., 2020). Recent studies have also started to recognize the roles of products and broadcasters in influencing the effectiveness of ELS. For example, Chen et al. (2020) found that broadcasters’ product information provision is a key mechanism that explains the sales effect of ELS. Bharadwaj et al. (2022) used artificial intelligence technologies to extract broadcasters’ emotions and uncovered their relations with sales. Chen et al. (2023) studied the effects of product selection, session scheduling, and fan base on ELS success.
Collectively, despite these past studies, no prior rigorous research has examined the synchronous social operations in ELS and how broadcasters’ different types of real-time SCTAs are related to actual purchase outcomes. Notably, our paper is closely related to prior studies that examined broadcasters’ narratives. Wongkitrungrueng et al. (2020) conducted a qualitative analysis of the live streaming selling process in the social commerce context (i.e., Facebook Live). The paper proposed four types of sales approaches, namely transaction, persuasion, content and relationship, and related them to consumer engagements (e.g., views, comments, shares). Nonetheless, due to the flaws of the data (i.e., small scale and lack of sales metrics) and different focal contexts, their descriptive findings are unable to directly answer our research questions. Song et al. (2022) used machine learning techniques to extract social interaction cues from broadcasters’ narratives and study their relationships with sales. However, their study ignored the intricate mechanisms and complex effects of different types of SCTAs in the ELS context and also omitted the moderating roles of product-level and broadcaster-level factors. Unlike prior work examining ELS broadcasters’ narratives, our study uniquely investigates live streaming on a leading e-commerce platform to empirically analyze how and to what extent broadcasters’ SCTAs are related to consumer purchase outcomes, and how these purchase effects are moderated by product and broadcaster-type attributes in a heterogeneous manner. We present our research framework in Figure 1.

Figure 1. Research framework.
The concept of social operations originates from the television (TV) shopping literature. Social operations, or parasocial operations, refer to a type of conversational technique that TV hosts/broadcasters use to foster parasocial relations without an overt act of selling (Stephens et al., 1996). With the emergence of ELS, the conceptualization of social operations requires a careful revisit to incorporate broadcasters’ new form of real-time engagement strategies that are empowered by ELS's social technology features (e.g., chat, share, like and follow). Specifically, we draw on the speech act theory (Searle, 1969) and classify social operations further into two sub-categories based on the directiveness of broadcasters’ speech actions: (1) the traditional non-directive social operations that are analogous to those in the TV shopping context (e.g., greetings), and (2) the emerging ELS-specific directive social operations (e.g., SCTAs).
Traditional social operations, such as warm, intimate greetings and direct addresses, are among the most frequently used conversational techniques in the TV shopping context (Stephens et al., 1996). Since the TV shopping program is often pre-recorded and aired at a specific time (Stephens et al., 1996), viewers cannot interact with the TV hosts/broadcasters and are only able to call in or visit a website to interact with the seller/broadcaster in an asynchronous, cross-media manner. Therefore, traditional social operations mainly focus on leveraging broadcasters’ non-directive actions to build relationships and engage viewers, without the primary goal of prompting specific actions from them. Like TV hosts, ELS broadcasters can still use these traditional social operations to welcome new audiences and express gratitude in an untargeted and passive manner. However, unlike TV shopping, ELS offers more interactivity through its social technology features and enables broadcasters’ directive speech actions, which focus on prompting viewers to take a specific action or perform a particular task (Searle, 1969). Hence, ELS broadcasters can prompt viewers’ real-time social engagements, while viewers can also respond to broadcasters’ requests in a direct, synchronous manner. Such ELS-specific directive social operations function similarly to Call-to-Actions (CTAs), which refer to a proactive strategy to encourage audiences’ instantaneous responses or immediate actions (Huang et al., 2021). Thus, we draw an analogy from CTAs and conceptualize ELS-specific directive social operations in the form of SCTAs.
Prior literature has examined various forms of CTAs in asynchronous and non-real-time marketing contexts. For example, Jung et al. (2020) found that different framings of the call-to-send-a-referral messages (egoistic, equitable and prosocial) could lead to different word-of-mouth referral outcomes. Vafainia et al. (2019) examined how the CTAs in direct mailings stimulate customers to visit stores or complete purchases. Nevertheless, prior studies of CTAs focused mainly on selling-related actions (e.g., purchase, referral) in marketing or task-oriented actions (e.g., assignment completion) in other contexts. Social technology-enabled actions (e.g., live chats or likes for an ELS session) have been largely neglected.
Furthermore, due to the difficulty in attributing sales to untargeted mass audiences in TV shopping, prior research has mainly used surveys or content analyses to understand consumer behaviors from the customers’ shopping perspective (Grant et al., 1991). Scant research has used actual product sales data to examine the economic value of broadcasters’ SCTAs from the business operation's perspective. Therefore, the focus of our study is to examine the purchase effect of ELS broadcasters’ SCTAs, an ELS-specific form of directive social technology-enabled operation, using actual product sales data.
To explicate the mechanisms behind the purchase effects of SCTAs, we further categorize SCTAs into two types: cognitive SCTAs and affective SCTAs. This categorization is motivated by the two-pathway information processing model, which holds that individuals evaluate an external stimulus through cognitive and affective processing routes (MacInnis and Jaworski, 1989). Applying this model, a number of empirical studies have explored how cognitive and affective marketing appeals or cues elicit consumer attitudes and responses (Tu et al., 2022). Specifically, cognitive appeals or cues can elicit controlled, slow, and deliberative cognitive processing, whereas affective appeals or cues can arouse automatic, rapid, and crude affective processing (Shiv and Fedorikhin, 1999). In the context of ELS, we argue that social engagement actions such as sending live chats or sharing ELS sessions often require significant cognitive processing. This is because viewers need to take multiple steps and expend cognitive effort to perform these engagement actions in ELS (Brodie et al., 2011; Lim and Rasul, 2022). For example, sending a chat requires a consumer to click on the chat input field, type a message using an on-screen keyboard, and press the “send” button to post it to the live chat area. Likewise, to share an ELS session, consumers have to click the “share” button, choose a recipient from either Taobao contacts or external platforms, and leave the focal ELS page for a private chat (with a floating window showing ELS content in a corner). On the other hand, other social engagement actions, such as liking a session or following a broadcaster, are dominated by affective processing because viewers can perform these actions in an automatic and rapid manner by effortlessly tapping on the “like” and “follow” buttons (Brodie et al., 2011; Lim and Rasul, 2022).
We further illustrate how consumers perform different social engagement actions in ELS in the Online Appendix (see Figure A1-2). Taken together, we classify broadcasters’ actions in stimulating social engagements associated with viewers’ need for cognitive processing as cognitive SCTAs and those associated with affective processing as affective SCTAs. Next, we employ the S-O-R framework to elaborate on the mechanisms driving the purchase effects of cognitive and affective SCTAs.
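To make the classification concrete, the following minimal Python sketch maps the four social technology features named above (chat and share as cognitive, like and follow as affective) to SCTA types and computes a per-minute rate of cue phrases in a transcript. The cue phrases, the function name `scta_per_minute`, and the use of an unweighted count are illustrative assumptions; the lexicon and weighting scheme actually used to measure SCTAs are not specified in this section.

```python
from collections import Counter

# Mapping taken from the text: chat/share are cognitive, like/follow are affective.
SCTA_TYPE = {
    "chat": "cognitive",
    "share": "cognitive",
    "like": "affective",
    "follow": "affective",
}

# Hypothetical English cue phrases, for illustration only (not the paper's lexicon).
CUES = {
    "chat": ["leave a comment", "type in the chat"],
    "share": ["share this stream", "share the session"],
    "like": ["tap the like", "hit like"],
    "follow": ["follow me", "follow the channel"],
}


def scta_per_minute(transcript: str, duration_seconds: float) -> dict:
    """Count cue-phrase occurrences by SCTA type and normalize per minute.

    Assumption: a simple unweighted substring count stands in for the
    weighted text-mining measure described in the paper.
    """
    text = transcript.lower()
    counts = Counter({"cognitive": 0, "affective": 0})
    for action, phrases in CUES.items():
        for phrase in phrases:
            counts[SCTA_TYPE[action]] += text.count(phrase)
    minutes = duration_seconds / 60.0
    return {scta_type: n / minutes for scta_type, n in counts.items()}


example = scta_per_minute(
    "Hit like if you love it! Follow me for more. "
    "Type in the chat which color you want.",
    duration_seconds=120,
)
```

In this example, a two-minute clip containing two affective cues and one cognitive cue yields an affective rate of 1.0 and a cognitive rate of 0.5 per minute.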
Purchase Effects of Cognitive and Affective SCTAs
In ELS, broadcasters’ SCTAs are crucial elements that may influence consumers’ psychological processes and purchase responses (Xu et al., 2020). As such, the Stimulus-Organism-Response (S-O-R) model (Mehrabian and Russell, 1974), which posits that environmental cues act as stimuli (S) that affect an individual's responses (R) through internal states (O), can serve as a logical theoretical foundation for elucidating the purchase effects of broadcasters’ SCTAs in ELS. The S-O-R model has been widely used to explain how various stimuli trigger purchase behaviors in different shopping settings. For example, website interactivity as an environmental stimulus can alter consumers’ cognitive and affective involvement and then affect purchase intention (Jiang et al., 2010). Following the S-O-R model, broadcasters’ SCTAs (S), as crucial environmental cues in ELS, can stimulate consumers’ perceptions of SCTAs (O) and subsequently influence their actual social engagements (R) and consequent purchase behaviors in ELS.
On the one hand, intuition suggests that SCTAs, designed to drive social engagements, can positively affect engagement outcomes. Prior marketing research shows that consumers often relate marketing communications (e.g., SCTAs in ELS) to the marketer's efforts in building customer relationships (De Wulf and Odekerken-Schröder, 2003; Vafainia et al., 2019). According to the theory of reciprocity (Bagozzi, 1995), consumers may feel obligated to perform the action prompted by the marketer as a reciprocal action, which is often subtle and automatic. If this is the case, when a broadcaster uses more SCTAs (regardless of type) in ELS, consumers may perceive a higher level of effort by the broadcaster in leveraging social technologies for real-time engagements and communications, thereby increasing the likelihood of consumers’ reciprocal actions or actual engagements.
On the other hand, a widely held assumption in cognitive psychology is that cognitive effort incurs an intrinsic cost and that humans generally aim to minimize cognitive effort (Kool et al., 2010; Kool and Botvinick, 2018). Thus, we argue that broadcasters’ cognitive SCTAs may exert a negative effect on consumers’ engagements due to high cognitive effort and perceived intrinsic costs, while affective SCTAs may not have such a negative effect. Specifically, cognitive SCTAs stimulate engagement actions involving controlled and deliberative thinking and multiple action-taking steps on multiple ELS pages (as illustrated in the preceding subsection), typically demanding high effort from consumers. For instance, sending chats necessitates linguistic organization, and sharing a session may prompt consumers to contemplate with whom to share it. In this case, broadcasters’ cognitive SCTAs could lead consumers to perceive that heightened cognitive effort must be expended, thus imposing intrinsic costs on consumers and inhibiting engagements in ELS (Elliot and Covington, 2001). In contrast, affective SCTAs that encourage actions such as liking and following demand hardly any effort (Pentina et al., 2018), as these actions require just a one-step, rapid, and simple tap on the focal ELS page. Hence, unlike cognitive SCTAs, affective SCTAs may not negatively affect engagements due to negligible perceived effort and cost. These two lines of reasoning, which have some support in the literature (Vafainia et al., 2019), thus offer competing predictions for the effect of cognitive SCTAs on engagements while predicting a positive relationship between affective SCTAs and consumer engagement.
Furthermore, building on the prior literature that examines consumer engagement in online shopping environments including ELS (Fei et al., 2021), we propose that consumer engagement can boost purchase outcomes in ELS. We further expect that this effect could be mediated by session traffic, which serves as an indicator of the potential buyer pool. A critical factor influencing session traffic is the platform's traffic recommendation algorithm. This algorithm of today's video streaming platforms often prioritizes sessions based on social engagements (Zhao et al., 2019), thus promoting sessions with higher engagement to a broader audience in real time. Such increased visibility enhances session viewership size, which in turn, is likely to boost sales. Integrating these arguments, we posit the purchase effect of cognitive SCTAs is theoretically ambiguous, which may be positive, negative or null. However, we argue that affective SCTAs are likely to exert a positive effect on consumers’ purchases, which can be driven by increased social engagements and session traffic.
Research Context, Data and Variables
Research Context
By collaborating with Taobao (a leading e-commerce platform in China), we gathered the data for our empirical analysis from Taobao Live, a pioneering ELS service launched in April 2016. Any Taobao or Tmall store or individual seeking to host a live streaming session on Taobao Live needs to register an official broadcaster account. Before starting an ELS session, broadcasters are asked to provide a selection of products to be displayed and sold in that session. A variety of product categories, such as clothing, shoes, bags, cosmetics, snacks, and electronics, are sold on Taobao Live. Consumers can freely join any live session, watch live stream videos, preview product lists, engage in conversations with broadcasters via chats, send “likes” to appreciate broadcasters, share the session with friends or connections, and place orders for products (see Figure A1-3(a) in the Online Appendix). They can also follow broadcasters and view broadcasters’ main pages, on which nicknames, profile photos, descriptions, and past live streaming videos are shown (see Figures A1-3(b) and (c) in the Online Appendix). This information can help consumers easily differentiate between store and influencer broadcasters. According to a report by iResearch, over 50 billion viewers have watched live streaming sessions on Taobao Live since its launch.
Sampling Strategy and Data Description
Given the large broadcaster base on Taobao Live, we had to judiciously sample a subset of broadcasters and collect data for these selected broadcasters. To ensure that the sample preserves the original shares of store and influencer broadcasters on the platform, 5 we opted to use a stratified sampling strategy. The two focal broadcaster types were treated as the strata, and simple random sampling was applied within each stratum, resulting in 10,381 sample broadcasters. We next collected live streaming session information for these broadcasters from 1 July to 31 August 2020. 6 After removing sessions with no listed products and sessions shorter than 30 minutes or longer than 960 minutes (i.e., thresholds set by the platform to determine valid ELS sessions), 7 we ended up with 227,635 sessions from the 10,381 broadcasters in our initial dataset.
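The stratified sampling step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' sampling code: the toy population, the proportional allocation rule, and the names `stratified_sample` and `stratum_of` are all hypothetical.

```python
import random

def stratified_sample(population, stratum_of, sample_size, seed=0):
    """Allocate sample_size proportionally across strata, then draw a
    simple random sample (without replacement) within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    total = len(population)
    sample = []
    for label, units in sorted(strata.items()):
        # Proportional allocation; rounding may shift the total by a unit
        # or two when the shares are not exact.
        k = round(sample_size * len(units) / total)
        sample.extend(rng.sample(units, k))
    return sample

# Hypothetical toy population: 900 store and 100 influencer broadcasters.
population = [("store", i) for i in range(900)] + \
             [("influencer", i) for i in range(100)]
sample = stratified_sample(population, stratum_of=lambda b: b[0], sample_size=50)
n_influencers = sum(1 for kind, _ in sample if kind == "influencer")
```

With proportional allocation, the influencer share in the sample matches the share in the population, which is the property the sampling design is meant to preserve.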
To answer our research questions, we need to extract meaningful product-level attributes from the ELS sessions of the selected broadcasters. This is theoretically important because consumer purchases of a product might be attributed to a broadcaster's social operations in terms of SCTAs that are used to promote consumers’ social engagements during specific segments of ELS. Moreover, this enables us to incorporate product-level observables and account for unobservables in our empirical model. This key data construction step is achieved by using records of broadcaster-annotated timestamps of specific products showcased within an ELS session to segment an ELS video into product-level clips. After that, 6320 focal broadcasters were kept from our initial data. Due to the high computational costs of video analytics, for the analysis reported below, we randomly chose 50% of the ELS sessions that were conducted by these focal broadcasters within the sample period, resulting in 52,303 sessions retained. In total, we obtained a sample of 878,969 product-level videos with varying durations from 55 to 200 seconds. After some data cleaning, 8 our final sample for empirical analysis contains 673,276 product-level clips from 39,654 sessions hosted by 4695 broadcasters. 9
Overall, our dataset is built from five sources: (1) product-level live streaming video clips, (2) product-level records (e.g., within-session product orders, bookmarks, price, category, shipping), (3) store-level records (e.g., type, ratings, locations), (4) session-level records (e.g., duration, title, listed product categories, brand IDs, aggregated prices), and (5) broadcaster-level records (e.g., broadcaster type, ELS frequency). We next elaborate on our variable operationalizations.
Variable Operationalizations
Our dependent variable, consumer purchase outcome, is operationalized by the number of product orders
To measure our independent variables, SCTAs in ELS, we obtained the transcripts (i.e., speech-to-text data) of ELS videos via Alibaba's Automatic Speech Recognition (ASR) routine and used a lexicon-based text-mining technique (i.e., counting word occurrences based on predefined lexicons), which is one of the most popular textual analysis methods used in prior business research (Ludwig et al., 2022). Given that existing lexicons suffer from limitations (e.g., not specific to ELS) which may cause measurement errors, we develop context-specific and up-to-date lexicons via three steps. First, we contextualize the definitions of cognitive and affective SCTAs in ELS. Specifically, cognitive SCTAs are measured by two operational cues (i.e., CTAs for chatting and sharing), which are broadcasters’ actions in verbally exhorting consumers to send live chats and to share a live session with friends. Affective SCTAs are also determined by two operational cues (i.e., CTAs for liking and following), which are broadcasters’ actions in encouraging viewers to tap on the “like” icon and to click the “follow” button. Second, we rely on a grounded analysis to create seed word lists for the operational cues of SCTAs in ELS (Singh et al., 2020). Third, we expand our lexicons through the Word2Vec unsupervised learning model (Mikolov et al., 2013). Relying on the Word2Vec model, we create word embeddings, identify bigrams (e.g., “follow us”), evaluate the semantic similarity between words, and generate a set of words that are semantically associated with the defined seed words, thus expanding our lexicons to measure the two SCTA variables. To ensure the quality of the expanded lexicons, we only retain the words with similarity scores above the threshold of 0.6 (Rekabsaz et al., 2017). The technical details of the lexicon construction are provided in the Online Appendix A2.1. 
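The similarity-threshold step of the lexicon expansion can be illustrated with a small sketch. It assumes word embeddings are already available (here as a hypothetical toy dictionary rather than a trained Word2Vec model) and keeps any candidate whose cosine similarity to a seed word exceeds 0.6. The vectors, the words, and the function names are illustrative assumptions only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def expand_lexicon(seeds, embeddings, threshold=0.6):
    """Add every word whose similarity to at least one seed exceeds threshold."""
    expanded = set(seeds)
    for word, vec in embeddings.items():
        if any(cosine(vec, embeddings[s]) > threshold
               for s in seeds if s in embeddings):
            expanded.add(word)
    return expanded

# Hypothetical 3-dimensional embeddings; real ones would come from a model
# trained on the ELS transcripts, with bigrams such as "follow_us" as tokens.
embeddings = {
    "follow":    [1.0, 0.0, 0.0],
    "follow_us": [0.9, 0.1, 0.0],
    "subscribe": [0.8, 0.5, 0.0],
    "price":     [0.0, 1.0, 0.0],
    "shipping":  [0.1, 0.2, 1.0],
}
lexicon = expand_lexicon({"follow"}, embeddings)
```

In this toy example only the two words close to the seed are retained, while unrelated words such as "price" fall below the threshold.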
Using the constructed lexicons, we generate the measures of cognitive and affective SCTAs at the broadcaster-session-product level, as summarized in Table 1. We first compute the idf (inverse document frequency)-weighted count (Loughran and McDonald, 2011) of cognitive SCTA-related words conveyed per minute by a broadcaster i in an ELS session j's specific product p's video clip 10 to quantify CogSCTAijp. Likewise, we measure AffSCTAijp through the idf-weighted count of affective SCTA-related words conveyed per minute by a broadcaster i in a session j's specific product p's video clip. Further, we validate the quality of our expanded lexicons as elaborated in the Online Appendix A2.2.
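A simplified sketch of an idf-weighted per-minute count is given below. It assumes idf is computed as log(N/df) over clips and that each lexicon hit contributes its idf weight; the exact weighting scheme in Loughran and McDonald (2011) and in the Online Appendix may differ, and the toy clips and function names are hypothetical.

```python
import math
from collections import Counter

def idf_weights(documents):
    """idf(w) = log(N / df(w)), where df counts clips containing w."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    return {w: math.log(n_docs / df) for w, df in doc_freq.items()}

def scta_per_minute(tokens, lexicon, idf, duration_seconds):
    """idf-weighted count of lexicon words, normalized by clip length in minutes."""
    counts = Counter(t for t in tokens if t in lexicon)
    weighted = sum(c * idf.get(w, 0.0) for w, c in counts.items())
    return weighted / (duration_seconds / 60.0)

# Hypothetical tokenized product clips.
clips = [
    ["follow", "us", "like", "like"],
    ["price", "is", "low"],
    ["like", "this", "price"],
    ["new", "colour", "today"],
]
idf = idf_weights(clips)
aff = scta_per_minute(clips[0], {"like", "follow"}, idf, duration_seconds=120)
```

Words that appear in many clips receive a lower weight, so a rarely used call such as "follow" counts for more than a frequently used one such as "like".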
Operational cues and constructed lexicons for cognitive and affective SCTAs.
Note: a. The “like” button in Taobao Live is represented as a heart emoji.
We construct our first moderating variable (Hedonicijp) by following Lee and Hosanagar (2021)'s data-driven approach to classify products as either hedonic or utilitarian (refer to the Online Appendix A3 for the details). Considering that products under the same subcategory should share properties with regard to product types, we opt to conduct the product classification at the product subcategory level. Each product subcategory consists of three levels defined by Taobao (e.g., women shoes (level-1)—boots (level-2)—Chelsea boots (level-3)). We eventually classify 2120 three-level product subcategories which are linked with more than 210,000 unique products. Overall, 839 product subcategories are classified as hedonic type (Hedonicijp = 1) and 1281 subcategories as utilitarian type (Hedonicijp = 0). The second moderating variable is measured as Influencersi, which is a time-invariant broadcaster-type indicator. If a broadcaster is an influencer verified by Taobao Live, Influencersi is coded as 1; otherwise, it is coded as 0. We obtained this information about the sample broadcasters from Taobao Live's database on the first day of our sample period. 11
To obtain unbiased estimates of the purchase effects of SCTAs in ELS, we include a multitude of control variables. First, we rely on product video clips to extract the broadcasters’ other operations, and various characteristics of the live streaming content which could affect product demand in ELS. Specifically, we control for broadcasters’ non-directive social operations (Relational) that involve relational tactics and do not prompt a specific action (Stephens et al., 1996). Beyond social operations, we also construct two variables to quantify broadcasters’ directive (BCTA) and non-directive (Informational) selling operations in terms of eliciting buying actions and conveying product information, respectively. Moreover, to control the broadcasters’ speech length effect, we count the number of spoken words per minute (NumSpkWords). To account for the potential voice persuasion effect (Van Zant and Berger, 2020), we employ the speech data to extract the number of pauses (Pauses), articulation rate (ArticRate), average (AvgPitch) and variation (StdPitch) of broadcasters’ voice pitch. Furthermore, to control the effect of visual content characteristics, we measure the average number of displayed products (ProdDisplays) and faces (FaceDisplays) per frame. To account for the video quality effect, we measure the average level of luminance (Luminance), color richness (ColRichness), and clarity (Clarity). 
Second, we construct a number of product- and store-specific control variables, which include the sequence order of a product showcased within an ELS session (ShowSequence), the number of times a product is exposed to consumers within an ELS session (NumImpressions), free-shipping dummy (FreeShipping), cumulative number of bookmarks (NumBookmarking), selling price that consumers see and pay when placing an order (Price), product category dummies, average ratings for a store's service, quality, and shipping (Rating), store type indicator (Tmall (versus Taobao)), and dummies for store locations. Third, we introduce a set of session- and broadcaster-specific attributes that are measured at the broadcaster i-session j level and may affect consumer purchases, including the total duration in minutes of an ELS session (Duration), an indicator for promotion-related words (e.g., “discount”) in the ELS title (PromoTitle), dummies for ELS channel category, median product price for all listed products (MedianPrice), total numbers of product categories (NumProdCategories) and brands (NumBrands) in an ELS session. To control for time fixed effects of a session's starting time, we include day dummies and dummies for four time periods of a day. To capture the broadcaster learning and social influence effects, we include the number of ELS sessions a broadcaster hosted in the past one week (ElsFrequency) and a cumulative number of fans or followers (NumFans). Last, to account for broadcasters’ demographics, we include their gender (Female) and age (Age). The Online Appendix A4 provides more details about the operationalizations of control variables.
Figure 2 illustrates our empirical strategies for the main model specification and identification method, mechanism evaluations, and alternative identification approaches. Specifically, Section 4.1 specifies our main model specification using a multilevel mixed-effects model with correlated random effects, and Section 4.2 outlines the main identification approach using instrumental variables. With the key results established, Section 5.4 conducts the mechanism evaluations via a mediation effects analysis. Section 6 then details the alternative identification approaches to alleviate other leftover endogeneity concerns. Last, Section 7 presents various robustness checks to affirm the robustness of our findings.

Summary of empirical strategies.
To answer our first research question, we use the broadcaster-session-product level data. Products within the same ELS session may be correlated as they share common session-specific attributes. Moreover, sessions by the same broadcaster may also be correlated since they are more similar than sessions of other broadcasters. Such within-cluster dependence violates the independence assumption of ordinary regression models. Thus, we employ a three-level mixed-effects linear model with random intercepts to disentangle the between- and within-cluster effects by relating the dependent variable to both observed covariates and unobserved heterogeneity at the product, session, and broadcaster levels. This model has also been adopted to estimate the effects of social media content characteristics by prior social science research (Ordenes et al., 2019). We specify the multilevel mixed-effects model for RQ1 in equation (1a):
To evaluate the moderating effects of product and broadcaster types in RQ2-RQ3, we add the interaction terms of Hedonic and Influencers with the independent variables in equation (1a) and specify the following model:
A key identifying assumption of the mixed-effects random-intercept model is that the random effects (RE) (e.g., θ1,i) are uncorrelated with the regressors (e.g., CogSCTA and AffSCTA). However, this RE assumption may be violated in our context. The correlations between the SCTA variables and broadcaster or session RE may arise due to broadcasters’ habitual styles across ELS sessions or within a session. To handle the potential violation of the RE assumption, we employ the correlated random effects (CRE) approach (Joshi and Wooldridge, 2019). The CRE approach originates from the work by Mundlak (1978). It can allow us to statistically test the RE assumption and also relax the assumption by adding the cluster means of covariates in the multilevel mixed-effects model. Following Antonakis et al. (2021), we include the cluster means of product-level independent variables for broadcaster (
Although we control for a comprehensive set of broadcaster-, session- and product-specific factors and employ the mixed-effects models to account for the unobserved heterogeneity, our model estimations may still be subject to endogeneity concerns due to omitted variables and reverse causality. For example, broadcasters may strategically decide how to employ the two types of SCTAs (e.g., which type to use and how) based on their observations, learning, and expectations of product demand. These factors can influence consumer purchases but are unobserved and thus captured in the model error terms. To address the potential endogeneity issues, we apply the instrumental variables (IVs) approach (Greene, 2003).
We construct instruments based on the attributes of a focal observation's (i.e., a product p in a session j hosted by broadcaster i) competitors (Fan, 2013; Karanam et al., 2023). For each focal observation, we identify a group of potential competitors at the broadcaster-session-product level using the following criteria. First, intuitively, the competitor broadcasters should be other broadcasters. Second, the competitor sessions should occur during different time periods (i.e., beyond a 1-week window but within a 2-week window before or after the day of a focal session). 13 Third, the competitor products should belong to the same product category as the focal product. To ensure that competitors’ operational decisions of SCTAs have a direct association with the focal broadcaster's SCTAs, we further select the top 10% nearest neighbors based on the similarity of broadcasters’ spoken content, 14 from the pool of the potential competitors (Karanam et al., 2023). These top 10% most similar competitors, which share similar products and spoken content with the focal observation, are likely to correlate with the SCTAs of the focal observation (thereby satisfying the relevance condition) and thus chosen as the final set of competitors. We calculate the averages of cognitive SCTAs, affective SCTAs and speaking rate (i.e., number of spoken words per minute) of the competitors (i.e.,
We provide relevant statistics to show the validity of the IVs. By performing the first-stage regression of the IV approach, we find that our IVs are all statistically significant (p < .01) for their respective instrumented variables, which provides evidence of associations between the endogenous variables and the IVs. To check the relevance assumption of IVs, we perform the weak identification test. The Cragg-Donald F statistics range from 616 to 2360 which are all greater than the threshold of 10, indicating that our IVs are not weak ones (Stock and Yogo, 2005). Furthermore, to test the exogeneity of IVs, we employ the Sargan-Hansen test. We obtain p values from 0.151 to 0.179 that fail to reject the null hypothesis that the over-identifying restrictions are valid, which lends credence to our IVs’ validity.
We adopt a control function (CF) approach to leverage the IVs for endogeneity corrections (Petrin and Train, 2010). As reported by prior research (Mallipeddi et al., 2021), the CF approach is inherently similar to the two-stage least squares (2SLS) method, but it has typical merit in being more suited for complex model specifications (e.g., three-level mixed-effects model). To implement the CF approach, we first model the potentially endogenous variables as a function of IVs and other control variables. We then include the fitted residuals (i.e., control functions) generated by the first-stage regressions as additional regressors in equations (1a) and (1b) to control for the unobservables (Papies et al., 2017).
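The two-step logic of the control function approach can be sketched on simulated data. This is a minimal illustration in a plain linear model with one endogenous regressor and one instrument, not the three-level mixed-effects specification used in the paper; the data-generating process, the variable names, and the small `ols` helper are all assumptions made for the example.

```python
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated data (assumption): u is an unobserved confounder, z an instrument.
rng = random.Random(42)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + ui for zi, ui in zip(z, u)]                      # endogenous regressor
y = [1.0 * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]  # true effect = 1.0

# Naive OLS of y on x is biased because x and the error share u.
naive = ols([[1.0, xi] for xi in x], y)[1]

# Step 1: regress the endogenous variable on the instrument, keep residuals.
g = ols([[1.0, zi] for zi in z], x)
resid = [xi - (g[0] + g[1] * zi) for xi, zi in zip(x, z)]

# Step 2: add the first-stage residual (the control function) as a regressor.
cf = ols([[1.0, xi, ri] for xi, ri in zip(x, resid)], y)[1]
```

In this linear setting the coefficient on `x` from the second step coincides with the 2SLS estimate, which is why the two approaches are described as inherently similar; the residual term absorbs the part of `x` that is correlated with the unobservables.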
Main Analysis and Results
Descriptive Statistics
Descriptive statistics of the focal variables based on observations kept for empirical analysis are shown in Table 2. 16 Consumer purchase in terms of the number of product orders is highly skewed, with a mean value of 3.418 and a standard deviation of 51.250. For broadcasters’ SCTAs, we note that the average weighted frequency of cognitive and affective SCTA words conveyed by broadcasters per minute, i.e., CogSCTA and AffSCTA means, are 1.904 and 2.550, respectively. For other ELS social and selling operations, we find that broadcasters allocate most of their time to providing product attributes information (Informational mean = 40.285), followed by increasing consumers’ experience of warmth, affiliation and/or affection to build social relationships (Relational mean = 10.708), and encouraging consumers’ buying actions (BCTA mean = 4.069), in terms of the weighted frequency of words conveyed per minute. Among the products retained for empirical analyses, 35.5% of them are classified as Hedonic products. Such product-type distributions may occur because clothing, shoes and bags, which are mostly utilitarian goods, are the major product categories sold on Taobao Live, i.e., accounting for about 65% of our data. For broadcaster types, we observe that influencer broadcasters account for 8% of all broadcasters. Figure A5-1 in the Online Appendix A5 presents the product categorical breakdown of our broadcaster-session-product level observations. We find that influencer broadcasters receive a higher number of orders (mean = 5.481) compared to store broadcasters (mean = 3.095). On average, the store broadcasters however have a larger fan base (mean = 84,198) than influencer ones (mean = 19,520).
Descriptive statistics for focal variables and the model estimation sample.
Note: For the model estimation sample, the number of product video clips = 671,415; the number of products = 212,627; the number of broadcasters = 4,658.
The statistics for control variables are omitted for brevity. The complete version of the descriptive statistics is presented in the Online Appendix A5.
As shown in Table A6-1 in the Online Appendix A6, all focal variables and control variables at the broadcaster-session-product level are not highly correlated. To assess multicollinearity, we run ordinary least squares regression models and obtain variance inflation factors (VIF) for each variable. VIF values for the variables are all below 5, indicating that multicollinearity is not a concern in our empirical analysis.
Columns (1) and (3) of Table 3 present the estimation results for equation (1a), which examine the effects of broadcasters’ cognitive and affective SCTAs on consumer purchases. Column (1) reports the three-level mixed-effects model estimation results without endogeneity corrections. Column (3) shows the endogeneity corrected mixed-effects model estimation results using the IVs. The baseline results shown in Column (1) indicate that the coefficients for CogSCTA and AffSCTA are both significant and negative, which are different from the IV results in terms of the significance level and sign, respectively. The differences are potentially due to the endogeneity of broadcasters’ SCTAs. We opt to interpret the IV results in Column (3), which has lower AIC and BIC values as compared to the baseline model. Specifically, the coefficient for CogSCTA is negative but not significant (θ3 = −0.0018, p > .1), implying that the intensity of broadcasters’ actions in encouraging consumers to perform cognitively-oriented social engagements (i.e., chatting and sharing) is not associated with the number of product orders. This result empirically supports the notion that both consumer reciprocity and perceived effort and cost mechanisms may exist and ultimately nullify each other on the purchase effect. However, we find a significant positive coefficient for AffSCTA (θ4 = 0.0043, p < .01), suggesting a significant positive purchase effect of SCTAs for affectively-oriented social engagements (i.e., liking and following) in ELS, and thus supporting our theoretical prediction.
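Because the dependent variable is in logs, the AffSCTA coefficient can be read as an approximate percentage change in orders. The short calculation below illustrates this, treating the DV as a plain log and ignoring any offset used in its construction (an assumption). Evaluating the effect at the sample mean of AffSCTA is likewise an illustrative choice made here, not a step stated in this section.

```python
import math

def pct_change(coef, delta=1.0):
    """Percentage change in the outcome implied by a log-linear coefficient
    when the regressor increases by delta units."""
    return (math.exp(coef * delta) - 1.0) * 100.0

theta_aff = 0.0043   # reported AffSCTA coefficient, Column (3) of Table 3
mean_aff = 2.550     # reported sample mean of AffSCTA, Table 2

per_unit = pct_change(theta_aff)            # effect of one additional unit
at_mean = pct_change(theta_aff, mean_aff)   # effect at the average intensity
```

Rounded to two decimals, the per-unit effect is about 0.43% and the effect evaluated at the mean intensity is about 1.10%.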
Mixed-effects model results (DV: lnNumOrders).
Note: All product, store, session, and broadcaster-specific control variables are included in the model. To avoid multicollinearity, all focal continuous variables are mean-centered in the model with the moderation effects. The full results table is presented in the Online Appendix A7. Standard errors are in parentheses.
*** p < .01, ** p < .05, * p < .1.
To evaluate the moderating effects of product and broadcaster types, we reference the estimation results for equation (1b) shown in Columns (2) and (4) of Table 3. The baseline results are shown in Column (2) and the endogeneity-corrected model results are reported in Column (4). By comparing the results without and with endogeneity corrections, we note that the coefficients for the key variables of interest are significantly different in terms of magnitudes and signs, which thus provide evidence that not accounting for endogeneity of social operations in terms of SCTAs leads to biased estimates of key model coefficients. From Column (4), we obtain a negative coefficient for CogSCTA * HD (ω5 = −0.0051, p < .05), implying that the intensity of broadcasters’ cognitive SCTAs to stimulate chatting and sharing engagements is negatively associated with product purchase outcomes only when they sell hedonic goods. Moreover, the positive coefficient for AffSCTA * HD (ω6 = 0.0023, p < .05) indicates that the positive demand effect of broadcasters’ affective SCTAs to solicit consumers’ liking and following is stronger for hedonic goods, compared to utilitarian goods. Furthermore, we find that broadcaster types exert a significant moderating effect on the purchase demand effect of cognitive SCTAs. The negative coefficient for CogSCTA * IF (ω7 = −0.0098, p < .05) indicates that the extent of broadcasters’ cognitive SCTAs in encouraging consumers to chat and share could inhibit consumer purchases of products sold by influencer broadcasters. In addition, based on the coefficient for AffSCTA * IF (ω8 = 0.0013, p > .1), we do not find a significant moderating role of influencer broadcasters on the purchase effect of affective SCTAs. We plot the predictive margins of the number of product orders on cognitive and affective SCTAs for different types of products and broadcasters in Figure 3.

Predictive margin plots for the moderation effects of product type and broadcaster type. (a) CogSCTA * Hedonic; (b) AffSCTA * Hedonic; (c) CogSCTA * Influencers.
To fully understand our findings, we carry out the mediation effect analysis (Hayes, 2017; Zhao et al., 2010) to verify the underlying mechanisms for the main effects of ELS cognitive and affective SCTAs on consumer purchases. As theorized in Section 2.3, the two competing propositions imply cognitive SCTAs may positively or negatively affect consumers’ social engagements, which is theoretically uncertain. However, we propose a positive relationship between affective SCTAs and engagements, given the predominant positive effect driven by the reciprocity mechanism. Moreover, we argue that session engagements can further drive session traffic because of a common practice that the platform algorithms are designed to recommend sessions with more engagements to a broader audience in real time. This increased visibility attracts more views, thereby boosting product sales. If this traffic mechanism holds, consumers’ engagements can indirectly influence sales via session traffic. We test the social engagement and session traffic mechanisms through a mediation effect analysis below.
Mediation-effect model results (joint estimation).
Note: All session-level product, store, session, and broadcaster-specific control variables are included in the model. Since we have broadcaster fixed effects, the broadcaster-specific time-invariant variables (i.e., Influencers, Female, Age) are dropped from the model. Robust standard errors are in parentheses.
*** p < .01, ** p < .05, * p < .1.
Since only session-level engagement and traffic data are available, we conduct the mediation analysis at the broadcaster-session level. We measure consumers’ actual engagements (Engagements) with the total number of engagement behaviors via chatting, sharing, liking and following in a session j of a broadcaster i. Likewise, we quantify session traffic using the cumulative number of views (NumViews) in a session j. For the variables that are originally measured at the broadcaster-session-product-level, we calculate the session sums (e.g., NumOrdersij measures the total number of orders of products in a session j) or averages (e.g., CogSCTAij quantifies the average cognitive SCTAs across products marketed in a session j). We first specify the session-level total-effect model in equation (2a), in which we regress the two SCTAs variables on consumer purchases. We then examine whether the changes in the two types of SCTAs could exert influence on the mediator Engagements in equation (2b) and on the mediator NumViews in equation (2c). In equation (2d), we include both the SCTAs and the mediators in the purchase demand model. In these equations, we further include broadcaster fixed effects (i.e., Broadcasteri) to account for time-invariant individual factors (e.g., ethnicity and education), and other control variables (i.e.,
We use both joint estimations (Zhao et al., 2010) and single-equation estimations (Lu et al., 2023) to estimate the mediation effects model. First, we jointly estimate equations (2b)–(2d) as well as four first-stage equations for the purpose of endogeneity corrections. 18 To account for broadcasters’ fixed effects, we use within-transformed data. The joint estimation results are presented in Table 4. We follow the method in Mehmetoglu (2018) to implement Zhao et al. (2010)'s approach, which proposes that the key of the mediation effect test is that the bootstrap test of the mediating effect (e.g., n3 * β4, see equations (2c) and (2d)) is statistically significant.
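The within transformation used to absorb broadcaster fixed effects amounts to subtracting each broadcaster's mean from its session-level observations. A minimal sketch with hypothetical values and names:

```python
from collections import defaultdict

def within_transform(values, groups):
    """Subtract the group mean from each observation (fixed-effects demeaning)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

# Hypothetical session-level outcomes for two broadcasters.
broadcaster = ["a", "a", "a", "b", "b"]
ln_orders = [1.0, 2.0, 3.0, 10.0, 14.0]
demeaned = within_transform(ln_orders, broadcaster)
```

After demeaning, any broadcaster-specific constant drops out, which is also why time-invariant broadcaster variables cannot be estimated in the session-level models.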
We start by quantifying the effect of consumer engagements on purchases. Our results show that the indirect purchase effect of consumers’ engagements through the cumulative session views is positive and significant (0.085 * 0.344 = 0.029, p = .001), supporting our traffic mechanism that actual engagements can drive session views, subsequently amplifying purchase outcomes. This is a partial mediating effect since the direct (0.066, p = .001) and total effect (0.095, p = .001) of engagements on purchases are both statistically significant and positive. Next, we test the indirect effect of CogSCTA on lnNumOrders via actual consumers’ engagements. We find a negative effect of −0.005 (−0.054 * 0.095, p = .009), thus suggesting that the negative effect of cognitive SCTAs is predominant, which may be driven by the perceived effort and cost mechanism. Moreover, both the direct (−0.044, p = .001) and total effect (−0.049, p = .001) of CogSCTA on lnNumOrders are statistically significant and negative. Then, for the indirect effect of AffSCTA on lnNumOrders via consumer engagement outcomes, we find a positive effect of 0.005 (0.057 * 0.095, p = .001), which thereby validates our proposed mechanism that broadcasters’ calls for liking and following can drive purchases through actual engagements in ELS. Furthermore, the direct (0.020, p = .019) and total effect (0.025, p = .005) of AffSCTA on lnNumOrders are both positive and significant. Taken together, we assert that consumers’ actual engagements partially mediate the effects of CogSCTA and AffSCTA on lnNumOrders.
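The decomposition reported above can be checked arithmetically: each indirect effect is the product of the two path coefficients, and each total effect is the sum of the direct and indirect effects. The sketch below uses only the figures quoted in the text; the dictionary layout and names are illustrative.

```python
# Path coefficients and direct effects as quoted in the text.
paths = {
    "Engagements -> NumViews -> Orders": {"a": 0.085, "b": 0.344, "direct": 0.066},
    "CogSCTA -> Engagements -> Orders": {"a": -0.054, "b": 0.095, "direct": -0.044},
    "AffSCTA -> Engagements -> Orders": {"a": 0.057, "b": 0.095, "direct": 0.020},
}

def decompose(a, b, direct):
    """Return (indirect, total) effects rounded to three decimals."""
    indirect = a * b
    return round(indirect, 3), round(direct + indirect, 3)

results = {name: decompose(**p) for name, p in paths.items()}
```

The products and sums reproduce the reported indirect effects (0.029, −0.005, 0.005) and total effects (0.095, −0.049, 0.025) to three decimals. Statistical significance of the indirect effects still has to come from the bootstrap test, which this arithmetic does not replace.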
Besides, our results show that both types of SCTAs have effects on the session traffic or views, and these effects are largely mediated by consumer engagements. Specifically, 72% of the viewing effect of AffSCTA (0.005, p = .001) is mediated by lnEngagements and this proportion increases to 100% for CogSCTA (−0.005, p = .004). In other words, we find that direct viewing effects of affective and cognitive SCTAs are relatively low or even zero. This is as expected because consumers who have not viewed an ELS session are unlikely to be affected by the broadcaster's SCTAs in that session, and thus their viewing behaviors should not be directly affected by SCTAs. For the positive direct effect of affective SCTAs on lnNumViews, a possible explanation is that broadcasters’ SCTAs could affect a session's existing viewers and their subsequent viewing decisions (e.g., they may exit the session and join it again). This implies affective SCTAs can directly enhance subsequent viewing behaviors of existing viewers for the same ELS session while CogSCTA has no such direct effect.
In addition to the joint estimation, we separately estimate equations (2a)–(2d) using 2SLS regressions. The single-equation estimation approach can alleviate the potential concern about model misspecifications (Lu et al., 2023). We present the results of the single-equation estimation in Table A7-2 in the Online Appendix A7, which are consistent with the joint estimation results in terms of the coefficients’ sign and significance level. Interestingly, from both sets of the broadcaster-session level results, we do find a negative purchase effect of cognitive SCTAs, which is found to be statistically significant only for hedonic products in our main analysis.
Following Lin et al. (2023), we conduct sensitivity analysis to check the robustness of the mediation-effect model results. To invalidate our inferences of mediation effects, 45% to 93% of our sample observations would have to be replaced with null hypothesis cases, which are higher than the benchmark figures of around 30% in Frank et al. (2013). Nevertheless, we acknowledge that our observational data do not allow us to make causal claims about the estimated mediation effects (Peng, 2023).
We employ alternative identification approaches to account for other potential sources of endogeneity bias.
First, to control for the sample selection bias caused by the product-level data construction with product annotations, we adopt a Heckman two-step model. The results of the Heckman model estimation are shown in Tables A8-1 and A8-2 in the Online Appendix A8. The inverse Mills ratio (IMR) coefficients are significant in the purchase outcome models, implying that selection in broadcasters’ product annotation is likely present. After accounting for sample selection bias, we find the focal coefficients are consistent with those in our main results.
Second, to account for the bias in the case of selection-on-observables, we rely on the treatment effect analysis with propensity score matching (PSM) (see details in the Online Appendix A9). We generate two broadcaster-level treatment variables (TreatCogSCTAi and TreatAffSCTAi) based on their frequency of utilizing cognitive and affective SCTAs (FreqCogSCTAi and FreqAffSCTAi), which are defined using the median-split method. 19 To balance the distribution of observed covariates between the treated and control broadcasters, we employ the PSM method. In addition to the broadcaster-level treatment effect analysis, we also perform the session-level and product-level analyses. The OLS estimation results in Table A9-2 of the Online Appendix that use the matched data of different levels reveal the same qualitative insights as those from our main results.
Third, to alleviate the bias in the case of selection-on-unobservables, we adopt the double machine learning (DML) framework (Chernozhukov et al., 2018) which enables us to control the unknown and potential non-linearities in the modeled linear relationships with machine learning methods (see details in the Online Appendix A10). We specify a partially linear instrumental variable model to estimate the effects of cognitive and affective SCTAs on consumer purchases. The results reported in Table A10-1 of the Online Appendix are highly consistent with those of our endogeneity-corrected mixed-effects model estimations.
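The median-split definition of the broadcaster-level treatment variables used in the PSM analysis above can be sketched as follows. The frequencies are hypothetical, and assigning broadcasters exactly at the median to the control group is an assumption; the paper's footnote and Online Appendix A9 give the actual rule.

```python
import statistics

def median_split(freq_by_broadcaster):
    """Code a broadcaster as treated (1) if its SCTA frequency is above the
    sample median, and as control (0) otherwise."""
    cutoff = statistics.median(freq_by_broadcaster.values())
    return {b: int(f > cutoff) for b, f in freq_by_broadcaster.items()}

# Hypothetical broadcaster-level frequencies of affective SCTAs.
freq_aff = {"b1": 0.5, "b2": 1.2, "b3": 2.0, "b4": 3.4, "b5": 4.1}
treat_aff = median_split(freq_aff)
```

The resulting binary indicator is what a propensity score model would then predict from observed broadcaster covariates before matching.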
Robustness Checks
We conduct several checks to ascertain the robustness of our findings.
First, we conduct a series of tests to check the robustness and sensitivity of results based on IV estimations. Specifically, we test if our results hold when using alternative sets of competitors (i.e., top 7% and 5% nearest neighbors based on the similarity of broadcasters’ spoken content) to construct the IVs. We then evaluate the sensitivity of IVs to violations of the exclusion restriction with the “plausibly exogenous” method (Conley et al., 2012). Additionally, we use the Robustness of Inference to Replacement (RIR) approach proposed by Frank et al. (2013, 2023) to quantify the robustness of drawing a causal inference from our IV estimations. Taken together, the results of these analyses detailed in Online Appendix A11 suggest that our IV estimation results are robust to alternative IVs, violations of the exclusion restriction of IVs, and other potential endogeneity biases.
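The selection of nearest-neighbor competitors can be sketched as follows. This is a hypothetical illustration: the content vectors, the use of cosine similarity, and the averaging of competitors' SCTA frequency into a single instrument value are assumptions, since the exact construction is detailed in the Online Appendix rather than here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def competitor_iv(focal, content_vecs, scta_freq, top_share=0.07):
    """Average SCTA frequency over the focal broadcaster's most similar peers."""
    others = [b for b in content_vecs if b != focal]
    # Number of peers in the top share, with at least one peer retained.
    k = max(1, math.ceil(top_share * len(others)))
    ranked = sorted(
        others,
        key=lambda b: cosine(content_vecs[focal], content_vecs[b]),
        reverse=True,
    )
    peers = ranked[:k]
    return sum(scta_freq[b] for b in peers) / len(peers), peers

# Hypothetical spoken-content vectors and SCTA frequencies (illustration only).
content_vecs = {"b1": [1, 0, 1], "b2": [1, 0, 0.9], "b3": [0, 1, 0], "b4": [0.5, 0.5, 0.5]}
scta_freq = {"b1": 1.2, "b2": 0.8, "b3": 2.0, "b4": 1.5}

iv7, peers7 = competitor_iv("b1", content_vecs, scta_freq, top_share=0.07)
iv50, peers50 = competitor_iv("b1", content_vecs, scta_freq, top_share=0.5)
```

Changing `top_share` (e.g., from 0.07 to 0.05) changes the competitor set, which is the kind of variation this robustness check examines.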
Second, we examine whether our focal variables are robust to alternative operationalizations. We use the number of unique buyers of a product and the number of units sold of a product as two alternative dependent variables. Furthermore, we measure the focal independent variables based on alternative lexicons expanded without using the bigram technique or with a lower similarity threshold (i.e., 0.5). The results shown in Online Appendix A12 are largely consistent with our main results, implying our findings are robust to alternative focal measures.
Third, we check the robustness of our findings to alternative model specifications by estimating the fixed-effects panel model using the 2SLS estimation approach as shown in the Online Appendix A13. The results in Columns (3) to (6) of Table A13-1 in the Online Appendix present the same insights as those from our main results. The results obtained from the panel models with both broadcaster and session fixed effects are largely consistent with those from our main model in terms of the coefficients’ sign, magnitude, and significance level.
Discussions and Conclusion
Our study investigates the relationship between ELS broadcasters’ SCTAs and consumer purchases, as well as its heterogeneity across two contingent factors, and provides several insightful findings. First, we quantify the differential purchase effects of cognitive and affective SCTAs in ELS. Specifically, we uncover a positive purchase effect of affective SCTAs in ELS: a one-unit increase in the weighted count of affective SCTA-related words conveyed by broadcasters per minute is associated with a 0.43% increase in the purchase orders of a product, which is equivalent to an average demand growth of 1.10% in purchase orders per minute. However, we obtain an insignificant main effect of cognitive SCTAs on product-level purchase outcomes, which corroborates the notion that the purchase effects of cognitive SCTAs are nullified by competing mechanisms of consumer reciprocity and perceived effort and costs.
Second, we document the differential moderating roles of hedonic (relative to utilitarian) products on the purchase effects of broadcasters’ cognitive and affective SCTAs in ELS. Notably, a negative purchase effect of cognitive SCTAs is found only for hedonic goods, implying that broadcasters’ cognitive SCTAs that stimulate chats and shares in ELS could inhibit purchases of hedonic goods. This is because the contradiction between the affective attributes of hedonic products (Schulze et al., 2014) and the cognitive processing driven by cognitive SCTAs could lead to cognitive dissonance, i.e., a perception of psychological discomfort (Harmon-Jones and Mills, 2019). Such dissonance can in turn heighten viewers’ perceived effort and costs elicited by broadcasters’ cognitive SCTAs, resulting in a more evident negative purchase effect of cognitive SCTAs for hedonic products. Furthermore, we find that broadcasters’ affective SCTAs that exhort consumers to like and follow are more effective in driving purchases of hedonic goods (relative to utilitarian goods). This is because the affective cues in hedonic goods could further activate the affective processing of these cues (Klein and Melnyk, 2016) and hence lead consumers to take more reciprocal actions.
Third, the broadcaster type attribute is found to moderate the purchase effects of cognitive SCTAs, but not those of affective SCTAs. Specifically, cognitive SCTAs dampen sales for influencer broadcasters. This could be explained by the consumers’ difficulty in getting attention in influencer broadcasters’ ELS sessions. Influencer broadcasters normally have large audience sizes and may inevitably encounter crowded or overwhelming requests for social or selling interactions with their viewers in the ELS sessions (Leung et al., 2022). Therefore, consumers are likely to experience a higher level of effort required to grab the attention of influencer broadcasters (Pratto and John, 1991) and to stay engaged with them, which further increases consumers’ perceived effort and costs to respond to broadcasters’ cognitive SCTAs. On the other hand, we do not find a moderating effect of broadcaster type on the purchase effects of affective SCTAs. One plausible explanation might be that such a moderation effect of influencer broadcasters may only be salient or significant for fan viewers who are likely to have a higher level of affection or affinity to these influencers (unlike the case for a general mix of both fan and non-fan viewers in our current data and model).
Our research offers theoretical contributions in several ways. First, we contribute to the OM literature by exploring the operational decisions of broadcasters in ELS. Though existing OM research on ELS has mainly examined the operational decisions and strategies of platforms, manufacturers and retailers in ELS, research that institutionalizes the broadcasters’ operations in a company's marketing communications is rare. Thus, by objectively measuring broadcasters’ SCTAs from ELS speech-to-text data, our study is among the first to take the operational perspective of broadcasters and demonstrate the actual economic values of broadcasters’ SCTAs in the synchronous and interactive social and selling environment of ELS, which have not been studied in prior OM research on ELS (Lin et al., 2024; Pan et al., 2022).
Second, our novel findings on the consumer purchase effects of broadcasters’ cognitive and affective SCTAs in ELS enrich the existing research on sales drivers in online selling which ignores the intricate mechanisms empowered by different social technology features (Song et al., 2022; Wang et al., 2022). By objectively measuring broadcasters’ different types of SCTAs in terms of their ELS spoken narratives and empirically examining their relationships with consumer purchase outcomes, we also recognize and verify the differential underlying mechanisms of cognitive and affective SCTAs in driving sales, thus contributing to extant research on online sales drivers (Sun et al., 2021), television shopping (Stephens et al., 1996) and call-to-actions (Huang et al., 2021; Jung et al., 2020).
Third, we make contributions to the marketing communications literature that studies product-related contingency factors to identify the boundary conditions for the effectiveness of marketing communications (Gopinath et al., 2014; Kopalle et al., 2017). By quantifying the moderating effects of the product-type classification of hedonic vs. utilitarian products, our study establishes new empirical support for the theoretical dialogue on the contingent role of product-level factors in an ELS setting.
Fourth, we extend research on influencer marketing (Leung et al., 2022) by examining the moderating role of influencer broadcaster type on the purchase effects of broadcasters’ social technology-enabled operations in terms of SCTAs in ELS. Importantly, we document the heterogeneity in purchase effects due to store and influencer broadcasters who execute SCTAs on the same e-commerce platform but could differently implement their social operations in terms of cognitive and affective SCTAs, so as to reap increased product sales.
Our findings offer several managerial implications for practitioners conducting live streaming selling. First, our research informs businesses that hire broadcasters to promote their products. Specifically, our findings offer instrumental guidelines for broadcasters on employing a range of social operations that appeal to consumers and eventually lead to sales conversion. For instance, to stimulate social engagement actions and purchases, ELS broadcasters can employ affective SCTAs that call for consumer actions associated with rapid and crude affective processing (e.g., tapping on the “like” button). Moreover, our results suggest that ELS broadcasters should bear in mind that SCTAs delivered through verbal communications do not always result in positive sales outcomes. For example, ELS broadcasters should use cognitive SCTAs (e.g., calls for chatting or calls for sharing) cautiously because they might induce negative consumer purchase effects under some contingent factors.
Second, ELS broadcasters should tailor their cognitive and affective SCTAs to the product and broadcaster types to boost or optimize product sales. For instance, when broadcasters sell hedonic products in ELS, they should adopt more affective SCTAs. However, they should be cautious about using cognitive SCTAs when they sell hedonic products. More generally, influencer broadcasters, as well as stores or brands that consider collaborating with them, should refrain from overly encouraging consumers to engage in high-cognitive-effort activities (e.g., chatting with broadcasters or sharing an ELS session). Doing so would help alleviate the perceived effort and costs that consumers incur when attention is crowded out in ELS sessions, and the resulting low product sales.
Third, the ecosystem of Multi-Channel Network (MCN) agencies incubating key opinion leaders (KOLs; e.g., broadcasters) continues to thrive, partially because of the penetration of live streaming. Based on our empirical findings, we recommend that MCN agencies and ELS platforms provide adequate training and guidance to prospective broadcasters and sellers to improve their ELS video or broadcasting production design, social engagement features, and social operation skills for SCTAs.
Finally, we acknowledge a few limitations that can be addressed by future work. First, our empirical analysis results may suffer from measurement issues due to the inherent limitations of speech recognition and text-mining techniques used. Future research may adopt more advanced techniques to operationalize the social and selling operations from ELS broadcasters’ narratives. Second, we were unable to test the full S-O-R model since we were not able to capture consumers’ internal states or decision-making processes in observational data collected from a real-world context. Thus, future research can consider examining the psychological mechanisms underlying the relationship between cognitive and affective SCTAs and consumers’ perceptions of ELS by using a controlled laboratory experimentation approach. Third, we did not exhaust all the possible boundary conditions of the purchase effects of cognitive and affective SCTAs but only focused on selective product-level and broadcaster-level contingency factors based on theoretical relevance. Hence, future research can explore the moderating roles of other product, retailer, and broadcaster factors. Last, our research findings have been established in the ELS context of China. Future research can explore the contextual differences between ELS in China and that in other countries to examine whether and how our findings can be generalized.
Supplemental Material
Supplemental material, sj-pdf-1-pao-10.1177_10591478241276131 for Can Social Technologies Drive Purchases in E-Commerce Live Streaming? An Empirical Study of Broadcasters’ Cognitive and Affective Social Call-to-Actions by Yutong Guo, Ying Zhang, Khim-Yong Goh and Xixian Peng in Production and Operations Management
Footnotes
Acknowledgment
The authors thank the special issue editors and reviewers for their constructive suggestions and insightful comments in the review process, as well as the workshop and conference participants at the China Summer Workshop on Information Management (CSWIM) and the International Conference on Information Systems (ICIS) for their valuable comments and suggestions.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Singapore Ministry of Education, Academic Research Fund Tier 1, Project Grant Number T1-251RES2034 and the National Natural Science Foundation of China, Project Grant Number 72002193.
Notes
How to cite this article
Guo Y, Zhang Y, Goh K-Y and Peng X (2024) Can Social Technologies Drive Purchases in E-Commerce Live Streaming? An Empirical Study of Broadcasters’ Cognitive and Affective Social Call-to-Actions. Production and Operations Management 34(12): 4039–4059.
References
