Abstract
This research introduces behavioral labeling as the use of names or tags that reflect an associated activity, and it proposes that this can induce corresponding behavior. Contrary to the common intuition that descriptions of behaviors emerge as markings for popular actions (i.e., the label is a consequence of the behavior), the authors propose that a description itself might also induce the corresponding action (i.e., the label is an antecedent of the behavior). Building on linguistic relativity theory and based on five studies conducted in the lab and field, the authors show that merely attaching a fictitious name to a behavior can induce that very behavior. The authors also explore a potential explanation for this finding by showing that a behavioral label can evoke mental imagery regarding the associated behavior, which enhances the implementation of the behavior. The results contribute to marketing theory by introducing behavioral labeling and highlighting how language can shape behaviors. Marketers can use behavioral labels to promote their offerings based on the associated behaviors, while public policy makers can use behavioral labels to encourage prosocial and proenvironmental behaviors.
According to the Global Language Monitor (2020), a new English word is created every 98 minutes, amounting to around 15 words per day. Many of these new words describe behaviors (Liu and Liu 2014; Nam and Kannan 2014). For example, the Oxford English Dictionary includes terms like “selfie” (i.e., taking a self-portrait with a smartphone, often shared via social media), “twerking” (i.e., moving or dancing with a twitching or jerking hip motion), “staycation” (i.e., spending a vacation at home instead of traveling), and “glamping” (i.e., glamorous camping). Although many of these new words may stem from popular behaviors that precede and give rise to the corresponding words, this need not be the case. As we explore in this research, the use of a name or tag to reflect an associated activity—a phenomenon we refer to as “behavioral labeling”—can also induce the behavior.
Marketers, as well as public policy makers, are often concerned about how to encourage people to engage in certain behaviors. For example, businesses rely on positive online reviews and are looking for ways to encourage their customers to review them positively online. Likewise, public policy makers are seeking ways to encourage people to behave responsibly, such as saving energy, reducing plastic waste, or driving safely. In fact, government agencies use activity tags. For example, “Bob” is a word the Belgian Road Safety Institute invented in 1995 to describe a designated nondrinking driver. Bob was portrayed as the hero of the evening, staying sober and driving their friends home safely after a night out. Anecdotal evidence suggests the campaign was quite effective in changing consumers’ attitudes and reducing drunk driving. The European Commission funded the campaign in other European Union countries, and Bob is cited as a best practice in European transport policy (Slootmans 2019). The word “Bob” has been added to the Dutch and Flemish dictionaries, and the verb “bobben” (or “to bob” in English) describes the act of appointing someone or volunteering as a designated sober driver.
Prior research has approached the phenomenon of the emergence of new words mainly by investigating the antecedents of the diffusion of new words and content. For example, research has focused on understanding semantics within social networks (Berger and Packard 2023; Berger et al. 2020; Nam, Joshi, and Kannan 2017; Ordenes et al. 2017) or uncovering factors that increase the contagiousness of social media postings or recommendations (Gai and Klesse 2019; Ludwig et al. 2013; Packard and Berger 2017; Schellekens, Verlegh, and Smidts 2010).
In contrast, we introduce behavioral labeling and investigate the question of whether the mere presence of a name or tag can affect the adoption of the described behavior. Consider the term “plogging” (a merger of the Swedish verbs “plocka upp” [pick up] and “jogga” [jog]), which reflects the activity of picking up trash while jogging to reduce litter in public spaces. Rather than assuming that a label or term results from the popularity of the corresponding behavior, we explore whether introducing the label contributes to inducing the behavior.
To further illustrate this point and how behavioral labeling is applied in practice (Van Heerde et al. 2021), we compared two newly launched but almost identical services, one of which introduced a behavioral label for the target behavior (for more details, see Web Appendix A). Based on Google Trends data, as illustrated in Figure 1, we approximate the effect of introducing the behavioral label on differences in service usage over time. We applied a difference-in-differences approach that captures cross-sectional and time-series effects (Murray 2005) and found a significant positive effect of the behavioral label launch on the between-group differences (F(1, 50) = 67.99, p < .001, η2 = .58). While this example is based on correlational data, the results illustrate the potential effectiveness of behavioral labeling for practice.
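The difference-in-differences logic can be sketched in a few lines. Everything below is a hypothetical illustration under our own assumptions: the function name `did_estimate` and the weekly search-interest values are invented for exposition and are not the Google Trends data or the F-test reported above.

```python
# Illustrative sketch of a difference-in-differences comparison between a
# labeled and an unlabeled service. The search-interest values are made up.
def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Change in the labeled service's search interest after the label
    launch, net of the change in the unlabeled (control) service."""
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(treat_post) - mean(treat_pre)) - (mean(ctrl_post) - mean(ctrl_pre))

treat_pre, treat_post = [10, 12, 11], [30, 34, 32]  # labeled service
ctrl_pre, ctrl_post = [11, 10, 12], [14, 13, 15]    # unlabeled control
print(did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post))
```

A positive estimate indicates that the labeled service's usage grew more after the label launch than the control service's usage over the same period, which is the between-group pattern the F-test above evaluates.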

Figure 1. Behavioral Labeling in an Advertising Context.
In this work, we develop new theorizing about behavioral labels, which we define as names or tags associated with an activity, or, in short, “activity tags.” We build on linguistic relativity theory, which asserts that language not only is an expression of thought but also channels the way people think and act (Kay and Kempton 1984). Attaching a label to a behavior may provide guidance about the corresponding action (Austin 1962) such that even arbitrary behavioral patterns can be induced by behavioral labels. We further propose that this effect might occur because a behavioral label induces mental images of the focal behavior due to the semantic specification of the behavior, which in turn increases individuals’ probability of performing the behavior (Bone and Ellen 1992; Millikan 2005).
With five studies conducted in the lab and field, we provide converging evidence for the effects of behavioral labeling by showing that (1) the use of a name or tag to reflect an associated activity can induce a corresponding behavior and (2) one potential explanation of this effect is related to the mental imagery of the behavior induced by the behavioral label, leading people to adopt the behavior.
By exploring behavioral labeling, we contribute to research on how marketing can extract value from language and linguistics (Berger et al. 2020; Gai and Klesse 2019; Hovy, Melumad, and Inman 2021; Ordenes et al. 2019; Zhang and Patrick 2021). Currently, this stream is largely represented by research that derives important insights from social media (Berger et al. 2020; Nam, Joshi, and Kannan 2017) or details the idiosyncrasies of national languages for global communication campaigns (Luna and Peracchio 2001; Noriega and Blair 2008; Wu et al. 2019). We take a complementary approach to the value of language and show that it might be deployed as a tool to induce desired behaviors more directly. Relatedly, linguistic studies have outlined the emergence and adoption of new words and behaviors (Bloom and Keil 2001; Vygotsky 1962). We suggest that this perspective on language is also relevant to marketing, as it can be leveraged to understand and encourage behavioral implementation. Moreover, we address calls in the literature to investigate the effects of labels that refer to intangible entities (Keller 2020; White, Habib, and Hardisty 2019). Specifically, we explore behavioral labels that refer to actions or behavioral sequences, unlike labels related to stable entities such as companies or products. Behavioral labels help connect different behavioral sequences, making them easier for the consumer to imagine as a specific behavior and thus more likely to be implemented by consumers. Finally, our research offers managerial implications for designing social campaigns to encourage behaviors and campaigns to promote products and services associated with specific behaviors.
Conceptual Background
Language and Behavior
The interplay between language and behavior is an important part of linguistic theory. At the heart of this phenomenon is the Sapir–Whorf hypothesis, which asserts that people's individual view of the world is shaped by their native language (Kay and Kempton 1984) such that language not only is an expression of thought but also influences cognition, which in turn channels the way people think and act. The related school of thought is referred to as linguistic relativity. The idea behind linguistic relativity theory and the Sapir–Whorf hypothesis is that language not only is influenced by cognition but also can influence cognition and behavior. More precisely, linguists distinguish two Sapir–Whorf hypotheses: a strong hypothesis, which proposes that cognition depends on language, and a weaker hypothesis, which contends that cognition is influenced by language (Penn 1972). The latter has become prevalent in academic inquiry (Samuel, Cole, and Eacott 2019).
Numerous studies across disciplines have suggested that language impacts how people think and act (Lucy 2016). For example, researchers have identified cross-linguistic differences in event memories (Boroditsky 2001) and time perceptions (Bergen and Chan Lau 2012) that appear to reflect the influence of the direction in which individuals write (e.g., from left to right in English, from top to bottom in Mandarin). It has even been shown that consumers are more likely to lie in a foreign language than in their native language (Gai and Puntoni 2021). Moreover, research in economics has indicated that languages featuring grammatical associations of the future and the present correlate with economies with high national savings rates and retirement assets (Chen 2013). Overall, the linguistic perspective suggests that words can offer guidance or direction to act out a certain behavior (Austin 1962; Ordenes et al. 2019).
The marketing literature also provides evidence that consumers adjust their behaviors in response to words that evoke certain images, such as brand and company labels (Zhang and Patrick 2021). Companies create a brand label (i.e., a name, sign, or symbol) and aim to induce an image in consumers’ minds by linking the label to marketing programs (Keller 1993, 2003). The resulting image leads to brand differentiation that goes beyond the objective features of products or services (Datta, Ailawadi, and Van Heerde 2017). Consumers form emotional ties with brands and strongly identify with them (Batra, Ahuvia, and Bagozzi 2012). Consequently, managers can enhance the business performance of their products and services by building and maintaining strong brands as brand equity increases sales and customer engagement (e.g., Keller and Lehmann 2006). Further, these images and brand labels can even take on nearly religious importance for consumers (Shachar et al. 2011) and shape consumers’ preferences and choices (Keller 2020). Importantly, although these previous studies show that consumers adjust their behavioral responses to words denoting brand or company labels connected with certain product attributes (e.g., product's size, quality), products and services still exist as defined entities independent from the labels attached to them.
In contrast, we argue that a behavioral label may induce a certain behavior by specifying an arrangement or order of actions. That is, while these actions would exist as separate entities without the behavioral label, the label unites them in a connected and graspable manner, making them easier to imagine and comprehend. In this way, behavioral labels unite initially disconnected entities (i.e., separate sequences of actions), unlike labels related to entities that are already defined, such as companies or products (Keller 2020). In other words, behavioral labels encompass behavioral sequences and consequently provide a semantic frame for the corresponding behavior. As a result, behavioral labels may evoke clear images of behaviors due to their semantic specification of these behaviors, as we elaborate on in the next section.
Therefore, in line with findings from linguistic relativity theory and the power of brand labels, we propose that behavioral labels can induce corresponding actions. Specifically, we expect that consumers exposed to a behavioral label are more likely to implement the associated behavior than consumers not exposed to the behavioral label.
Behavioral Labels and Mental Imagery
In line with our conceptualization of behavioral labels, words can create mental representations of abstract concepts and thus make these concepts more accessible to individuals and shape perceptions (Mkrtychian et al. 2019). In particular, research has shown that a word label enables the creation of a mental link between the label and the labeled entity (Althaus and Plunkett 2016). For instance, arbitrary verbal labels can enhance individuals’ ability to learn different types of smells by improving odor-categorization accuracy (Vanek, Sóskuthy, and Majid 2021), and certain words can shape color perceptions (Forder and Lupyan 2019). Similarly, labels have been shown to guide children's visual attention and improve categorization in a memory card game (Barnhart, Rivera, and Robinson 2018). Overall, previous studies demonstrate that word labels can create mental links or anchors, which shape the relationship between exposure to certain stimuli and individuals’ reactions to the stimuli.
These kinds of mental anchors facilitate internally generated representations of objects. One concept that speaks to this account is mental imagery, a form of cognition that enhances information interpretation and processing through the construction of “mental pictures” (Kosslyn 1994). Mental imagery enables consumers to see an object in a new light (Palmiero, Cardi, and Belardinelli 2011). Unsurprisingly, mental imagery has long been studied in the design literature (e.g., Dahl, Chattopadhyay, and Gorn 1999). Similarly, in marketing, mental imagery has been shown to enhance consumers’ attitudes toward products or services (MacInnis and Price 1987; Roggeveen et al. 2015) by enabling them to mentally simulate product usage (Hoeffler 2003; Zhao, Hoeffler, and Dahl 2009).
More central to our conceptual reasoning, mental imagery is also likely to impact behaviors. Specifically, mental imagery produces quasi-pictorial representations that facilitate the generation, interpretation, and enactment of information through spatial representations (Dahl, Chattopadhyay, and Gorn 1999; Pearson 2019). This process leads to two notable outcomes. First, it enables the generation of new associations and responses to stimuli, which may replace previous associations and habits evoked by the same stimuli (Lang 1979). Second, mental imagery allows consumers to visualize themselves performing behaviors. Consistently, research in neuroscience has shown that mental imagery manifests both visually and kinesthetically in humans (Parsons 1994). We therefore expect that mental imagery provides a gateway to understanding behavioral labeling by providing a plausible explanation for its effect on subsequent behavior: a behavioral label evokes mental imagery regarding the corresponding behavior, which increases implementation of the behavior.
Overview of Studies
We explore the effects of behavioral labeling across five studies measuring actual behaviors. In Study 1, we examine the proposed effect on behavior in a controlled laboratory setting. In Study 2, we replicate the effect with a preregistered study. While Studies 1 and 2 investigate behavioral labeling to encourage new behaviors, Study 3 examines behavioral labeling to discourage behavior in a field study. Study 4 examines behavioral labeling in another field study, measuring actual behavior in an online context. The results suggest mental imagery as a potential explanation for the effect, while controlling for popularity as a potential alternative explanation. Building on these results, Study 5 further explores mental imagery as a potential explanation for the proposed effect of behavioral labeling by showing that inducing mental imagery in a no-label condition can create an effect similar to behavioral labeling.
The procedures for these studies were approved by ethics committees. We report two-sided significance values for all studies. Because the effects related to behavioral labeling are novel findings and were tested in various contexts and with different labels, we did not have clear expectations about effect sizes with which to conduct a priori power analyses across the studies (Scheel et al. 2021), which would have been the preferred option for determining sample sizes (Lakens 2022). In addition, we were interested in observing and measuring actual behavior in our studies, so we faced resource constraints when recruiting participants. More specifically, the main resource constraints were as follows: In Study 1, we had budget constraints on the number of participants we could recruit. In Study 3, we were constrained by the time window in which we were allowed to set up the booth and recruit participants. In Study 4, we were only able to access participants enrolled in a specific course during a specific semester for which the respective university’s ethics committee had granted us permission to collect data. Considering these resource limitations, we relied on common practices as documented in previous studies (Krefeld-Schwalb and Scheibehenne 2023) and normative guidelines for sample sizes (i.e., cell sizes of about or at least 30, 50, or 100; e.g., Green [1991] suggests collecting about 50 participants per cell), and we aimed to exceed them when feasible. In the online studies where we were able to recruit participants through accessible panels (e.g., Prolific), we collected larger samples, which is common in online studies (Krefeld-Schwalb and Scheibehenne 2023); that is, we recruited at least 100 participants per experimental condition in Study 5. In addition, in Study 2, we conducted a pilot study to obtain an expected effect size as input for an a priori power analysis that then determined the sample size of Study 2 (Lakens 2022).
However, due to budget constraints, we only used this approach in Study 2. We thus acknowledge that the sample size justifications for most of our studies are based on heuristics and resource constraints (Lakens 2022).
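The a priori power analysis used to size Study 2 follows standard logic: given an expected effect size from a pilot, solve for the per-cell sample size that achieves the desired power. The sketch below is purely illustrative; the function name `n_per_cell`, the normal approximation to the two-sample t-test, and the example effect sizes are our assumptions, not values reported in this article.

```python
from math import ceil
from statistics import NormalDist  # standard library, Python 3.8+

def n_per_cell(d, alpha=.05, power=.80):
    """Approximate per-cell sample size for a two-sided, two-sample test
    detecting standardized effect size d (normal approximation)."""
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(power)) / d) ** 2)

# Smaller expected effects require markedly larger cells
print(n_per_cell(.5))  # medium effect
print(n_per_cell(.2))  # small effect
```

This illustrates why, absent a pilot-based effect size estimate, the studies instead fell back on heuristic cell sizes: the required n is highly sensitive to the assumed d.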
We provide p-curve analyses along with a p-curve disclosure table (Simonsohn, Nelson, and Simmons 2014) and post hoc power calculations in Web Appendix I. Data and codes for all studies can be accessed at https://osf.io/4qwsu/?view_only=1dbf66c8b35542019383cfbd2c4dbd65.
Study 1: Behavioral Labeling in a Laboratory Setting
In Study 1, we test the effect of behavioral labeling on an actual behavior using a one-factor between-subjects design with two levels (no label vs. label) in a controlled laboratory setting. Specifically, we test whether exposure to the made-up label “fancing” (i.e., finger dancing) enhances participants’ finger movements.
Participants and Procedure
We recruited 111 students on the campus of a large Western European university but could not record the responses of five participants who had technical issues during the study, leading to a sample of N = 106 (Mage = 22.62 years, SDage = 3.86; 52.8% female, 47.2% male). All participants received monetary compensation for their participation. When debriefed, participants indicated no suspicion of the experimental manipulation.
The behavior of interest was tapping fingers to a musical beat. We invited one participant at a time to the lab. We used a cover story to inform participants that the topic of study was individual reactions to music and to explain that they had to wear headphones and a glove (to control for the sensory stimulation) while they answered a survey on a computer screen.
For this study, we used the made-up label “fancing” (i.e., finger dancing), which refers to the behavior of moving one's fingers to music. Therefore, we captured actual finger movements across all fingers as the dependent variable. We use an aggregate across five fingers because we reason that people differ in not only which fingers they use to tap to a beat but also the number of fingers they use to tap to a beat (most people use two fingers). Capturing the movements across all five fingers accounts for this heterogeneity and reveals the extent to which participants adopted the tapping behavior (irrespective of which fingers were used). Further, focusing on the frequency of taps also enables us to separate participants’ deliberate behavior in response to the behavioral label from general hand movements that are unrelated to the behavior in question, such as glove adjustments, hand stretching, and general horizontal hand movements on the pad.
Specifically, we measured the frequency of finger taps to the beat of music (i.e., finger taps per second) with a device attached to the glove participants were wearing. This technology has been used in neuroscience research to investigate manual dexterity after stroke (Nowak et al. 2007). In employing this technology, we followed an established approach from this domain to sensitively measure finger movements. Specifically, the glove contained small ultrasound-emitting markers to measure finger movements with an ultrasonic motion measurement system that uses microphones to calculate the three-dimensional spatial positions of these markers (CMS 20S, Zebris, Isny). The markers were attached to all five fingers of the glove. We used two ultrasonic sensor plates so the measurement sensor did not appear obtrusive to participants. Movement data were recorded by a second laptop that participants could not see. We subsequently matched participants’ responses to the questionnaire with the movement data using an identifier number that each participant was assigned before starting the experiment. Web Appendix B depicts the experimental setting.
The questionnaire was administered on a computer and started by collecting participants’ demographic information and presenting a filler task. First, we asked about participants’ age, gender, and handedness (right-handed, left-handed, or no preference). For the filler task, we played two songs in sequence (Wham!'s “Last Christmas” and Pharrell Williams's “Happy”) and asked for participants’ first associations when listening. To assess these associations, we displayed three pictures per song (“Last Christmas”: landscape covered with snow, Christmas market at night, wrapped Christmas gifts; “Happy”: a crowd of happy people, a hat, a picture of hands together), and participants had to choose the picture that first caught their eye when listening to each song. We also asked them to rate both songs (“Please indicate to what extent you liked the song” [1 = “not at all,” and 7 = “completely”]), and we used the song they liked best in the next part of the experiment.
After this task, participants were randomly assigned to one of two conditions (label vs. no label), which had similar amounts of text. The label condition read, “Moving your fingers to music has been recently established as fancing (i.e., finger dancing). Fancing means moving your fingertips to match the beat of music. Fancing can have positive effects for relieving stress and enhancing concentration.” The no-label condition read, “Music is part of people's everyday lives, and it is natural that they react to it in different ways. One example is moving your fingertips to match the beat of music. Moving fingers to music can have positive effects for relieving stress and enhancing concentration.” The questionnaire then continued with a note: “On the next page, we will play some music again. You can try it [no-label condition]/You can try fancing [label condition] a bit if you like using your right hand. To do this, place your right hand on the pad next to you such that you do not have your hand in sight. If you do not feel like it (anymore), just click on to the next page.” All participants heard the song they liked better according to their responses to the filler task. Note that the instructions suggested that participants try out the behavior if they chose to engage in it using a black interaction pad next to them. The pad helped us ensure that participants put their gloved hand equipped with the emitting markers in reach of the active ultrasonic sensor. Finally, as a manipulation check, we asked all participants if they could recall a specific word describing the behavior of moving one's fingertips to match the beat of music (yes/no). If they indicated “yes,” they were asked to provide the word. Based on these two questions, we coded if participants indicated the behavioral label “fancing” or not (1 = yes, 0 = no).
Results
Manipulation check
A chi-square test confirmed the effectiveness of our manipulation (χ2(1) = 44.69, p < .001, Cramer's V = .649). In the label condition, 34 participants (61%) recalled “fancing” correctly. In the no-label condition, no participant mentioned “fancing.”
In addition, we tested the stimulus material with a separate online sample obtained from Clickworker to ensure that participants would recognize the behavioral label as a specific expression for the behavior while reading the text. Ninety-nine participants (Mage = 38.27 years, SDage = 11.95; 36.4% female, 63.6% male) were randomly presented with the no-label or label manipulation text and then asked to indicate to what extent they think the text contained a specific expression for the described activity (1 = “definitely no,” and 7 = “definitely yes”). An independent samples t-test showed that participants in the label condition more strongly associated a specific word with the described activity than participants in the no-label condition, confirming the overall effectiveness of the manipulation between the two conditions (t(97) = 9.310, p < .001, d = 1.83; Mlabel = 6.86, SDlabel = .54, nlabel = 50; Mno label = 3.41, SDno label = 2.54, nno label = 49).
Behavioral labeling
We tested the proposed effect using an analysis of variance (ANOVA) with condition (no label vs. label) as the independent variable and finger taps per second per participant summed across all five fingers as the dependent variable. By measuring taps per participant, this dependent variable essentially captures to what extent participants implemented the behavior. As Figure 2 illustrates, the results indicate the expected pattern (see also Table 1, Panel A). Specifically, participants exposed to the behavioral label implemented the behavior more intensively (Mlabel = 7.14, SDlabel = 3.34, nlabel = 56) than those in the no-label condition (Mno label = 5.29, SDno label = 3.38, nno label = 50; F(1, 104) = 8.02, p = .006, η2 = .07).

Figure 2. Results of Study 1 (“Fancing”).
Table 1. Main Analysis and Robustness Checks for Study 1.
As a first robustness check, we calculated taps from the two most active tapping fingers in the sample because we reasoned that most people use at least two fingers to tap to a beat (a pattern that we also found in our data). An ANOVA with condition as the independent variable and finger taps per second per participant summed across the two most active fingers as the dependent variable again showed a significant effect (F(1, 104) = 4.15, p = .044, η2 = .04; Mlabel = 4.52, SDlabel = 2.65; Mno label = 3.51, SDno label = 2.46), confirming the results obtained in the main analysis (Table 1, Panel B).
When designing the experiment, we ensured that both songs featured relatively fast beats per minute (“Last Christmas” 108 bpm, “Happy” 160 bpm [obtained from www.songbpm.com]) such that matching the exact beat of the song would result in similar finger taps per second summed across all five fingers (9 for “Last Christmas” and 13 for “Happy”). However, taking into account the remaining difference in song speed in beats per minute, as a second robustness check, we performed an analysis of covariance (ANCOVA) to test the effect of behavioral labeling while controlling for song choice. Whether participants preferred “Last Christmas” or “Happy” and listened to the corresponding song while potentially acting out the behavior did not change the results (Table 1, Panel C). A chi-square test revealed no significant group difference between the label (nlabel(happy) = 29, nlabel(lastchristmas) = 27) and no-label (nno label(happy) = 30, nno label(lastchristmas) = 20) conditions regarding song choice (χ2(1) = .722, p = .395, Cramer's V = .083).
Finally, as a third robustness check, we included participants’ handedness (84.0% right-handed, 12.3% left-handed, 3.8% no preference) as a covariate in an ANCOVA. Including this covariate did not change the results (Table 1, Panel D). We observed no difference across conditions regarding participants’ handedness (χ2(2) = 1.021, p = .600, Cramer's V = .098; Fisher's exact test, p = .698; nlabel(right) = 47, nno label(right) = 42, nlabel(left) = 6, nno label(left) = 7; nlabel(nopreference) = 3, nno label(nopreference) = 1).
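In the two-group case, the one-factor ANOVAs reported above reduce to a simple F ratio (the square of the independent-samples t statistic) with an accompanying η². The sketch below is illustrative only; the function name and the taps-per-second values are invented, not our data.

```python
def one_way_f(group_a, group_b):
    """F statistic and eta-squared for a one-factor, two-level
    between-subjects design (F equals the squared two-sample t)."""
    n_a, n_b = len(group_a), len(group_b)
    mean = lambda xs: sum(xs) / len(xs)
    m_a, m_b = mean(group_a), mean(group_b)
    m = mean(group_a + group_b)  # grand mean
    ss_between = n_a * (m_a - m) ** 2 + n_b * (m_b - m) ** 2
    ss_within = (sum((x - m_a) ** 2 for x in group_a)
                 + sum((x - m_b) ** 2 for x in group_b))
    df_within = n_a + n_b - 2  # df_between = 1 for two groups
    f = ss_between / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)
    return f, eta_sq

# Hypothetical taps-per-second values for a label and a no-label group
f, eta_sq = one_way_f([7.1, 6.8, 7.5, 7.0], [5.2, 5.6, 5.1, 5.3])
```

The ANCOVA robustness checks follow the same logic with song choice or handedness added as covariates; the sketch omits that step.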
Discussion
Study 1 provides initial evidence for the proposed effect of behavioral labeling. Measuring a real behavior, this study demonstrates that a behavioral label can induce a new behavior in a controlled setting. Based on these results, we test behavioral labeling in a preregistered study with a more managerially relevant behavioral label in Study 2.
Study 2: Behavioral Labeling in an Online Review Setting
In Study 2, we apply behavioral labeling to the managerially relevant context of online reviews using a one-factor between-subjects design with two levels (no label vs. label). Specifically, we test if behavioral labeling can be used to encourage consumers to post more positive online reviews.
Businesses depend on favorable online reviews and online word of mouth, actively seeking ways to encourage their customers to review them favorably online (Chintagunta, Gopinath, and Venkataraman 2010; Li and Hitt 2008). For example, many businesses use stickers or banners that state “Review us on Google” or “Leave us a review on Google” in their local stores to enhance online reviews. Thus, we invented the behavioral label “hypeviewing” (i.e., hyping a business with a review) and predicted that participants exposed to the behavioral label (vs. no label) would be more likely to implement the behavior, measured as positivity of a written review. For details on the preregistration of this study, which is based on a pilot study reported in Web Appendix C, see https://aspredicted.org/gc59i.pdf.
Participants and Procedure
In total, we recruited 1,500 U.S. participants (nlabel = 750; nno label = 750) from Prolific. We determined the sample size based on a power analysis. In line with the preregistration plan, we excluded seven participants because they left nonsense or filler text answers in their written reviews, leading to a final sample of N = 1,493 (Mage = 37.73 years, SDage = 12.75; 48.8% female, 49.1% male, 2.1% preferred not to declare; nlabel = 746; nno label = 747). Web Appendix D provides the written texts of the excluded participants.
Participants were informed that the study was about restaurants and that they would be asked to imagine a specific situation and complete a writing task. We told them to read the instructions carefully and forced them to spend at least 10 seconds on the page containing the scenario and manipulation. All participants read the following scenario: “Imagine you are on a city trip and you have just visited a newly opened café. You had two coffees and a sandwich, that you really liked. However, the place was busy and you also had to wait quite a long time to be served.” Note that we intentionally included one negative and one positive aspect of the café experience as part of the scenario to allow for natural variance in how participants rated the experience and to avoid floor or ceiling effects due to an overly negative or positive scenario.
Next, all participants read, “On your way out, you see this sticker on the door of the café.” Participants in the label condition then saw a picture of a sticker in Google design colors containing the following text: “Hypeview us [hype us with a review] on Google” (for original stimulus material, see Web Appendix D). Participants in the no-label condition saw the same sticker and text but without the behavioral label: “Hype us with a review on Google.” Next, participants read the following instructions: “Imagine you have a few minutes and decide to review the café on Google. Please write a short review for the café on Google in the text box below. [Note: Do not indicate a star rating, but leave a written text review.]” Finally, participants indicated their age and gender and were thanked for their participation.
Results
Manipulation check
We tested the stimulus material with a separate online panel from Amazon Mechanical Turk with the help of CloudResearch (Chandler et al. 2019) to ensure that participants would recognize the behavioral label as a specific expression for the behavior while reading the text. Ninety-nine participants (Mage = 37.20 years, SDage = 10.69; 49.5% female, 49.5% male, 1% preferred not to declare) were randomly presented with the no-label or label manipulation text and then asked to indicate to what extent they think the text contained a specific expression for the described behavior (1 = “definitely no,” and 7 = “definitely yes”). An independent samples t-test showed that participants in the label condition more strongly associated a specific word with the described behavior than participants in the no-label condition, confirming the overall effectiveness of the manipulation between the two conditions (t(97) = 3.224, p = .002, d = .648; Mlabel = 6.33, SDlabel = 1.34, nlabel = 51; Mno label = 5.15, SDno label = 2.24, nnolabel = 48).
Behavioral labeling
In line with the preregistration, we calculated an overall positivity score of the review as the dependent variable. First, we used the Linguistic Inquiry and Word Count software (Pennebaker et al. 2015) to assess the categories “posemo” (positive emotions) and “negemo” (negative emotions). Then, we normalized both variables. Finally, we subtracted the negative emotions from the positive emotions so that higher values indicate more positive reviews overall. The results indicate the expected effect of behavioral labeling (t(1,491) = 3.42, p < .001, d = .177): the reviews of participants in the label condition (Mlabel = .17, SDlabel = .15) were more positive overall than those of participants in the no-label condition (Mno label = .14, SDno label = .13). We also ran the analysis with the full sample (i.e., including the previously excluded seven participants who left nonsense or filler text answers in their written reviews), and the results remained robust (t(1,498) = 3.28, p < .01, d = .169).
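The scoring described above can be sketched as follows. The LIWC percentages are hypothetical and the min-max normalization is an assumption, as the text does not specify which normalization was applied:

```python
import numpy as np
from scipy import stats

def positivity_scores(posemo, negemo):
    """Positivity = normalized posemo minus normalized negemo,
    so higher values indicate a more positive review.
    Min-max normalization over the pooled sample is an assumption."""
    def norm(x):
        x = np.asarray(x, dtype=float)
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)
    return norm(posemo) - norm(negemo)

# hypothetical LIWC output (% positive/negative emotion words per review);
# first three reviews from the label condition, last three from no label
posemo = [8.1, 6.5, 9.0, 5.0, 4.2, 6.1]
negemo = [1.2, 0.8, 2.0, 3.0, 2.5, 1.9]
scores = positivity_scores(posemo, negemo)

# independent-samples t-test between conditions, as in the reported analysis
t, p = stats.ttest_ind(scores[:3], scores[3:])
```

With real data, `posemo` and `negemo` would be the per-review category percentages exported by the LIWC software.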
Discussion
Study 2 provides further indication for the effectiveness of behavioral labeling by examining actual behavior in an online review setting, and it illustrates how behavioral labeling can be used to stimulate more positive online reviews. We find that behavioral labeling can change how participants write their reviews; that is, participants exposed to the behavioral label “hypeview” wrote more positive reviews about the business overall. In other words, participants exposed to the behavioral label were more likely to tone down the negative aspects, refocus on the positive aspects, or both. This is illustrated by exemplary responses from the label condition (e.g., “Little bit of a wait, but the food was worth it”). Building on these results, Studies 3 and 4 aim to further generalize the findings to actual behavior in field settings. In Study 3, we explore behavioral labeling for discouraging a behavior.
Study 3: Behavioral Labeling for Discouraging a Behavior
Using a one-factor between-subjects experiment with three conditions (label, no label, and control), Study 3 investigates whether exposure to the label “lidcotting” (i.e., lid boycotting), which we invented for the purpose of this study, makes consumers refrain from taking plastic lids for their takeaway teacups.
Participants and Procedure
University students were recruited on the campus of a large Western European university for a paper-and-pencil survey. We set up a booth on campus and invited participants to take part in a short on-site study about their tea preferences in exchange for a complimentary takeaway cup of tea. The behavior of interest was whether participants refrained from taking a plastic lid for the takeaway cup. One hundred sixty participants, nearly equally distributed across the conditions, did not take a complimentary cup of tea and were thus excluded from the sample because we were not able to assess the dependent variable for them, resulting in a final sample of 503 students (Mage = 22.34 years, SDage = 3.13; 64.9% female, 34.7% male, .4% no answer). None of the participants asked to use their own takeaway cup.
We randomly assigned participants to one of the three conditions (label, no label, or control) by handing out the respective questionnaires as they approached the booth. After some filler questions regarding fun facts about tea and tea preferences (the same across all conditions), participants in the label condition were asked, “Have you heard about ‘lidcotting’? It means refraining from using plastic lids (lid = for lid; cotting = for boycotting) when taking beverages in takeaway cups to reduce plastic waste.” Participants in the no-label condition read a similar question without the behavioral label (“Have you heard about refraining from using plastic lids when taking beverages in take-away cups to reduce plastic waste?”). Participants in the control condition, which we included to capture common behavioral conduct, instead encountered another neutral filler question about tea (“Have you heard that New Zealand is one of the regions that grows high-quality tea?”).
Participants filled out the survey at a separate table, so the questionnaire was out of the experimenter's line of sight. Then participants returned the survey and could choose one of three types of tea served in a paper takeaway cup (for the experimental setting, see Web Appendix E). The assistant filled each cup two-thirds full so that the fill volume made it equally logical to take a lid or not, depending on individual preference. Upon returning their questionnaire and receiving their tea, participants were invited to help themselves to sugar, milk, and lids at a separate table, again out of the assistant's direct line of sight to avoid bias that might arise if participants believed they were being observed. After each participant left, the assistant went to the condiments table, checked the lids to determine whether the participant took one, marked the choice on the questionnaire, and refilled the supply of lids if one had been taken.
Results
Manipulation check
We tested the stimulus material with a separate online sample obtained from Clickworker to ensure that participants would recognize the behavioral label as a specific expression for the behavior while reading the text. One hundred fifty participants (Mage = 38.04 years, SDage = 11.49; 38% female, 62% male) were randomly presented with the control, no-label, or label condition manipulation text and then asked to indicate to what extent they think the text contained a specific expression for the described behavior (1 = “definitely no,” and 7 = “definitely yes”). An ANOVA (F(2, 147) = 56.41, p < .001, η2 = .43) with planned contrasts confirmed the expected pattern. Participants in the label condition more strongly associated a specific word with the described behavior than participants in the no-label condition (F(1, 147) = 53.23, p < .001, η2 = .27; Mlabel = 6.49, SDlabel = 1.45, nlabel = 49; Mno label = 3.36, SDno label = 2.89, nno label = 56) and control condition (F(1, 147) = 107.08, p < .001, η2 = .42; Mcontrol = 1.80, SDcontrol = 1.83, ncontrol = 45). There was also a significant difference between the control and no-label condition (F(1, 147) = 12.56, p < .001, η2 = .08). 4 Overall, the effectiveness of our manipulation was confirmed.
Behavioral labeling for discouraging behavior
Our main interest was to explore whether exposure to a question with a behavioral label (label condition) more strongly discourages taking a plastic lid than exposure to the same question without the label (no-label condition) relative to a baseline that assesses the common behavioral conduct (control condition). Thus, we conducted a chi-square test of whether the experimental conditions differ in the proportion of participants who enacted the behavior of interest, namely, not taking a lid. The analysis revealed a significant overall effect of condition (label vs. no label vs. control) on lid choice (χ2(2) = 18.49, p < .001, Cramer's V = .192).
Follow-up chi-square comparisons revealed a significant difference between the label condition and the control condition (χ2(1) = 18.14, p < .001, Cramer's V = .231), as expected. 5 In the label condition (nlabel = 168), 89.3% of participants refrained from taking a plastic lid, but in the control condition (ncontrol = 171), only 70.8% showed this behavior. The difference between the control condition and the no-label condition (nno label = 164), where 81.1% did not take a lid, was also significant (χ2(1) = 4.88, p = .027, Cramer's V = .121). Finally, referring to the key comparison for testing the effectiveness of behavioral labeling, the difference between the label condition and the no-label condition was significant (χ2(1) = 4.42, p = .035, Cramer's V = .115). In other words, compared with the no-label condition, the percentage of participants who took a lid nearly halved in the label condition (no label: 18.9%; label: 10.7%), indicating that behavioral labeling was effective in discouraging participants from taking a lid.
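As a sketch of the omnibus analysis, the cell counts can be reconstructed from the reported group sizes and percentages; because the percentages are rounded, the recovered statistics match the reported χ2(2) = 18.49 and Cramer's V = .192 only approximately:

```python
import numpy as np
from scipy.stats import chi2_contingency

# observed counts reconstructed from the reported group sizes and percentages
# rows: label, no label, control; columns: no lid taken, lid taken
observed = np.array([
    [150, 18],   # label:    168 participants, 89.3% refrained
    [133, 31],   # no label: 164 participants, 81.1% refrained
    [121, 50],   # control:  171 participants, 70.8% refrained
])

chi2, p, dof, expected = chi2_contingency(observed)
n = observed.sum()
# effect size: Cramer's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))
cramers_v = np.sqrt(chi2 / (n * (min(observed.shape) - 1)))
```

The pairwise follow-up comparisons reported above work the same way on the corresponding 2 × 2 subtables.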
As an additional check, we considered the potential effect of participants’ tea choice (ngreen = 196, nblack = 110, nfruit = 197) on their subsequent decision to take a lid, but we found no significant differences across the three conditions in tea choice (χ2(4) = 5.223, p = .265, Cramer's V = .072). In addition, we found that the type of tea chosen did not affect whether participants took a lid (χ2(2) = .211, p = .900, Cramer's V = .020).
Discussion
Study 3 suggests that a behavioral label can help discourage environmentally unfriendly behavior. Relative to behavioral conduct in the control condition, providing a behavioral label more strongly discourages taking a plastic lid with a takeaway cup. We also observe a significant difference between the label condition and the no-label condition in discouraging taking plastic lids. Next, we explore mental imagery as a potential underlying process explanation for the proposed effect of behavioral labeling.
Study 4: Mental Imagery as a Potential Underlying Process
Study 4 consists of two parts. The first part tests the effectiveness of behavioral labeling in an online field setting, which we conducted during online tutorials for a graduate marketing course using a one-factor between-subjects design with three levels (label, no label, and control). 6 Specifically, we invented the behavioral label “up-smiling” (i.e., cheering others up by using encouraging smiley faces in online chats) and observed students’ behavior during online tutorials. We tested whether attaching the behavioral label to a description of the behavior led students to use more encouraging smiley faces (i.e., emojis). Depending on the tutorial, groups of students were exposed to one of the three experimental conditions (label, no label, or control).
In the second part of the study, after participants had the chance to perform the behavior, we conducted a follow-up survey to measure and investigate potential mediators. The goal was to explore mental imagery as one potential explanation for the documented effectiveness of behavioral labeling and investigate perceived popularity as an alternative explanation. Individuals might infer that when a behavioral label is attached to an activity, that activity must be popular, so the behavioral label would indirectly communicate social norms that drive individual behavior (Goedegebure, Van Herpen, and Van Trijp 2020; White, Habib, and Hardisty 2019).
Participants and Procedure
In total, 174 students (59.8% female, 40.2% male) participated in the online tutorials (ncontrol = 25, nno label = 71, nlabel = 78) at an Australian university. To maximize the power for exploring the effect of behavioral labeling on subsequent behavior compared with a no-label condition, we assigned most participants to the two experimental conditions used to test the hypothesis and a smaller sample to the control group, given the resource constraints of this study in terms of the number of available tutorials and students (Lakens 2022). The tutor delivered the manipulation to each group of students. Four different tutors held seven tutorials overall with the same marketing course activity. Due to data protection regulations, we had to keep the study anonymous; therefore, we cannot match individual students to tutorials, but we know which experimental groups and tutors the students were assigned to in part one (tutor 1: one control tutorial, ncontrol = 25; tutor 2: one no-label tutorial, nno label = 23, and one label tutorial, nlabel = 26; tutor 3: two no-label tutorials, nno label = 48, and one label tutorial, nlabel = 26; tutor 4: one label tutorial, nlabel = 26).
The main activity of the tutorials was presentations by students. After a reminder of the presentation rules (e.g., timing), the tutor exposed students to the following slide for the label condition: “We all know an online presentation can be stressful. So, consider to ‘up-smile’, i.e., cheer others up by using encouraging smiley faces in the chat. In this tutorial, feel free to up-smile in the chat throughout the presentations to support each other.” In the no-label condition, students saw a slide with the same behavior without the label: “We all know that an online presentation can be stressful. So, consider to cheer others up by using encouraging smiley faces in the chat. In this tutorial, feel free to use smiley faces in the chat throughout the presentations to support each other.” The control condition did not include an additional slide before the presentations started. All tutorials were conducted online via Zoom, so students did not see each other in person. For the analysis, we captured the behavior of the groups during the online tutorials through the Zoom chats, which enabled us to assess the number of encouraging emojis posted in each group.
For the second part of the study, which was conducted to gain initial insights into potential underlying processes driving participants’ behavior in the first part, participants received a link to a short anonymous online survey at the end of their tutorials. In the survey, we first asked participants to what extent they had considered using encouraging smiley faces to cheer others up in the online chat during their tutorials (1 = “strongly disagree,” and 5 = “strongly agree”). Next, we measured the proposed mediators on seven-point scales (we used seven-point scales for the mediators to facilitate measurement distinctiveness from the dependent variable [see Pieters 2017]). Specifically, we measured popularity with one item adopted from Zhu and Ratner (2015): “To what extent do you think this behavior (i.e., using encouraging smiley faces to cheer others up in the online chat) is generally quite popular?” (1 = “not at all popular,” and 7 = “very popular”). We measured mental imagery with a three-item scale (Cronbach's alpha = .92) adapted from Bone and Ellen (1992): “During this class, I imagined what it would be like to use encouraging smiley faces to cheer others up in the online chat” (1 = “completely disagree,” and 7 = “completely agree”); “While reading about using encouraging smiley faces to cheer others up in the online chat, I had a vivid image of this behavior”; “While reading about using encouraging smiley faces to cheer others up in the online chat, I had a vivid image of myself performing the behavior” (1 = “not at all,” and 7 = “very much”). Finally, participants provided their demographics and were thanked for their participation.
Before conducting the main study (part one), we ran an exploratory pilot study (for details, see Web Appendix F) with a sample of 81 students (55.56% female, 44.44% male) from an Australian university. Students participated in online tutorials involving the same task and the same tutor, and they were exposed to the label (in the pilot study, “chmiling” [i.e., cheering others up by smiling at them or using encouraging smiley faces online]), no label, or control condition. The overall counts per group (i.e., the number of emojis per group expressed by text symbols [e.g., “:)”, “:D”] or emoticons in the group chat) revealed the expected pattern: control = 1, no label = 7, label = 26. Based on this preliminary indication of the expected pattern on the group level, we proceeded with this approach for the main study, where we analyzed individual-level data.
Results
Manipulation check
We tested the stimulus material for the main study with a separate online panel obtained from Prolific to ensure that participants would recognize the behavioral label as a specific expression for the behavior. Eighty-three participants (Mage = 31.05 years, SDage = 11.58; 56.6% female, 42.2% male, 1.2% preferred not to declare) were randomly presented with the no-label or label manipulation text and indicated to what extent the text contained a specific expression for the described behavior (1 = “definitely no,” and 7 = “definitely yes”). An independent samples t-test showed that participants in the label condition more strongly associated a specific word with the described behavior than participants in the no-label condition, confirming the overall effectiveness of the manipulation between the two conditions (t(81) = 2.728, p = .008, d = .60; Mlabel = 5.83, SDlabel = 1.83, nlabel = 41; Mno label = 4.69, SDno label = 1.97, nnolabel = 42).
Behavioral labeling (first part)
We counted the number of encouraging emojis (expressed by either text symbols [e.g., “:)”, “:D”] or emoticons) in the Zoom chat for each student as the dependent variable. Because the dependent variable was a count with an excess of zeros, and to account for randomization being applied at the group level rather than the individual level, we performed a zero-inflated Poisson regression with cluster-robust standard errors at the tutor level. 7 The results confirmed a significant relationship between the conditions and the count outcome (χ2(2) = 33.18, p < .001) and, in particular, a significant difference between the no-label condition and the label condition (Wald χ2(1) = 9.66, p < .01). 8
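A minimal sketch of such a zero-inflated Poisson model, fit on simulated counts rather than the authors' data (for brevity, the cluster-robust standard errors at the tutor level are omitted):

```python
import numpy as np
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(0)

# simulated emoji counts (hypothetical, not the authors' data):
# a share of structural zeros plus a higher Poisson rate under the label
n = 200
condition = rng.integers(0, 2, n)                 # 0 = no label, 1 = label
structural_zero = rng.random(n) < 0.4             # students who never post emojis
rate = np.where(condition == 1, 3.0, 1.5)
counts = np.where(structural_zero, 0, rng.poisson(rate))

exog = np.column_stack([np.ones(n), condition])   # intercept + condition dummy
model = ZeroInflatedPoisson(counts, exog, exog_infl=np.ones((n, 1)))
res = model.fit(method="bfgs", maxiter=500, disp=0)

# params order: [inflate_const, const, condition]; a positive condition
# coefficient means more emojis in the label condition
coef_condition = res.params[-1]
```

The zero-inflation component absorbs the structural zeros (students who would never post an emoji), so the count component isolates the condition effect among potential posters.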
Mediation by measurement (second part)
For the second part of the study, we received 104 responses (Mage = 24.30 years, SDage = 3.14; 53.8% female, 37.5% male, 8.7% preferred not to declare; nno label = 43, nlabel = 61). Note that the total sample size deviates from the sample size in the first part because we did not send the survey to participants in the control condition, as the survey had questions regarding the behavior those participants did not know about (it was only mentioned in the tutorials of the no-label and label conditions). Moreover, according to ethics committee regulations, we could not force students to participate in this survey, so not all students from the no-label and label conditions participated in this voluntary follow-up survey. The likelihood of responding to the follow-up survey did not differ between the two conditions. 9 The survey was anonymous due to the terms of its ethical approval; therefore, we could not match students’ answers to the survey with their actual behavior in the first part, but we could match responses to their experimental treatment conditions using anonymous codes.
First, we conducted three separate ANOVAs to test the effect of condition (no label vs. label) on the self-reported behavior and the two mediators. Participants in the label condition (Mlabel = 4.44, SDlabel = .83) scored higher than those in the no-label condition (Mno label = 3.95, SDno label = .93; F(1, 102) = 7.99, p = .006, η2 = .07) on the self-reported behavior measurement. The behavioral label enhanced mental imagery (Mlabel = 5.81, SDlabel = 1.22; Mno label = 5.19, SDno label = 1.38; F(1, 102) = 5.89, p = .017, η2 = .06) but not perceived popularity of the behavior (Mlabel = 5.70, SDlabel = 1.36; Mno label = 5.40, SDno label = 1.59; F(1, 102) = 1.14, p = .289, η2 = .01).
We then proceeded with a mediation analysis. As a first step, a factor analysis confirmed discriminant validity among the two mediators and the dependent variable (see Web Appendix G; Pieters 2017). The correlation between the imagery mediator and the dependent variable is r = .674 (p < .001). Discriminant validity was further established with the heterotrait–monotrait ratio of correlations among all constructs below the stipulated .85 cutoff criterion (Voorhees et al. 2016). Parallel mediation analysis was conducted with PROCESS (Model 4; Hayes 2017) by applying bias-corrected bootstrapping and 10,000 subsamples to estimate the indirect effects with a 95% CI.
As illustrated in Figure 3, behavioral labels enhanced mental imagery regarding the behavior but not perceived popularity of the behavior. Further, although both mental imagery and perceived popularity significantly affect behavioral implementation, importantly, the indirect effect of the condition (no label vs. label) on the reported behavior is significant only through mental imagery (indirect effect = .164, BootSE = .109, 95% CI: [.031, .486]), not through popularity (indirect effect = .044, BootSE = .059, 95% CI: [−.021, .261]). The direct effect reaches marginal significance in the parallel mediation model (direct effect = .281, p = .06, 95% CI: [−.010, .573]).
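The logic of the bootstrapped indirect effect can be illustrated with a minimal percentile bootstrap on simulated data; PROCESS additionally applies bias correction, and all values below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical data: condition x (0 = no label, 1 = label),
# mediator m (mental imagery), outcome y (self-reported behavior)
n = 104
x = rng.integers(0, 2, n).astype(float)
m = 5.2 + 1.0 * x + rng.normal(0, 1.3, n)            # a-path: label boosts imagery
y = 1.0 + 0.3 * x + 0.5 * m + rng.normal(0, 0.8, n)  # b-path: imagery -> behavior

def indirect_effect(x, m, y):
    """a*b indirect effect from two OLS fits: m on x, then y on x and m."""
    a = np.polyfit(x, m, 1)[0]                   # slope of m on x
    X = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(X, y, rcond=None)[0][2]  # slope of y on m, given x
    return a * b

point = indirect_effect(x, m, y)
boots = [indirect_effect(x[i], m[i], y[i])
         for i in (rng.integers(0, n, n) for _ in range(2000))]
ci_low, ci_high = np.percentile(boots, [2.5, 97.5])  # percentile 95% CI
```

The mediation claim rests on this 95% CI excluding zero for the imagery path while including zero for the popularity path.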

Figure 3. Parallel Mediation Analysis.
Discussion
Study 4 indicates that behavioral labeling can intensify a positive behavior (encouraging others). The same pattern emerges in the small pilot study reported in Web Appendix F, where the tutor is held constant. 10 Furthermore, the results provide indication that mental imagery mediates this effect, while popularity does not. However, because the study design had to conform to institutional privacy regulations, we cannot disentangle tutor-specific from condition-driven effects in our mediation analysis. Therefore, the results must be interpreted with caution given the correlational nature of the mediation analysis (Pieters 2017). Overall, the results provide evidence consistent with mediation (i.e., mental imagery may mediate the effect of behavioral labels on behavior) but do not conclusively demonstrate it. Next, to further explore mental imagery as a potential explanation for the effectiveness of behavioral labeling, we show that inducing mental imagery in a no-label condition can produce an effect similar to behavioral labeling.
Study 5: Inducing Mental Imagery in a No-Label Condition
The goal of Study 5 is to further explore the process by inducing mental imagery. If enhanced mental imagery is indeed one of the reasons behavioral labels induce behaviors, then we should be able to produce an effect similar to that of a behavioral label in a no-label condition by evoking mental imagery. We explore this possibility in a one-factor between-subjects design with three levels: no label, label, and no label with enhanced mental imagery.
We designed Study 5 to address the problem of extremism in online reviews for businesses, an issue receiving growing research attention (Brandes, Godes, and Mayzlin 2022; Karaman 2021). Specifically, the behavior of interest is whether exposure to the behavioral label “trollspotting” (i.e., spotting online “trolls” and ignoring their reviews), which we invented for the purpose of this study, causes consumers to evaluate a restaurant more positively based on existing reviews, even though some of the reviews include inflammatory, irrelevant, or offensive comments (i.e., troll reviews). We expected that participants in the label (“trollspotting”) condition would evaluate the restaurant more positively than participants in the no-label condition, even though they were exposed to the same set of reviews.
Further, consistent with the idea that mental imagery can increase people's likelihood to engage in a behavior, similar to behavioral labeling, we also expected the same pattern between the no-label with enhanced mental imagery condition and the no-label condition (i.e., a stronger effect in the no-label with enhanced mental imagery condition than in the no-label condition). We expected no significant difference between the label condition and the no-label with enhanced mental imagery condition.
Participants and Procedure
We recruited 400 (Mage = 37.21 years, SDage = 13.92; 60.3% female, 37.8% male, 2% preferred not to declare) U.S. participants from Prolific. At the beginning of the survey, all participants were informed that the study was about restaurants. Then, all groups read the following text for at least 20 seconds before they could continue the survey: “Imagine you are organizing a dinner with friends and you are trying to figure out where to go. You have found a newly opened restaurant and are checking online reviews to determine the overall rating of the restaurant. You have come across an independent website, where consumers leave written comments. Consumers can write what they want. Meaningful reviews relate to the food, the service or the atmosphere in the restaurant and provide reasons for the rating. However, some reviews may also be written by consumers, who simply aim to antagonize the restaurant by deliberately posting inflammatory, irrelevant, or offensive comments. These consumers do not write about objective features of the restaurant but instead provide overly negative reviews. Some users choose to ignore these types of reviews when determining the overall rating.”
In addition, participants in the label condition read, “This is also known as trollspotting (i.e., spotting online ‘trolls’ and ignoring their ratings).” In the no-label with enhanced mental imagery condition, where we followed a commonly used approach to prime mental imagery (Jiang et al. 2014; Petrova and Cialdini 2005), the last part of the text was, “It is important that you rely on your imagination to visualize how you would identify such non-genuine reviews before browsing the website. Use your imagination to picture these types of reviews and how you would identify them.”
The dependent variable captures participants’ evaluation of the restaurant after exposure to several online reviews about the place (measured by an open-ended question: “After reading these reviews, please write a few lines on how you would describe the restaurant to your friends”). We measured it after participants read through ten different reviews, which were displayed in random order (see Web Appendix H). We adapted the reviews from common review websites such as Yelp and Google to enhance external validity. Five reviews were positive and two were negative. The five positive reviews and the two negative reviews referred to the food and service and provided reasonable explanations for their assessments. The three remaining reviews, however, were troll reviews, because they did not contain rational negative attributes that could reinforce a factual negative impression but instead referred to inflammatory, irrelevant, or offensive comments. The more participants ignored these troll reviews, the more positively they were expected to evaluate the restaurant, as the overall balance of the reviews was positive (i.e., positive reviews clearly outweighed the negative ones by a ratio of 5 to 2). Thus, as the dependent variable, we analyzed how positively participants felt about the restaurant to capture the behavior of interest. In the last part, participants entered their age and gender.
To pretest the three review categories (positive, negative, and troll reviews), we recruited participants from Prolific (N = 150; Mage = 39.95 years, SDage = 14.17; 44.0% female, 56.0% male). We randomly assigned participants to one of the three review categories, showed them the respective reviews in random order, and asked them to read through the reviews. Participants then indicated to what extent they think the reviews are based on comprehensible criteria (e.g., food quality, service; 1 = “strongly disagree,” and 7 = “strongly agree”). As expected, an ANOVA confirmed significant differences between review categories (F(2, 147) = 196.97, p < .001, η2 = .73), such that participants in the troll-review category (Mtroll = 2.29, SDtroll = 1.32, ntroll = 49) differed significantly from the positive-review category (F(1, 147) = 306.04, p < .001, η2 = .68; Mpos = 6.08, SDpos = .87, npos = 51) and the negative-review category (F(1, 147) = 287.54, p < .001, η2 = .66; Mneg = 5.98, SDneg = 1.02, nneg = 50), confirming the intended troll-review manipulation. As intended, there was no difference between the positive and negative categories (F(1, 147) = .208, p = .649, η2 < .01).
Results
Manipulation checks
We tested the stimulus material with two pretests. First, we ensured that participants would recognize the behavioral label as a specific expression for the behavior while reading the text. Participants recruited through Prolific (N = 99; Mage = 34.96 years, SDage = 12.74; 52.5% female, 45.5% male, 2.0% preferred not to declare) were randomly presented with the label, no-label, or no-label with enhanced mental imagery manipulation text and then asked to indicate to what extent they think the text contained a specific expression for the described behavior (1 = “definitely no,” and 7 = “definitely yes”). An ANOVA (F(2, 96) = 13.59, p < .001, η2 = .22) and planned contrasts confirmed that participants in the label condition (Mlabel = 6.71, SDlabel = 1.06, nlabel = 38) associated a specific word or expression more strongly with the described behavior than participants in the no-label condition (F(1, 96) = 20.67, p < .001, η2 = .18; Mnolabel = 4.79, SDnolabel = 2.07, nnolabel = 34) and those in the no-label with enhanced mental imagery condition (F(1, 96) = 18.50, p < .001, η2 = .16; Mnolabel-im = 4.78, SDnolabel-im = 2.17; nnolabel-im = 27). There was no difference between the two no-label conditions (F(1, 96) < .01, p = .972, η2 < .01).
Second, we conducted another separate online study with a different Prolific sample (N = 285; Mage = 37.43 years, SDage = 13.57; 55.1% female, 44.2% male, .7% preferred not to declare) to test the enhanced mental imagery manipulation. Participants were randomly assigned to the label, no-label, or no-label with enhanced mental imagery manipulation and asked to what extent the text made them use their imagination (“While reading the text, I imagined what it would be like to ignore reviews written by consumers, who simply aim to antagonize the restaurant by deliberately posting inflammatory, irrelevant, or offensive comments”; 1 = “completely disagree,” and 7 = “completely agree”). An ANOVA (F(2, 282) = 3.48, p = .032, η2 = .02) with planned contrasts confirmed that participants in the label condition (Mlabel = 5.68, SDlabel = 1.27, nlabel = 93) differed significantly from those in the no-label condition (F(1, 282) = 6.14, p = .014, η2 = .02; Mnolabel = 5.18, SDnolabel = 1.56, nnolabel = 97). Likewise, there was a significant difference between the no-label condition and the no-label with enhanced mental imagery condition (F(1, 282) = 4.01, p = .046, η2 = .01; Mnolabel-im = 5.58, SDnolabel-im = 1.34; nnolabel-im = 95). As expected, there was no difference between the label condition and the no-label with enhanced mental imagery condition (F(1, 282) = .23, p = .629, η2 < .01).
Mental Imagery and Behavioral Labeling
To assess positivity of the written text, we used the same dependent variable as in Study 2 (i.e., an overall positivity score). An ANOVA showed a significant effect of the condition (no label vs. label vs. no label with enhanced mental imagery) on positivity of the written text (F(2, 397) = 4.35, p = .013, η2 = .02). Planned contrasts confirmed the expected pattern. Participants in the label condition (Mlabel = .10, SDlabel = .18, nlabel = 133) indicated higher levels of overall positivity than participants in the no-label condition (Mnolabel = .05, SDnolabel = .19, nnolabel = 132; F(1, 397) = 3.97, p = .047, η2 = .01). Moreover, participants in the no-label with enhanced mental imagery condition (Mnolabel-im = .12, SDnolabel-im = .22, nnolabel-im = 135) indicated higher levels of positivity than participants in the no-label condition (F(1, 397) = 8.31, p = .004, η2 = .02) but not participants in the label condition (F(1, 397) = .78, p = .377, η2 < .01).
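The overall positivity score above is a property of the participants' free-text responses. As a purely hypothetical illustration (the actual Study 2 measure is not detailed in this section), one simple way to score text positivity is the normalized difference between positive and negative lexicon hits; the word lists and scoring rule below are invented for the example.

```python
# Hypothetical positivity score: (positive - negative tokens) / total tokens.
# The lexicons here are toy examples, not the study's actual instrument.
POSITIVE = {"great", "good", "helpful", "love", "excellent", "friendly"}
NEGATIVE = {"bad", "terrible", "rude", "awful", "poor", "slow"}

def positivity(text: str) -> float:
    """Return a score in [-1, 1]; 0.0 for empty input."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

print(positivity("Great food and friendly staff, but slow service."))  # → 0.125
```

A score of .10 versus .05 on such a measure would mean the label-condition texts contained roughly twice the net share of positive words, which helps calibrate the small but significant effect sizes reported above.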
Discussion
As expected, we find that inducing mental imagery in the no-label condition moves responses closer to those in the label condition, such that the no-label with enhanced mental imagery condition and the label condition sit equally above the no-label condition in terms of the intensity of the behavior of interest. These results suggest that inducing mental imagery in a no-label condition can create an outcome similar to that of behavioral labeling. While the results do not causally show that mental imagery is the only or the most important contributor to the proposed effect of behavioral labeling, they are consistent with the idea that directly inducing mental imagery can increase the likelihood that consumers engage in a certain behavior, much as behavioral labeling does.
General Discussion
This research introduces behavioral labeling and shows that the use of a name or tag to reflect an associated activity can induce a corresponding behavior. Across five studies, we provide converging evidence for the effectiveness of behavioral labeling in inducing corresponding actions across various labels, behaviors, and contexts. We started by examining behavioral labeling in a controlled lab setting. In online and field studies, we showed that behavioral labeling can encourage positive new behaviors and discourage negative existing ones. Finally, we provided initial process evidence by measuring mental imagery as a potential mediator and inducing it in a no-label condition, which revealed effects similar to those induced by behavioral labeling.
Theoretical Implications
This research contributes to the emerging literature on how marketing can extract value from language and linguistics (e.g., Berger et al. 2020; Gai and Klesse 2019). Prior literature on brand linguistics and linguistic studies of consumer behaviors have explored how language influences consumers’ preferences and decision making (Argo, Popa, and Smith 2010; Carnevale, Luna, and Lerman 2017; Stoner, Loken, and Blank 2018) and revealed important effects of different linguistic devices, such as phonetic symbolism (Lowrey and Shrum 2007), on consumers’ evaluations of and preferences for certain brand names (Pogacar, Shrum, and Lowrey 2018). Although marketing research on labeling effects beyond the brand management context is scarce, a notable exception is Summers, Smith, and Reczek’s (2016) work on social labeling. Their research suggests that consumers adjust their self-perceptions when they are given a label based on their behavior (in any way, not just when a new word is invented for the behavior), and subsequently they begin to behave consistently with the label. While these findings demonstrate the effects of social labels (i.e., a label assigned to a person) on individual behavior, our research concerns labels attached to a behavior (not to a person) and their effects on individual behavior.
Importantly, our research addresses a call in the literature to investigate the effects of labels that refer to intangible entities (Keller 2020). Namely, we shift focus from brand labels to labels attached to specific behaviors or actions, intangible entities that have not yet received attention in labeling research. We suggest that behavioral labeling can prompt consumers to behave in particular ways by attaching an activity tag to a sequence of actions.
Practical Implications
Our results offer insights for public policy makers and marketers. Policy makers can utilize behavioral labeling to encourage particular behaviors for social and environmental benefits. This is illustrated by the success of the “Bob”/“to bob” drunk driving campaign that created a word to denote the behavior of volunteering as the designated driver for a group. As we show through the invented term “lidcotting,” behavioral labeling has the potential to curb the use of single-use plastics, reducing waste and benefiting the environment.
Marketing managers could leverage behavioral labeling to promote their products/services. As the natural experiment–based data for the grocery delivery services discussed in the introduction suggests, using a behavioral label could create a commercial advantage for one brand compared with a competitor that does not use a behavioral label. A further example is Ariel, a leading laundry detergent and fabric care brand marketed by Procter & Gamble. Ariel recently introduced its “All-in-1 PODS,” advertising the product as a laundry detergent. A single pod is dropped into the washing machine before the clothes to be washed are added. To market the product, Ariel introduced the verb “to pod” (or “podding”), a behavioral label, to encourage the behavior of using Ariel pods.
Furthermore, the results on “up-smiling” (Study 4) suggest that behavioral labels can induce more supportive and constructive behavior in online settings, which lends itself to applications in many societal contexts, from online classrooms to consumer discussion groups or forums that all can benefit from more supportive behavior, especially during stressful post-COVID times. Relatedly, behavioral labels like “trollspotting” (Study 5) may help consumers become more resistant toward information coming from internet trolls. At a more general level, such labels, if adopted, may have the potential to break a “negativity spiral” on social media (Hewett et al. 2016) and contribute to making the online world better (Chandy et al. 2021).
Limitations and Future Research
Our research objective has been to introduce behavioral labeling and begin to shed light on its effects and underlying processes, an effort that opens many additional questions and avenues for future research. We have only begun to scratch the surface of this area, and many moderators likely remain to be uncovered to better understand when behavioral labeling is or is not effective. In particular, moderators pertaining to (1) the label, (2) the behavior, (3) the context, and (4) the consumers could amplify or attenuate the effects of behavioral labels on corresponding behaviors.
For example, in terms of potential moderators pertaining to the type of label, all the behavioral labels we investigated have positive valence, but behavioral labels can also apply to negative behaviors, such as consuming indulgent food. Do such labels encourage or inhibit more negative behaviors? Another interesting boundary condition to explore relates to the construction of labels in that the effects of one- versus multiple-word labels might vary. Different kinds of behavioral labels (e.g., portmanteaus, oxymorons) might also arguably be more effective than others. Still, as the “Bob”/“to bob” drunk driving campaign example suggests, a label does not necessarily need to be meaningful per se to induce a behavior. Moreover, recent research has explored consumer responses to unconventionally spelled brand names (Costello, Walker, and Reczek 2023); future research could further explore the impact of behavioral labels as potentially unconventional descriptions of behavior.
With regard to the behavior, our studies can be described as focusing on relatively novel, unusual, unfamiliar, or innovative behaviors. It is plausible that behavioral labeling is less effective, disappears, or even backfires for familiar, common, or routine behaviors. For example, if a behavioral label is applied to a behavior that a consumer regularly engages in and is fairly familiar with, the behavioral label could be perceived as a threat to the consumer's behavioral freedom and induce reactance (Brehm 1966; White, Habib, and Hardisty 2019).
Relatedly, it is easy to imagine contexts in which a behavioral label might seem overengineered or inappropriate, such as for serious medical treatments. Thus, research is needed to identify specific contexts in which behavioral labeling is ineffective or might even backfire. In addition, we assessed behavior at the individual level, but we cannot fully exclude potential spillover effects between participants in Studies 3 and 4. However, we also believe that potential contagion effects in behavioral adoption based on behavioral labeling would be an interesting area for future research.
With respect to consumer characteristics, future research could explore whether the effectiveness of behavioral labels depends on how much they align with consumers’ specific vocabularies (Hovy, Melumad, and Inman 2021). Furthermore, consumer personality and cultural differences also likely play a role, such as whether consumers are more versus less sensitive to social norms cues, which the presence of a behavioral label could indirectly indicate.
Relatedly, future research should also address alternative, concurrent, or complementary process explanations. Regarding additional processes, although we show in Study 5 that mental imagery can produce an effect similar to that of a behavioral label, more research is needed to determine whether mental imagery is the main contributor to the effects of behavioral labeling across contexts. Specifically, we show in Study 5 that stimulating mental imagery for the no-label condition elevates the corresponding behavior to the level of the label condition; thus, future studies can test whether the opposite applies, namely, whether limiting mental imagery in the label condition reduces the effectiveness of behavioral labeling. For example, if behavioral labeling is driven mainly by mental imagery, blocking mental imagery (e.g., when consumers are distracted by a cognitive task or under time pressure) or using a label that does not allow for quasi-pictorial representations (e.g., because it is too vague) should make behavioral labeling less effective or ineffective. Such studies could provide important managerial implications regarding the context and type of label that increase or decrease the effectiveness of behavioral labeling and which processes underlie the related downstream consequences. Relatedly, recent research has shown that unconventionally spelled brand names may backfire, but not if consumers seek memorable experiences with or through a brand, for which an unconventional brand name may serve as a more effective memory marker than a conventional name (Costello, Walker, and Reczek 2023). This might point to memory effects related to the effectiveness of behavioral labeling.
In addition, other sequential or parallel processes are also possible. For example, mental imagery may help establish a script in consumers’ minds that they use in relation to the behavior (MacInnis and Price 1987). Although we believe there are alternative mediators that remain to be explored, we expect that the effects of behavioral labels are strongly related to mental anchors, pictures, or narrative representations of a behavior that a behavioral label induces, which in turn may lead to additional process explanations. For example, mental imagery induced by behavioral labels may lead to perceptions of familiarity or meaningfulness about the behavior (Escalas 2004). Alternatively, the label's ability to create a vivid image of a behavior could induce a promotion-like perception of the behavior description, which in turn increases novelty perceptions of the behavior, leading to behavioral adoption, as the desire for novelty is a basic psychological need that can drive consumer behavior (González-Cutre et al. 2016). For example, presenting information in a novel way (e.g., gamified communications) enhances corresponding behaviors, such as the adoption of innovations (Müller-Stewens et al. 2017).
However, there might be contexts and boundary conditions in which mental imagery is overridden by other, potentially conflicting, mechanisms. For example, if a behavioral label is applied to a well-known behavior, the label could be perceived as an intrusion into a person's habits and the potentially identity-related importance assigned to those habits, thus creating identity threats. Moreover, Study 4 ruled out perceived popularity of the behavior as an alternative mechanism, but this process may become more important in contexts where the behavior could signal group association, status, or expertise.
Overall, we encourage future research to explore behavioral labeling and address its replicability and generalizability across different contexts and samples. In line with the proposed concept of behavioral labeling, we focused on capturing actual behavior in all studies, which inherently limited the number of people we could reach to participate in the study. Therefore, we must acknowledge the limitation that some of the studies could be considered underpowered or are context or sample specific. More diverse samples would also enable future research to uncover and investigate potential moderators or mediators and further test the generalizability of behavioral labeling (Edlund et al. 2022).
Finally, the long-term effects of behavioral labeling, as well as potential spillover effects on other behaviors or marketing indicators (Chandy et al. 2021), require further investigation to establish robustness over time. Do the effects fade or strengthen? What happens if behavioral labels change over time? We hope future studies address these and other questions related to behavioral labeling to explore additional conditions under which it exerts its influence and further processes by which it does so.
In conclusion, we introduced behavioral labeling and showed that attaching an activity tag to a behavior can induce the corresponding behavior. We hope our findings spark additional research and new marketing practices in this intriguing domain.
Supplemental Material
Supplemental material, sj-pdf-1-jmx-10.1177_00222429231213011 for Behavioral Labeling: Prompting Consumer Behavior Through Activity Tags by Martin P. Fritze, Franziska Völckner and Valentyna Melnyk in Journal of Marketing
Acknowledgments
The authors thank Marc Fischer, Stefano Puntoni, John Roberts, Christophe van den Bulte, and Harald J. van Heerde for their feedback and valuable input on earlier versions of the manuscript. The authors would also like to thank research seminar participants at Rotterdam School of Management, Tilburg University, WU Vienna, The Wharton School, University of Auckland, and UNSW Sydney. Further, the authors thank several research assistants at the University of Cologne as well as research assistants and colleagues at UNSW for their support during this research project.
Coeditor
Christine Moorman
Associate Editor
Martin Schreier
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
