Abstract
This study investigates how the terminology used to describe algorithmic systems—specifically “algorithm” versus “chatbot”—influences user preferences, behavior, and the dynamics of human–artificial intelligence (AI) decision support. Additionally, it examines whether the objectivity of the task moderates this effect. Across two preregistered studies, participants consistently preferred the term “algorithm” over “chatbot” and were more likely to follow advice labeled as coming from an “algorithm” than from a “chatbot,” regardless of task objectivity. In the first study, 115 participants indicated their preference for either human advice or various AI system terms across two tasks: predicting stock prices (objective) and measuring romantic attraction (subjective). In the second study, 568 participants were randomly assigned to one of several AI system terms and tasks, assessing their preference and advice-taking, the latter measured by the Weight of Advice. The findings reveal that terminology significantly impacts user preference, behavior, and trust in algorithmically generated outputs, illustrating the broader implications for human cognitive and emotional responses in algorithmically supported decision-making contexts. These results underscore the importance of language in shaping human–AI interactions and provide practical guidance for organizations aiming to foster AI adoption through careful choice of terminology. They also extend research on algorithm aversion and human–AI interaction by offering behavioral evidence for critical data studies, showing that algorithmic labels influence trust and compliance beyond system performance.
Introduction
The rapid development of artificial intelligence (AI) is reshaping how humans interact with technology. Among the most significant advancements is the rise of generative AI systems, such as chatbots and large language models like ChatGPT. These systems are increasingly applied across various domains, including healthcare, customer service, and creative industries (Bigman and Gray, 2018; Jovanovic et al., 2020; Rapp et al., 2021). This growing reliance on AI has generated both excitement and concern, particularly regarding its influence on human behavior, decision-making, and cognitive processes.
A critical aspect of this transformation is the evolving nature of human–AI interaction. Interactions with AI systems are now a significant research focus due to their potential to enhance productivity, creativity, and efficiency. However, the success of AI implementation depends significantly on user adoption, which is often hindered by algorithm aversion—the tendency to distrust or avoid using algorithms after seeing them err, even when they outperform human judgment (Dietvorst et al., 2015). Understanding the factors that drive aversion or foster appreciation of AI systems is essential to promoting their effective use across diverse domains.
The terminology used to describe AI systems plays a pivotal role in shaping user perceptions and behaviors. Research has shown that terms such as “algorithm,” “artificial intelligence,” and “chatbot” elicit distinct user reactions, which significantly impact trust and engagement (Langer and Landers, 2021). For example, “algorithm” often conveys precision and objectivity, making it more suitable for tasks perceived as technical or data-driven, such as financial forecasting (Jussupow et al., 2020). Conversely, “chatbot” implies a conversational, human-like interaction but may undermine trust when applied to critical or technical tasks. These terminological distinctions are more than semantic; they influence user interactions, potentially affecting comfort, trust, and reliance (Candrian and Scherer, 2024).
If terminology influences user interactions with AI systems, it could also shape research outcomes, emphasizing the need for careful terminological selection in experimental design. Existing studies shed light on the role of terminology in algorithm aversion and suggest its potential to influence experimental findings. Moreover, the variety of terms employed in prior research (Langer and Landers, 2021) underscores the critical impact that terminology may have on future studies and their results.
Labels such as “chatbot,” “algorithm,” or “AI” may evoke different expectations, mental models, and trust levels, even when system performance is identical (Candrian and Scherer, 2024; Chacon and Larrain, 2026; Langer et al., 2022). Our study extends this research by examining how subtle intra-algorithmic labels, not just human versus machine comparisons, influence user preference and behavior.
Therefore, our first research question is: Does terminology influence the use or preference for AI systems? To address this, we study how the terminology used to describe AI systems affects user preferences and behaviors in human–AI decision support contexts, where AI is positioned as an advisory tool rather than a collaborative partner. Specifically, we focus on the terms “algorithm” and “chatbot,” which are prominently featured in discussions surrounding algorithm aversion and generative AI technologies. Both terms, “algorithm” and “chatbot,” may be used interchangeably because a chatbot is typically powered by an underlying algorithm. As a result, practitioners may use either term when discussing the technology. For example, in a customer service context, someone might say: “We developed an algorithm/chatbot to answer FAQs.”
Additionally, terminology's ability to evoke different characteristics and levels of trust (Langer et al., 2022) invites further inquiry into its interaction with other factors associated with algorithm aversion. Among these factors, the nature of the task—whether objective or subjective—may best reflect the perceived characteristics of a system, which leads to a second research question: How does terminology interact with the nature of the task to influence user preferences and behavior? The study seeks to answer the question by examining whether the task's objectivity moderates the relationship between terminology and user trust or engagement.
This research contributes to a deeper understanding of human–AI interactions by investigating these dynamics, shedding light on cognitive and behavioral shifts as humans increasingly use AI technologies. Ultimately, it aims to inform the design of AI systems that better align with user expectations and foster effective human–AI partnerships. Along these lines, Shin (2024) provides a comprehensive framework for how human–algorithm interaction—shaped by system design and perceived transparency—can influence user trust and susceptibility to misinformation. This reinforces the relevance of examining not only how users behave in response to AI systems but also how they interpret these systems within broader sociotechnical contexts.
Although our research draws from cognitive and social psychology—particularly work on algorithm aversion and advice-taking—it directly contributes to broader debates in critical data/algorithm studies (CDS) and digital society (Dalton et al., 2016; Kitchin, 2014). We treat terminology itself as part of the sociotechnical infrastructure through which algorithmic systems acquire legitimacy and shape user conduct. CDS scholarship shows that algorithms do cultural and organizational work by configuring visibility, authority, and accountability (Boyd and Crawford, 2012; Gillespie, 2014; Nick, 2017). Our work demonstrates that merely calling the same system an “algorithm,” rather than a “chatbot,” measurably increases preference and compliance (advice-taking), identifying a discursive channel of algorithmic power that operates independently of performance. In CDS terms, labels are not neutral descriptors; they perform governance by calibrating trust, consent, and reliance in everyday encounters with algorithmically mediated systems.
Across two experimental studies, we examine how AI system terminology affects user preference and advice-taking, focusing on the terms “algorithm” and “chatbot.” We further test whether task objectivity moderates these effects. We situate this work within the context of prior research on algorithm aversion, chatbot design, and the psychological effects of labeling AI systems.
Related literature
Algorithm aversion
An algorithm can be defined as a set of rules to be followed in order to solve a problem (Lee, 2018), whereas algorithm aversion refers to the reluctance to use algorithms despite their superior performance compared to human counterparts, especially after users observe algorithmic errors (Dietvorst et al., 2015). This aversion is driven by various factors of special interest to this research, such as user and algorithm characteristics (Mahmud et al., 2022).
The characteristics of algorithms themselves also significantly influence user acceptance; for example, the learning ability of an algorithm moderates aversion (Berger et al., 2021; Chacon, Kausel et al., 2025). Transparency, mode of delivery, and accuracy are pivotal in diminishing the aversion felt by users (Mahmud et al., 2022). Another characteristic that may influence algorithm aversion is the level of autonomy in decision-making. Algorithms that operate autonomously and make decisions without human input (performative algorithms) are more likely to cause user aversion, especially when they fail or perform poorly (Jussupow et al., 2022). In contrast, advisory algorithms, which provide recommendations but leave the final decision to the user, tend to generate less aversion due to retained human control. Therefore, the balance between the algorithm's control and its effectiveness is crucial in mitigating user aversion (Jussupow et al., 2020).
While prior work on algorithm aversion has focused primarily on the contrast between human and algorithmic decision-makers, our study explores a different yet related dimension: the framing of algorithmic systems through terminology. We argue that even when users are interacting with AI rather than human agents, the label used to describe the AI (e.g., “algorithm” vs. “chatbot”) can activate different user expectations, levels of trust, and behavioral responses. This is consistent with Shin's (2024) broader argument that human–algorithm interactions are deeply shaped by how users construe AI intentions and interpret its communicative framing, particularly in situations of uncertainty and opacity. In this way, we extend the algorithm aversion literature beyond source comparison to investigate how intra-algorithmic terminology may influence the acceptance and use of algorithms.
Modern chatbots
Chatbots are intelligent conversational agents designed to simulate human dialogue, leveraging advancements in AI and natural language processing (NLP) (Bigman and Gray, 2018). Chatbots have been widely adopted across various industries, including customer service, healthcare, and e-commerce (Bigman and Gray, 2018; Go and Sundar, 2019). The integration of NLP enables chatbots to understand user intent and deliver relevant information, making them valuable tools in sectors like customer service (Luo et al., 2022). Additionally, many chatbots utilize anthropomorphic cues—such as human-like language, empathy, and social presence—which can enhance user engagement and compliance by creating a more human-like interaction (Bigman and Gray, 2018; Feine et al., 2019; Go and Sundar, 2019).
CDS scholarship argues that classifications, interfaces, and labels are constitutive elements of algorithmic systems, not just wrappers (Bowker and Star, 2000; Pasquale, 2015). Labels mediate users’ “algorithmic imaginaries” (Gillespie, 2014; Nick, 2017) and the conditions under which systems become credible or contestable (Ananny and Crawford, 2018). Our focus on intra-algorithmic labels extends this line by showing that naming alone reconfigures perceived authority and appropriate deference, even when functionality and advice are held constant.
Terminology influence over algorithm preference and use
Research on terminology has focused on how it affects user perceptions, perceived algorithm characteristics, and tolerance for errors. However, to the best of our knowledge, little attention has been paid to the influence of terminology on the actual use of and preference for algorithms. Existing studies suggest that terminology influences perceptions of system properties and valuations of automated decision-making (ADM) systems across different application contexts (Langer et al., 2022). Furthermore, comparisons between terms such as “AI” and “algorithm” reveal that people react more favorably to “AI” errors than to “algorithm” errors, suggesting that terminology can mitigate or exacerbate aversion (Candrian and Scherer, 2024). Although a broad range of terms has been examined in prior literature, the present study focuses specifically on the contrast between “algorithm” and “chatbot.” The other labels included in our design serve primarily as contextual benchmarks, but our central interest lies in these two widely used terms.
While most prior research on terminology has focused on perceived trust, competence, or system transparency, there is growing evidence that labels can directly influence user preferences. For example, Langer et al. (2022) suggested that users evaluate automated systems differently based solely on terminology. Similarly, Candrian and Scherer (2024) found that labels influence both interpretations of system failure and willingness to continue using the system. Teoh (2020) demonstrated that the names assigned to driving features (e.g., “Autopilot” vs. “Driving Assistant”) influenced perceived capabilities and driver behavior. Smale et al. (2024) reported that consumers preferred products labeled “smart” over “AI-powered,” despite functional equivalence. Additionally, Park et al. (2024) found that human-like labels increased chatbot preference and engagement.
These findings suggest that terminology activates mental models, expectations, and affective responses that shape user preferences, even when functionality remains constant. This is consistent with the AI debiasing literature, which emphasizes the role of framing and labeling in shaping trust and reliance (Shin, 2025). Terminology may also influence perceived usefulness and ease of use, key components of the Technology Acceptance Model (Davis, 1989), by framing systems as either competent tools or simplistic conversational agents. Based on this literature, we predict users will prefer the term “algorithm” over “chatbot” due to associations with objectivity, competence, and technical credibility.
Beyond psychological outcomes, terminology functions as a meaning-making device. From a sociolinguistic perspective, labels are not neutral descriptors; they act as signifiers that shape interpretation through cultural and contextual associations (Gee, 2014; Halliday, 1978). Communication theory likewise highlights how framing devices guide expectations, authority, and meaning (Entman, 1993; Tannen, 1993). Thus, terms such as “algorithm” and “chatbot” frame the interaction itself, invoking distinct scripts and assumptions.
Importantly, terminology also influences behavior. Prior studies have shown that framing affects advice-taking and compliance with system recommendations (Berger et al., 2021; Candrian and Scherer, 2024). For example, people were more likely to follow advice labeled as algorithmic than human (Logg et al., 2019), and less likely to engage with services labeled as chatbots, unless a failure occurred, in which case the label softened the negative response (Mozafari et al., 2022). Park et al. (2024) reported comparable framing effects on compliance in mental health settings.
These patterns are consistent with the literature on framing effects (Keysar et al., 2012; Kim et al., 2022; Tversky and Kahneman, 1981), which posits that the presentation of information affects interpretation and response. Terminology acts as such a frame, shaping mental models of system reliability, competence, and human-likeness (Gentner and Stevens, 2014; Johnson-Laird, 1983). From a social cognition standpoint, users rely on cues such as labels to form rapid judgments about AI systems (Fiske and Taylor, 1991). Decades of cognitive psychology research have documented that exposure to a term can activate associated schemas and expectations, thereby shaping interpretation and behavior (Neely, 1991; Huete-Perez et al., 2025). Thus, exposure to terms such as ‘algorithm’ may activate schemas associated with analytical rigor, which not only shape interpretation but also influence behavior through rapid heuristic judgments—an important insight for understanding framing effects in human–AI decision-making.
In this sense, “algorithm” and “chatbot” act not only as frames but also as semantic primes that cue distinct mental pathways: analytic competence for the former and conversational interaction for the latter. A term such as “chatbot” may prime associations with scripted or limited tools, while “algorithm” evokes technical accuracy. These cognitive and affective heuristics, in turn, influence both attitudes and behavior toward the system. More broadly, different terms evoke different levels of trust and acceptance. For instance, “AI” may suggest a more anthropomorphized and competent system compared to “algorithm,” which might be perceived as more familiar (Langer and Landers, 2021). These perceptions are crucial in determining user trust and reliance on algorithmic systems, particularly in contexts where the system's perceived capabilities are critical. Moreover, it has been proposed that individuals might feel uneasy about terms such as “algorithm” or “artificial intelligence.” Nonetheless, they often appreciate the results of these systems, provided they are unaware of their algorithmic origins (Mariadassou et al., 2024). This suggests that aversion may stem from the terminology itself rather than the quality or functionality of the system's output, reinforcing the importance of how these systems are framed and communicated to users.
In this context, the term “chatbot” has gained prominence due to its widespread application across various domains (Rapp et al., 2021). Despite this prevalence, to the best of our knowledge, no previous research has examined how the term “chatbot” influences user perceptions compared to the term “algorithm.” Chatbots, often powered by probabilistic models based on NLP, may present challenges in terms of explainability and transparency because their complex and adaptive behaviors are less predictable, despite recent advancements in their capacity to engage with humans. In contrast, systems referred to as “algorithms” are typically perceived as deterministic and more transparent, potentially making them more understandable and trustworthy to users. This contrast raises the question of how “chatbots” are evaluated in comparison to “algorithms,” and whether one is preferred or more frequently used by individuals. Investigating these perceptions is crucial for designing technologies that users are willing to adopt and trust.
Building on the questions outlined above, this study employs a multistep analysis that focuses on the impact of terminology on user preference and usage. Specifically, we investigate whether “algorithm” or “chatbot” is more favorably received and whether this influences the likelihood of users following advice provided by these systems. Accordingly, we propose the following hypothesis: Hypothesis 1: Terminology influences both use and preference, reflecting intra-algorithm framing effects, such that (a) individuals prefer the term “algorithm” over “chatbot,” and (b) individuals are more likely to follow advice when it is referred to by the term “algorithm” rather than “chatbot.”
The moderating effects of task over algorithm use
Task characteristics are a crucial and extensively researched factor influencing algorithm aversion (Castelo et al., 2019). A task's complexity, importance, and context can significantly affect users' aversion to algorithms (Mahmud et al., 2022). Kaufmann et al. (2023) found algorithm aversion in at least 75% of the tasks they examined. This aversion is observed across a range of domains, such as finance and healthcare, and is particularly influenced by task type, user expertise, and feedback on algorithm performance. Furthermore, users rely more on algorithms for objective and quantifiable tasks, whereas tasks requiring moral judgments elicit aversion due to the algorithms’ perceived lack of human-like qualities (Bigman and Gray, 2018). Additionally, people's perceptions of fairness, trust, and emotional responses vary depending on whether decisions are made by humans or algorithms and the nature of the task (Lee, 2018; Starke et al., 2022).
Rational decision-making is typically preferred for tasks with clear, evaluable outcomes. In contrast, intuitive decision-making is favored for subjective and simpler tasks (Inbar et al., 2010). Trust in algorithms decreases for subjective tasks but increases when these tasks are framed as objective (Castelo et al., 2019). Moreover, the alignment between the agent and the task type influences user preferences, with human agents being favored for social tasks and nonhuman agents for analytical tasks (Hertz and Wiese, 2019). This is further supported when considering that algorithm aversion is more prevalent in subjective domains (Logg, 2017).
We examine the potential differences that terminology might generate across two tasks. We consider that the term “chatbot” may follow recent trends of anthropomorphizing technology and may elicit more human-like characteristics, as its interface is designed to assist and interact with users in a more personal and engaging manner, and may therefore perform better than an “algorithm” in subjective tasks. Therefore, we propose: Hypothesis 2: Terminology has differential effects depending on the task's objectivity, such that (a) the relationship between the chosen terminology and preference for an AI system will be moderated by task objectivity, with “algorithms” being preferred over “chatbots” for tasks with high objectivity and less preferred for tasks with lower objectivity, and (b) the relationship between the term and use will be moderated by task objectivity, with “algorithms” being used more than “chatbots” for tasks with high objectivity and less used for tasks with lower objectivity.
These hypotheses aim to investigate the impact of terminology on algorithm use and preference, particularly in tasks with varying degrees of objectivity.
Overview of studies
The primary objective of this research is to understand how terminology influences the preference for and use of algorithms and chatbots, compared to other terms, in tasks with varying perceived objectivity. The research was conducted through two experiments, and although our study's users do not interact directly with the AI, the setting reflects a common and realistic decision-support scenario where individuals evaluate or follow AI-generated advice.
The research focuses on the terms “algorithm” and “chatbot,” while also incorporating other commonly used terms from the literature, such as “artificial intelligence,” “automated system,” “expert system,” “robot,” and “statistical model” (Langer et al., 2022; Langer and Landers, 2021). The terms were selected to ensure both contextual relevance and alignment with previous research. Terms such as “automated system,” “robot,” and “statistical model” were drawn from Langer et al. (2022), while others, including “algorithm,” “chatbot,” “artificial intelligence,” and “expert system,” were chosen for their prevalent use in both research and real-world applications. Although several terms were included in our design, our empirical focus centered on “algorithm” versus “chatbot” for both theoretical and empirical reasons: the former suggests precision and formality, the latter informality and limited capability. This distinction, as supported by Langer and Landers (2021), enabled us to examine the most meaningful contrast while maintaining analytical clarity.
Two tasks were selected to represent subjective and objective scenarios. Specifically, the tasks presented to all participants involved measuring the romantic attraction a given person would feel for another (Logg et al., 2019) and predicting a future stock price (Chacon, Kausel et al., 2025; Önkal et al., 2009). Based on a study from Castelo et al. (2019), selecting a possible romantic partner is a subjective task, and predicting stocks is an objective task. This is further supported by Logg et al. (2019), who used a romantic task to represent a subjective decision-making context. The distinction between the two tasks follows Prahl and Van Swol's (2021) demonstrability continuum: Tasks such as stock predictions are considered objective, with verifiable answers, whereas romantic attraction tasks involve subjective judgment and do not have a single correct outcome. Schaap et al. (2024) reinforce this by showing that people perceive finance as more objective and better suited for algorithms than dating-related decisions.
All supplementary materials are stored on an OSF project and are available at: https://osf.io/gwv6n. The OSF project includes surveys, code, data, sample size calculation using G*Power, and preregistration for each study.
Study 1
Sample and procedure
The first study involved 115 undergraduate students from a midsized university. Seventy percent of participants identified as male, and the average age was 24 years (SD = 4). Participation was voluntary, and students received course credit for completing the survey, regardless of their responses. The study was conducted online. Two attention checks were included; participants who failed the first check received a warning and were allowed to continue with the experiment, but responses from participants who failed either check were excluded from the analysis, resulting in a final sample of 113 participants.
This study utilized a two-factor within-subjects design, with one factor being the term used to name the system and the other being the task. Participants were exposed to seven different terms across two distinct tasks with varying levels of objectivity. They were asked to indicate their preference for either a human performing the task or the corresponding term (e.g., “algorithm,” “chatbot”), using a scale from 1 to 7, where 1 indicated a complete preference for a human, and 7 indicated a complete preference for the term.
This experiment drew partial inspiration from the third experiment conducted by Longoni et al. (2019), in which participants expressed a preference between a human provider and an AI provider. Our experiment expanded this approach by asking participants to state their preference between a human provider and each of several specific terms, using the 1 to 7 scale described above. The order in which the terms were shown was randomized to avoid potential order biases.
Participants were presented with two tasks to assess their decision-making and preferences in different contexts. The first task involved a participant measuring the romantic attraction a given person might feel for another person, a subjective decision-making task, while the second task required predicting a future stock price, which represents a more objective task. Attention checks and demographic questions were included to ensure data quality and gather participant characteristics.
Variables
Several variables were used to test the different hypotheses. The first variable is preference, the dependent variable of this experiment. It is a Likert scale variable ranging from 1 to 7, where 1 indicates a complete preference for a human performing the task, and 7 indicates a complete preference for the term (e.g., “algorithm,” “chatbot,” etc.). This scale was based on a study from Longoni et al. (2019).
The second variable is term, a categorical independent variable that indicates the terminology used to describe the system that provides the advice. It has seven possible values, not only focusing on “algorithm” and “chatbot” but also including other common terms (i.e., “artificial intelligence,” “automated system,” “expert system,” “robot,” “statistical model”) to make a more comprehensive study of differences brought by terminology.
The third variable is task, an independent binary variable representing one of two possible tasks. The tasks presented to all participants involved measuring the romantic attraction a given person would feel for another and predicting a future stock price. The choice of tasks intentionally highlighted the differing levels of objectivity and examined how term preferences might shift across these contexts.
Analytical strategy
In this study, we tested different hypotheses through several procedures. First, a mixed-effects regression was performed with preference as the dependent variable and six dummy encoded variables representing term as the main independent variables. This analysis allows us to test Hypothesis 1.a. Afterward, we conducted a mixed-effects regression with preference as the dependent variable and task, six dummy encoded variables representing term, and their interaction as the main independent variables to test Hypothesis 2.a. Finally, a mixed-effects regression was performed by splitting the data between tasks to more directly visualize possible differences in preference for terms within each task. All tests used “algorithm” as the term reference level and financial task as the task reference level. This study only measured preference, not use; therefore, no hypotheses or conclusions about usage were tested.
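To make the analytical strategy concrete, the sketch below shows one way these mixed-effects models can be specified in Python with statsmodels. It is a minimal illustration assuming a long-format data file with hypothetical column names (participant_id, preference, term, task); the actual analysis code and data are available in the OSF project.

```python
# Minimal sketch of the Study 1 models (illustrative only; column names are
# hypothetical and the original scripts are stored in the OSF project).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study1_long.csv")  # hypothetical long-format file

# Model (1): preference regressed on term, with "algorithm" as the reference
# level and a random intercept per participant.
m1 = smf.mixedlm(
    "preference ~ C(term, Treatment('algorithm'))",
    data=df,
    groups=df["participant_id"],
).fit()

# Model (2): adds task (reference level: financial) and the term x task interaction.
m2 = smf.mixedlm(
    "preference ~ C(term, Treatment('algorithm')) * C(task, Treatment('financial'))",
    data=df,
    groups=df["participant_id"],
).fit()

print(m1.summary())
print(m2.summary())

# Models (3) and (4): the term-only model refit separately within each task.
for task_name in ["financial", "romantic"]:
    sub = df[df["task"] == task_name]
    m_task = smf.mixedlm(
        "preference ~ C(term, Treatment('algorithm'))",
        data=sub,
        groups=sub["participant_id"],
    ).fit()
    print(task_name, m_task.summary())
```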
Results and discussion
The results of the three analyses are presented in Table 1. In model (1), we report the results of the first mixed-effects regression, where preference is the dependent variable and six dummy variables representing term are the independent variables. Model (2) shows the second mixed-effects regression, which includes preference as the dependent variable and both task and six dummy variables representing term as independent variables; interaction results are displayed in the lower rows of the table. The results of model (2) can be visualized in Figure 1, which displays the distribution of the predicted preference over both tasks. Lastly, models (3) and (4) present the regressions performed separately for each task, with model (3) showing the results for the financial task and model (4) for the romantic task.

Figure 1. Dot plots of predicted preference distribution for the financial task (left panel) and the romantic task (right panel).
Table 1. Mixed-effects regression results on the impact of terms and task on preference in Study 1.
Note: N = 113. Term was treated as a dummy encoded variable representing seven possible categories: artificial intelligence, algorithm, automated system, robot, chatbot, statistical model, and expert system. For term, algorithm was the base category. Task was treated as a dummy variable, where 0 indicates financial and 1 indicates romantic. * p < .05; ** p < .01; *** p < .001. Models (1) and (2) use observations from both tasks, whereas Model (3) uses only observations from the financial task, and Model (4) relies solely on observations from the romantic task.
Given the conceptual contrast and consistent response divergence between “algorithm” and “chatbot,” our analysis focused on these two terms to test our primary hypotheses. The results for model (1) show that “algorithm” was significantly preferred over “chatbot” (b = −1.451, p < .001), strongly supporting Hypothesis 1.a. Additionally, “algorithm” was significantly preferred over “automated system” (b = −0.434, p = .009) and “robot” (b = −1.062, p < .001). Conversely, “statistical model” was preferred over “algorithm” (b = 0.341, p = .041).
The results for model (2) show a significant interaction between “chatbot” and the romantic task (b = 1.204, p < .001), indicating that the size of the preference gap between “algorithm” and “chatbot” varies by task. However, “algorithm” was preferred over “chatbot” in both the financial task (b = −2.053, p < .001) and the romantic task (b = −0.850, p < .001). These results therefore do not support Hypothesis 2.a.
Models (3) and (4) provide further insights into the preference for “algorithm” compared to other terms, besides “chatbot,” within each task. In model (3), “statistical model” was preferred over “algorithm” (b = 0.416, p = .018), while “algorithm” was preferred over “automated system” (b = −0.504, p = .004) and “robot” (b = −1.239, p < .001). In model (4), “algorithm” was also preferred more than “robot” (b = −0.885, p < .001).
Throughout Study 1, participants showed a consistent preference for “algorithm” over “chatbot.” This supports Hypothesis 1.a and indicates that terminology can have a significant impact on preference. The results of all models are robust after controlling for age and gender. To further examine both preference and actual use, Study 2 was conducted with a larger sample.
Study 2
Sample and procedure
The second study recruited 568 participants from Prolific, focusing on native English speakers with high prior performance ratings on the platform. Participants had an average age of 36.3 years (SD = 13) and held a diverse range of occupations. The experiment was conducted online, and participants received a monetary payment upon completion. Two attention checks were included; participants who failed the first check received a warning and were allowed to continue with the experiment. Participants who failed both attention checks were excluded, resulting in a final sample of 553 participants. Furthermore, a measure of tech affinity was added (Langer et al., 2022), allowing us to control for familiarity with technology (Tully et al., 2025).
Participants were randomly assigned to one of five terms and one of two tasks. While Study 1 included seven terms, Study 2 used a reduced set of five to improve clarity and statistical power and to reduce cognitive load. “Automated system” and “expert system” were excluded based on a combination of factors: the conceptual distinctiveness of each term, their limited variation in user preferences in Study 1, and the need to reduce cognitive and analytical complexity in the between-subjects design of Study 2. In contrast, we retained the term “artificial intelligence” due to its high salience in public discourse and its relevance in the human–machine interaction literature. This streamlined set of labels also sharpened our focus on the core comparison between “algorithm” and “chatbot,” which remains central to our theoretical and empirical contribution.
The tasks remained identical to those in the earlier experiment, and participants were asked to answer two questions. The first asked them to indicate their preference for either a human or the assigned term to perform the task. This question mirrored the one used in Study 1, which was adapted from Longoni et al. (2019), except that only the assigned term was presented when eliciting participant preference.
As a second question, participants were required to make a prediction related to their assigned task. After making an initial prediction, they received advice attributed to the assigned term and then submitted their final prediction. The recommendation was the same for all participants within a task; only the term differed. This procedure was similar to that of Chacon, Kausel et al. (2025). This framework allowed us to compare differences elicited solely by changes in terms and to compare differences between tasks. Both questions were asked twice of all participants, and analyses were performed using the average of the two responses.
This study employed a two-factor between-subjects design with task and term as independent variables. Demographic variables were also collected, and two attention checks were included to ensure response quality. Responses were excluded if participants failed both attention checks.
Variables
Several variables were used to test the different hypotheses. The first variable is preference, the first dependent variable: a Likert scale variable ranging from 1 to 7, where 1 indicates a complete preference for a human performing the task and 7 indicates a complete preference for the term. This question was asked twice, and the average of the two responses was used for analysis.
The second variable is Weight of Advice (WOA), a dependent variable that measures the extent to which a person takes the advice provided under the corresponding term. This measure has been used in previous studies (Chacon, Reyes et al., 2025; Logg et al., 2019) and is computed as:

WOA = (revised estimate − initial estimate) / (advice − initial estimate)

where the initial estimate is the participant's initial prediction, the advice is the value recommended by the term, and the revised estimate is the participant's final value. Observations in which the advice equaled the initial estimate were discarded. Values closer to 1 represent greater advice-taking, whereas values closer to 0 represent less advice-taking. WOA values higher than 1 or lower than 0 were also discarded, resulting in 438 observations for WOA. The WOA serves as a proxy for behavioral inclination rather than for sustained usage or reliance, but it is sufficient to provide evidence of differences in use caused by terminology.
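As an illustration, the WOA measure described above can be computed as follows; the column names (initial_estimate, advice, final_estimate) are hypothetical, and the actual data and scripts are available in the OSF project.

```python
# Illustrative computation of the Weight of Advice (WOA); column names are hypothetical.
import pandas as pd


def weight_of_advice(initial, advice, final):
    # WOA = (final - initial) / (advice - initial).
    # Undefined when the advice equals the initial estimate; such observations
    # were discarded in the study (returned here as NaN).
    if advice == initial:
        return float("nan")
    return (final - initial) / (advice - initial)


df = pd.read_csv("study2_predictions.csv")  # hypothetical file name
df["woa"] = df.apply(
    lambda row: weight_of_advice(
        row["initial_estimate"], row["advice"], row["final_estimate"]
    ),
    axis=1,
)

# Keep only WOA values within [0, 1], as described above.
df = df[df["woa"].between(0, 1)]
```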
The correlation between the two dependent variables, preference and WOA, was modest but significant, with a Pearson correlation of 0.305 and a Spearman correlation of 0.281 (both p < .001), indicating a consistent but partial relationship between stated preference and behavioral reliance on the assigned term's recommendation.
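A check of this correlation can be sketched along the following lines, continuing from the WOA computation above and assuming the data frame also contains the averaged preference column.

```python
# Correlation between stated preference and WOA (continuing from the sketch above;
# "preference" is assumed to hold each participant's averaged preference rating).
from scipy.stats import pearsonr, spearmanr

valid = df.dropna(subset=["preference", "woa"])
r_pearson, p_pearson = pearsonr(valid["preference"], valid["woa"])
r_spearman, p_spearman = spearmanr(valid["preference"], valid["woa"])
print(f"Pearson r = {r_pearson:.3f} (p = {p_pearson:.3g})")
print(f"Spearman rho = {r_spearman:.3f} (p = {p_spearman:.3g})")
```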
The third variable is term, an independent categorical variable with five values: “artificial intelligence,” “algorithm,” “robot,” “chatbot,” and “statistical model.” Each participant was randomly assigned one of these terms.
The fourth variable is task, an independent binary variable representing one of two possible tasks: romantic or financial. Each participant was randomly assigned to one of these tasks. The study also included demographic data, such as age and gender.
Analytical strategy
This study followed a similar approach to Study 1 but employed a two-factor between-subjects design and ordinary least squares (OLS) regression instead of mixed-effects regression. The first analysis used two OLS regressions to test Hypotheses 1.a and 1.b: the first regression used preference as the dependent variable and four dummy variables representing term as the independent variables, and the second used WOA as the dependent variable with the same independent variables.
The second analysis involved interaction effects. Two OLS regressions were conducted to check for potential interactions: the first used preference as the dependent variable, with four dummy variables representing term, task, and their interaction as the independent variables; the second used WOA as the dependent variable with the same predictors. These analyses tested Hypotheses 2.a and 2.b.
The third and final analysis was a task-specific analysis. Separate OLS regressions were performed for each task to visualize differences between terms within each task. All tests used “algorithm” as the term reference level and financial task as the task reference level.
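As with Study 1, the following sketch illustrates how these OLS models can be specified in Python with statsmodels; it assumes hypothetical column names (preference, woa, term, task) rather than reproducing the exact scripts stored in the OSF project.

```python
# Minimal sketch of the Study 2 OLS models (illustrative; column names are hypothetical).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study2.csv")  # hypothetical file name

TERM = "C(term, Treatment('algorithm'))"   # "algorithm" as the reference level
TASK = "C(task, Treatment('financial'))"   # financial task as the reference level

# First analysis: main effect of term on preference and on WOA (Hypotheses 1.a and 1.b).
pref_main = smf.ols(f"preference ~ {TERM}", data=df).fit()
woa_main = smf.ols(f"woa ~ {TERM}", data=df).fit()  # rows with missing WOA are dropped

# Second analysis: term x task interactions (Hypotheses 2.a and 2.b).
pref_interaction = smf.ols(f"preference ~ {TERM} * {TASK}", data=df).fit()
woa_interaction = smf.ols(f"woa ~ {TERM} * {TASK}", data=df).fit()

# Third analysis: term effects estimated separately within each task.
for task_name in ["financial", "romantic"]:
    sub = df[df["task"] == task_name]
    print(task_name)
    print(smf.ols(f"preference ~ {TERM}", data=sub).fit().summary())
    print(smf.ols(f"woa ~ {TERM}", data=sub).fit().summary())
```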
Results and discussion
The results for both preference and WOA follow the same structure and are each presented in separate tables. Table 2 presents the results for preference, and Table 3 presents the results for WOA; both tables share the same structure. In Table 2, model (1) reports the results of the first OLS regression, where preference is the dependent variable and four dummy variables representing term are the independent variables. Model (2) shows the second OLS regression, which includes preference as the dependent variable and both task and four dummy variables representing term as independent variables; interaction results are displayed in the lower rows of the table. Lastly, the regressions performed separately by task are presented in models (3) and (4). Model (3) shows the regression with preference as the dependent variable and four dummy variables representing term as the independent variables, considering only values for the financial task. Model (4) presents the same regression but only for the romantic task. This structure is repeated in Table 3, but with WOA as the dependent variable. The distribution of predicted preference and WOA values across different tasks for all terms is visualized in Figure 2.

Figure 2. Distribution of predicted preference (upper row) and WOA (lower row) for the financial task (left column) and the romantic task (right column) across different terms in Study 2.
Table 2. Regression results on the impact of terms and task on preference in Study 2.
Note: N = 553. Term was treated as a dummy encoded variable representing five possible categories: artificial intelligence, algorithm, robot, chatbot, and statistical model. For term, algorithm was the base category. Task was treated as a dummy variable, where 0 indicates financial and 1 indicates romantic. * p < .05; ** p < .01; *** p < .001. Models (1) and (2) use observations from both tasks, whereas Model (3) uses only observations from the financial task, and Model (4) relies solely on observations from the romantic task.
Table 3. Regression results on the impact of terms and task on WOA in Study 2.
Note: N = 553. Term was treated as a dummy encoded variable representing five possible categories: artificial intelligence, algorithm, robot, chatbot, and statistical model. For term, algorithm was the base category. Task was treated as a dummy variable, where 0 indicates financial and 1 indicates romantic. * p < .05; ** p < .01; *** p < .001. Models (1) and (2) use observations from both tasks, whereas Model (3) uses only observations from the financial task, and Model (4) relies solely on observations from the romantic task.
Regarding the preference dependent variable, the results of the first analysis in Table 2, model (1), revealed that “algorithm” was preferred over “chatbot” (b = −0.592, p = .009), supporting Hypothesis 1.a. In the second analysis, shown in Table 2, model (2), the interaction between term (“chatbot” vs. “algorithm”) and task on preference was not significant (b = 0.522, p = .214) and thus did not support Hypothesis 2.a. The final analysis, performed separately by task and presented in Table 2, models (3) and (4), provides more specific results: “algorithm” was significantly preferred over “chatbot” in the financial task (b = −0.815, p = .010), whereas in the romantic task there was no significant difference in preference between “algorithm” and “chatbot” (b = −0.293, p = .298).
For the WOA dependent variable, the first analysis, shown in Table 3, model (1), indicates that advice labeled as coming from an “algorithm” was followed significantly more than advice from a “chatbot” (b = −0.143, p < .001), “artificial intelligence” (b = −0.116, p = .003), or “robot” (b = −0.095, p = .019). These results support Hypothesis 1.b. In the second analysis, concerning the interaction between the “chatbot” and “algorithm” terms and task, no significant effect was found, as shown in Table 3, model (2); therefore, Hypothesis 2.b was not supported. This finding is consistent with the task-specific results in Table 3, models (3) and (4). In the financial task, “algorithm” was used more than “chatbot” (b = −0.133, p = .059), although this difference was only marginally significant, and significantly more than “artificial intelligence” (b = −0.119, p = .040), “robot” (b = −0.122, p = .043), and “statistical model” (b = −0.118, p = .049). In the romantic task, “algorithm” was used significantly more than “chatbot” (b = −0.136, p = .004) and “artificial intelligence” (b = −0.104, p = .026). These results further indicate that “algorithm” was used more than “chatbot” in both tasks, with no significant difference between tasks.
Finally, these results are robust after controlling for age and gender. Tech affinity had a significant effect when included as a control in model (1) for preference, but the difference between “algorithm” and “chatbot” remained significant; tech affinity was not significant in any other model.
General discussion
The present research investigated the influence of terminology on user preference and behavior, specifically comparing the terms “algorithm” and “chatbot” and how these effects vary based on task objectivity. The studies tested whether individuals prefer one term over the other and whether terminology affects the likelihood of users following advice provided by these systems. We also explored whether task objectivity moderates these patterns. The results underscore the importance of terminology—particularly the contrast between “algorithm” and “chatbot”—in shaping user trust and behavior.
Two experiments were conducted to explore these questions. Study 1 involved 115 undergraduate students who were presented with two tasks: predicting stock prices (objective) and measuring romantic attraction (subjective). Participants indicated their preference between a human and various terms, including “algorithm” and “chatbot,” for performing each task. These two terms were of particular interest due to their theoretical and practical contrast in perceived technicality and formality. Study 2 involved 568 Prolific participants randomly assigned to one of five terms, again including algorithm and chatbot, and one of the two tasks. Participants expressed their preferences and engaged in a decision-making exercise where they received advice from the assigned term.
The results consistently supported Hypotheses 1.a and 1.b: individuals preferred the term “algorithm” over “chatbot,” and they were more likely to follow advice when it was referred to as an “algorithm.” However, Hypotheses 2a and 2b, which posited that task objectivity would moderate these effects, were not supported. The preference for “algorithm” over “chatbot” was consistent across both objective and subjective tasks, indicating that terminology influences user preference and behavior regardless of task type.
Theoretical contributions and implications
The findings contribute to the expanding literature on algorithm aversion (for a recent review, see Chacon and Kaufmann, 2025), particularly within human–AI interaction. While prior studies have explored how terminology influences user responses to system failures (Candrian and Scherer, 2024) and the attributes users ascribe to AI systems (Langer et al., 2022), this research introduces a new layer by demonstrating that the terminology used to describe generative AI systems has a significant impact on user preferences and behaviors during their interactions with these systems. Specifically, the findings reveal that “algorithm” is consistently preferred over “chatbot” across tasks with varying levels of objectivity, underscoring the importance of linguistic framing in deploying generative AI systems.
Our findings extend the concept of the framing effect (Keysar et al., 2012; Kim et al., 2022; Tversky and Kahneman, 1981) by demonstrating that differences in system terminology—such as labeling a system an “algorithm” versus a “chatbot”—can activate distinct mental models (Gentner and Stevens, 2014), influencing both preference and behavioral engagement. This indicates that users’ responses to AI are influenced not only by its functionality but also by the language used to present the system. This perspective aligns with sociolinguistic and framing perspectives, which suggest that terminology does not passively reflect meaning but actively constructs it through shared interpretive frameworks (Entman, 1993; Gee, 2014).
Interestingly, the present study hypothesized that task objectivity would moderate the effect of terminology, expecting greater acceptance of “algorithm” in objective tasks and more skepticism in subjective ones. However, preferences remained consistent across both task types. This null finding suggests that framing may override task context: “chatbot” likely carries negative connotations (Gnewuch et al., 2017), while “algorithm” may signal competence and neutrality (Logg et al., 2019), leading to stable preferences regardless of domain. Rather than depending on mere task characteristics, the influence of terminology appears to arise from enduring interpretive biases. This highlights the role of labels in steering trust and compliance with algorithmic systems, even when functionality remains constant.
Another explanation is that users, regardless of the task type, perceive AI as lacking emotional or intuitive judgment and thus prefer labels that sound more analytical (Sundar, 2020). This suggests that the framing effect reflects not just task-specific reasoning but deeper heuristics related to anthropomorphism and trust in AI (Waytz et al., 2014). These findings highlight how linguistic biases can persist even when task context should, in theory, shift user expectations, extending research on AI adoption and communication.
Beyond framing effects, terminology can also be understood through the lens of semantic priming. Cognitive psychology research shows that words automatically activate related concepts in memory, shaping expectations and judgments even before conscious interpretation occurs (Neely, 1991). For example, the term “algorithm” is likely to evoke associations with precision, data, and objectivity, whereas “chatbot” primes associations with informality, anthropomorphism, or superficiality. This perspective suggests that the mere choice of label may bias user responses at an automatic, associative level, not only through interpretive framing.
From a theoretical standpoint, these results have relevant implications for future research. First, they emphasize the importance of researchers carefully selecting the terminology used to describe AI systems, as these choices can directly impact study outcomes. The results of our studies may be explained by the characteristics elicited by the terminology or the transparency; therefore, aligning the terminology with the research objectives and the system's intended characteristics is essential to ensuring the validity and reliability of findings. Second, these insights are particularly relevant for organizations aiming to enhance human–AI interactions. The terminology used in system design and deployment plays an important role in fostering trust, engagement, and productivity, ultimately shaping the effectiveness of these partnerships.
Beyond these cognitive and communicative mechanisms, our findings also contribute to CDS. This field emphasizes that algorithms and data systems are not only technical artifacts but also cultural and organizational constructs that acquire legitimacy through discourse, classification, and labeling (Bowker and Star, 2000; Gillespie, 2014; Nick, 2017; Pasquale, 2015). By showing that simply naming an identical system “algorithm” rather than “chatbot” increases user preference and compliance, our results empirically demonstrate how labels function as performative elements of governance. In this sense, terminology is not a neutral wrapper but a constitutive part of algorithmic power: it configures visibility, credibility, and trust in ways that shape how humans interact with ADM systems. This insight complements CDS debates on algorithmic imaginaries, legitimacy, and accountability by grounding them in behavioral evidence from controlled experiments.
With the increasing prevalence of AI chatbots (Kim and Hur, 2024; Wang et al., 2024), an intriguing direction for future research lies in investigating user preferences and behaviors across a broader spectrum of the terminology used to describe AI chatbots. Expanding this line of inquiry could explore the nuanced effects of alternative labels (e.g., “virtual assistant,” “AI companion”) on user expectations, trust, and willingness to engage with these systems. Moreover, examining how these preferences evolve across different domains, such as education, customer service, or healthcare, could provide valuable insights for tailoring the design and deployment of generative AI technologies to specific contexts.
These findings extend work on algorithm aversion and advice-taking in psychology, while also offering insight into how language mediates human–AI interaction in sociotechnical systems. In this way, the study contributes to ongoing research in algorithmic accountability, AI framing, semantic priming, and user behavior within data-driven environments, thereby supporting interdisciplinary inquiry into how people interpret and respond to opaque, ADM systems.
Practical implications
The results have significant implications for designing and implementing algorithmic systems in various industries. Organizations aiming to promote user adoption of algorithmic systems should consider the terminology they use to describe these technologies. By opting for terms that users prefer, such as “algorithm,” companies can enhance user engagement and increase users’ likelihood of following system advice. For example, in financial services, presenting a recommendation system as an “algorithm” rather than a “chatbot” could lead to higher acceptance of automated investment advice. However, practitioners should be mindful that while terminology influences advice-taking, WOA does not equate to sustained system usage or long-term behavioral reliance; therefore, further monitoring of continued use is warranted.
Organizations should leverage the preference for “algorithm” by framing their algorithmic systems accordingly. Conversely, if a system is labeled as a “chatbot,” users might be less inclined to follow its recommendations, potentially undermining the system's effectiveness. These findings align with recent guidance in the AI design and policy literature, which emphasizes that promoting calibrated trust in algorithmic systems requires not only technical accuracy but also thoughtful framing, explainability, and transparency in user communication (Shin, 2025).
Although task objectivity did not moderate the terminology effect in this study, understanding the role of tasks remains important in real-world applications. Organizations should still consider the nature of the tasks when deploying algorithmic systems. While terminology might have a consistent impact, tailoring the system's presentation and communication style to the task context can further enhance user experience and acceptance.
Limitations
One limitation of this research is the focus on only two tasks: predicting stock prices (objective) and measuring romantic attraction (subjective). These tasks were intentionally selected based on prior studies that differentiate between relatively objective forecasting scenarios and highly subjective interpersonal judgments (e.g., Castelo et al., 2019; Logg et al., 2019; Prahl and Van Swol, 2021; Schaap et al., 2024). However, we did not include a direct manipulation check to empirically confirm whether participants perceived these tasks as differing in objectivity. Incorporating such a measure in future studies could offer additional insights by capturing individual variation in task interpretation. Expanding the task set and validating perceived objectivity would further enrich our understanding of how task characteristics shape responses to algorithmic advice.
A further limitation of this study is that we did not assess participants’ mental models or prior interpretations of the terms used. No manipulation check was included to confirm how participants interpret terms such as “algorithm” versus “chatbot.” Individuals likely held varied understandings of these labels, which may have influenced their preferences and behaviors. For example, the term “algorithm” might signal precision to some people but opacity to others, while “chatbot” may suggest either helpfulness or superficial automation. To strengthen future work, a direct manipulation check, such as a semantic differential scale, could be added to better capture how participants construe AI-related terminology. As Shin (2024) argues, users construct representations of algorithmic systems through social cues, framing, and past experience—mental models that evolve rather than remain fixed. Future research should investigate how these models influence trust and usage, as well as how system design can support more accurate and calibrated expectations. It may also be useful to explore how individual traits—such as personality or cognitive style—moderate these effects.
Conclusion
This research found that the terminology used to describe algorithmic systems (e.g., generative AI systems)—specifically “algorithm” versus “chatbot”—significantly influences user preferences and behavior in human–AI interactions. Across two studies, participants consistently preferred the term “algorithm” over “chatbot” and were more likely to follow advice labeled as coming from an “algorithm,” regardless of whether the task was objective or subjective. This suggests that the influence of terminology is robust across various types of tasks, highlighting how language can impact the dynamics of interactions between humans and AI.
The findings underscore the crucial role of language in shaping human perceptions, engagement, and acceptance of algorithmic systems. Organizations should consider using preferred terms such as “algorithm” when introducing AI systems to foster trust and enhance user engagement, thereby maximizing the potential of AI–human partnerships. As humans increasingly interact with AI in both professional and personal contexts, carefully selecting terminology can facilitate more productive and positive interactions. Future research should examine a broader range of tasks and explore users’ mental models and preconceived notions about generative AI terminology to better understand the underlying mechanisms behind these effects and their implications for human adaptation and behavioral change in AI interactions.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: ANID/FONDECYT/Iniciación 11250210 and ANID/FONDECYT/Regular 1251286.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
