Abstract
This article investigates under which conditions users on Twitter engage with or react to social bots. Based on insights from human–computer interaction and motivated reasoning, we hypothesize that (1) users are more likely to engage with human-like social bot accounts and (2) users are more likely to engage with social bots which promote content congruent to the user’s partisanship. In a preregistered 3 × 2 within-subject experiment, we asked
Introduction
Communication on social media platforms is not always generated by human users. So-called
Because previous research connects some of these social bots to a plethora of malicious activities, such as the promotion of misinformation (Wang et al., 2018), political astroturfing (Keller et al., 2020), and attempts to influence election outcomes (Bessi and Ferrara, 2016; Ferrara, 2017; Schäfer et al., 2017), it is important to scrutinize the mechanisms of their effects. In this study, we are interested in malicious social bots that try to disguise their automated nature by blending in with human online activities (e.g. “liking” or “retweeting”).
While previous work has already investigated the effects of bots on social networks (Cheng et al., 2020; Keijzer and Mäs, 2021; Ross et al., 2019), in this article we offer a psychological perspective on how deceitful and manipulative social bots and social media users engage with each other through following, retweeting, quoted retweeting, and commenting on Twitter. We argue that investigating such engagement with social bots is crucial to better understand how social bot communication affects not only social media networks but also the user’s contribution to the amplification of malicious social bot communication.
To this end, we experimentally investigate two factors: the
Based on previous results on the interplay of humanness and partisanship (Wischnewski et al., 2021; Yan et al., 2020), we assume that the effects of humanness and partisanship influence each other. Opinion congruency might be more pronounced for highly human-like accounts, whereas opinion congruency might matter less when accounts are less human-like. Moreover, we want to know
Investigating how users engage with varying degrees of social bot humanness and partisanship, we contribute to a better understanding of the assumed effects of social bots on online communication networks. In particular, the results of our study help to gain a deeper understanding of the users’ contribution to the amplification of malicious social bot content. Our results can also help to design countermeasures and inform policymakers and social media providers alike.
Theoretical background
To understand the effects of social bots on users and, ultimately, on society, different strategies have been pursued, for example, modeling approaches that simulate the (hypothetical) impact of social bots in social networks through (multi)agent-based modeling (Cheng et al., 2020; Keijzer and Mäs, 2021; Ross et al., 2019) or epidemiological models of contagion (Mønsted et al., 2017). For example, employing an agent-based model of the spiral of silence, Ross et al. (2019) found that, under certain circumstances, social bots can alter the opinion climate of a communication network. Another approach is to investigate how social bots spread information in online networks (Gorodnichenko et al., 2018; Salge et al., 2021). For example, Salge et al. (2021) observed information dissemination by social bots in the 2013 Brazilian Confederation Cup riots. Relying on conduit brokerage, the authors derived a theoretical model of information dissemination, incorporating an algorithmic process of social alertness and social transmission. In doing so, the authors thoroughly explain the complex processes that constitute information dissemination by social bots.
While these approaches provide meaningful conclusions on how deceitful and manipulative social bots can affect social networking platforms as a whole, they do not offer insights into how individual social media users might (unwillingly) promote social bot accounts. Moreover, knowing how individual users perceive and engage with social bots is crucial to inform modeling approaches. For example, Ross et al. (2019) employed two agent-based models with varying degrees of social bot influence on individual users. The authors found that, depending on the social bot influence on individual users, group-level effects varied: Bots with low influence on users were more effective in sparse networks, whereas bots with high influence on users were more effective in dense networks. Similarly, Salge et al. (2021: 4) were interested in the study of “bots taking action to disseminate information, regardless of whether the information they disseminate is received or not by the actors.”
To overcome this limitation, in this study we want to know how much direct influence social bots exert on individual users. While the direct persuasive effect of social bots on individual users’ opinions is difficult to assess, we argue for investigating, as a first step, how social bots and users engage with each other. Such user engagement can be studied from different perspectives and at varying depths. In this study, however, we limit engagement to selected actions that individual users can take on Twitter, namely commenting, following, retweeting, and quoted retweeting. With engagement as a first proxy for a possible social bot influence, we follow the assumption “that the interaction with other individuals (or a group) [here: social bots] may affect or change subjects’ thoughts, feelings, or behavior” (Luceri et al., 2019: 2). Empirical evidence supports this view. For example, investigating political persuasion on social media, Diehl et al. (2016) found that, besides news use, users’ social interactions with each other positively affected attitude formation.
While not many studies have experimentally investigated under which circumstances users engage with social bots (see, for example, Edwards et al., 2014; Spence et al., 2018), previous observational evidence confirms that users do, indeed, engage with social bots (Cardoso et al., 2019; Wagner et al., 2012). Trying to identify which users are more susceptible to engaging with social bots, Wagner et al. (2012), for example, found that more interactive users were more likely to reply to and befriend social bots. Similarly, Wald et al. (2013) found that users’ Klout score and total number of followers predicted the likelihood of users replying to or following a social bot. Results by Cardoso et al. (2019) were especially alarming: The authors showed that “[o]ne in three posts reshared by humans is an original content created by bots” (Cardoso et al., 2019: 2).
Although these results indicate that users frequently engage with social bots, other findings suggest that this engagement is less likely if users suspect an account to be a social bot. For example, Edwards et al. (2014) found that, while users rated social bots as similarly credible and competent compared with human accounts, social bots were perceived as less socially and task attractive. In another study, Edwards et al. (2015) found that users display higher uncertainty, less liking, and less social presence when communicating with a social bot than with a human user.
Considering these results, we suggest that users show an engagement preference for human-like social bot accounts over accounts perceived as social bots. Central to this claim is our hypothesis that the
Because engagement is not limited to users initiating engagement, we also investigate how users react when accounts initiate engagement. Similar to H1, we assume that users are less likely to react to low human-like accounts and are more likely to react to high human-like accounts:
Influence of users’ and profiles’ partisanship on engagement activities
Besides the humanness of accounts (high vs medium vs low human-likeness), we suggest that specific opinions expressed by accounts increase or decrease the likelihood of users engaging with them. Findings in the context of political communication and political information processing, in particular, have shown that opinion-congruent information is more likely to lead to engagement on social media than opinion-incongruent information (Colleoni et al., 2014; Garz et al., 2020). Such effects have led researchers to assume the emergence of so-called echo chambers and filter bubbles within social media, suggesting that users become encapsulated only with like-minded views, reinforcing existing beliefs (Barberá et al., 2015; Colleoni et al., 2014). However, this notion has recently been challenged (Bruns, 2019) and refined by Kitchens et al. (2020).
The psychological mechanisms driving this preferential behavior have been associated with motivated reasoning. Motivated reasoning generally proposes that information processing is driven by either accuracy or directional goals (Kunda, 1990). While the motive of accuracy goals is to arrive at an accurate conclusion, directional goals aim to arrive at predefined conclusions that support previous attitudes or identities. In an early study, Kunda (1987) found that female heavy coffee consumers were less convinced about the harmful effects of caffeine than female low coffee consumers.
Motivated reasoning has also been found for engagement activities of users on social media (Cinelli et al., 2020), indicating that users were more likely to engage with like-minded users, leading to homophilic interaction patterns. We assume that this preference for engagement with like-minded users, rooted in motivated reasoning, should also occur when engaging with social bots. However, we assume a differentiated pattern, depending on the engagement activities. For the case of Twitter, activities such as
Similar to H2, engagement can also be initiated by another account. In the same manner as H3a, we assume that motivated reasoning would affect this process so that users are less likely to react to opinion-incongruent accounts and more likely to react to opinion-congruent accounts:
To answer RQ1 (how does the perceived humanness and partisanship of a social media account influence users’ willingness to engage with and react to it?), we have so far suggested that both the perceived humanness of accounts (H1 & H2) and partisanship (H3 & H4) drive users’ engagement with an account. However, previous studies have found that motivated reasoning also affects how users perceive the humanness of profiles. Results indicate that users perceive opinion-congruent accounts as more human-like, whereas opinion-incongruent accounts are perceived as less human-like and more bot-like (Wischnewski et al., 2021; Yan et al., 2020). Building on these results, we take our assumptions in H1–H4 one step further and suggest a mediating role of account perceptions. We hypothesize that
In addition, humanness (H1/H2) and users’ partisanship (H3/H4) might interact such that opinion-congruent preferences are more pronounced when accounts are perceived as more human-like, whereas opinion congruency might matter less when accounts are perceived as less human-like. Independent of an account’s displayed partisanship, users might generally be less likely to engage with accounts of low humanness. This would mark an important boundary condition of motivated reasoning, indicating that users do not blindly engage with any account on social media just because it shares their partisanship. Hence, we included RQ2 in our study: Does the perceived humanness of an account interact with the congruency of the displayed account partisanship and users’ partisanship?

Mediation hypothesis (H5).
Finally, we want to know
All hypotheses and research questions were preregistered prior to data collection and are publicly available via OSF (https://osf.io/w42ca/).
Method
The ethical committee of the University of Duisburg-Essen approved the study. The data set, stimulus material, analysis, and Supplementary Material are publicly available on OSF: https://osf.io/w42ca/.
Sample
To test our hypotheses, we recruited 223 US American Twitter users from the crowd-sourcing platform Prolific. The target sample size of 220 was determined through an a priori power analysis (for details, see preregistration). Participants’ age ranged from 18 to 75 (
Experimental design and procedure
We conducted an online experiment, using a 3 × 2 within-subject design, with two independent factors,
To ensure all participants had the same understanding of social bots, we provided a general definition of social bots before the experiment. After that, participants viewed 18 different Twitter profiles in a randomized order. For each profile, participants were asked (1) how likely they would engage (retweet, follow, quoted retweet, and comment) with the profile, (2) how they would react if the profile engaged with them, (3) which motivations drove their engagement intentions, and (4) how automated they perceived the profile to be. Finally, the experiment asked about participants’ basic demographic data, social media usage, time spent on social media, Twitter usage, political interest, and partisanship.
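The randomized within-subject presentation can be sketched as follows. This is an illustrative reconstruction in Python, not the software actually used to run the experiment; the function name and condition labels are our own.

```python
import itertools
import random

# 3 x 2 within-subject design: every participant sees all conditions.
HUMANNESS = ["low", "medium", "high"]      # factor 1: human-likeness
PARTISANSHIP = ["democrat", "republican"]  # factor 2: displayed partisanship
PROFILES_PER_CELL = 3                      # 3 profiles per condition -> 18 trials

def build_trial_order(seed=None):
    """Return all 18 profile conditions in a fresh random order."""
    rng = random.Random(seed)
    trials = [
        (humanness, partisanship, profile)
        for humanness, partisanship in itertools.product(HUMANNESS, PARTISANSHIP)
        for profile in range(1, PROFILES_PER_CELL + 1)
    ]
    rng.shuffle(trials)  # randomized order per participant
    return trials

order = build_trial_order(seed=42)
assert len(order) == 18 and len(set(order)) == 18
```

Each participant thus works through every cell of the design, which is what licenses the repeated measures analyses reported later.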
Stimulus material
Constituting the first factor,
For the second factor, congruency, we manipulated the political partisanship expressed in each profile, representing either a Republican or a Democrat account. Each profile’s partisanship was matched with the participant’s partisanship, resulting in two factor levels: opinion-congruent (Democratic Twitter profile and self-identified Democrat/Republican Twitter profile and self-identified Republican) and opinion-incongruent (Democratic Twitter profile and self-identified Republican/Republican Twitter profile and self-identified Democrat).
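The congruency level is therefore not a property of the profile alone but of the profile–participant pairing. A minimal sketch of this coding (function name and labels are ours, not part of the study materials):

```python
def congruency(profile_party: str, participant_party: str) -> str:
    """Code the second factor: a profile is opinion-congruent when its
    displayed partisanship matches the participant's self-identification."""
    valid = {"democrat", "republican"}
    if profile_party not in valid:
        raise ValueError(f"unknown profile partisanship: {profile_party}")
    if participant_party not in valid:
        raise ValueError(f"unknown participant partisanship: {participant_party}")
    return "congruent" if profile_party == participant_party else "incongruent"
```

For example, a Republican profile shown to a self-identified Democrat is coded as "incongruent".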
For each of the six conditions, we created three Twitter profiles consisting of 10 posts each, resulting in 18 Twitter profiles overall. Each Democrat account had a matching Republican account, displaying similar features concerning the follower/followee ratio, posting times, and posting behavior. An example of a low human-like Democratic profile and the corresponding low human-like Republican profile is shown in Figure 2.

Examples of two low human-like profiles, depicting a Democrat profile (left) and a Republican profile (right).
Measures
After viewing each Twitter profile, we asked participants several questions concerning their intention to engage with the respective profile. First, we wanted to know how likely participants would be to engage with a profile. Hence, for each of the four engagement activities (retweet, follow, quoted retweet, and comment), we asked, “How likely is it that you would [activity] this account?” Answers were given on a 5-point Likert-type item, ranging from 1 =
To assess how profile perceptions affect the relationship of opinion congruency and engagement intentions, we measured the
Control variables
Previous studies have shown that multiple variables, besides the variables of interest in our study, can affect how users perceive and engage with other users online. For example, Wischnewski et al. (2021) found that younger participants and participants who spent more time on social media and who knew about social bots showed a greater motivated reasoning bias. Hence, we included a one-item measure to assess how much time participants spent on social media (“less than 1 hour,” “1–3 hours,” “4–7 hours,” or “more than 8 hours”), and participants’ previous
In addition, we assessed participants’
Results
User-initiated engagement with social bots
The main interest of this study was the effects of humanness and partisanship on different engagement activities. Hence, we preregistered four repeated measures analyses of variance (ANOVAs), including predefined control variables, for all four different engagement intentions (following, retweeting, commenting, and quoted retweeting). As hypothesized, we found a significant main effect of humanness (H1) for following,
Mean engagement likelihoods and standard deviations for all four engagement activities.
Results of the mixed multinomial regression analysis.
Moreover, in H3a, we hypothesized that participants would show a motivated reasoning bias for the endorsement activities following and retweeting (main effect of congruency). Results of the repeated measures ANOVAs supported this hypothesis. We found a significant main effect of opinion congruency for following,
In addition to the main effects of humanness and partisanship congruency, we wanted to know whether the effect of partisanship congruency was different for different levels of humanness (RQ2). Over all four engagement activities, we found that the effect of congruency was dependent on the level of humanness, following:

Mean engagement likelihoods for retweeting, following, quoted retweeting, and commenting.
Reactions to social bot-initiated engagement
Similar to users initiating engagement with social bot accounts, social bot accounts can initiate engagement with users by following user accounts and commenting on, retweeting, or quoted retweeting user posts. Visually inspecting the descriptive outcomes of the four reactive engagement decisions (Figure 4), we observed several overall trends. Confirming the user-initiated engagement results (see the previous section), incongruent Twitter profiles received similar (dis-)engagement reactions, independent of the level of perceived humanness.

Count data of reaction decisions for each of the four behaviors (retweeting, commenting, following, and quoted retweeting).
In contrast, reactions to congruent profiles depended on the level of humanness. Fewer participants indicated that they would report/block highly human-like accounts, and more participants indicated that they would react to such accounts. Notably, participants most likely simply ignored congruent accounts of low and medium humanness. For highly human-like accounts, the reaction depended on the initiating behavior: While following, retweeting, and quoted retweeting were still likely to be ignored, comments by human-like accounts were the most likely to elicit a reaction from participants.
To corroborate the visual analysis, we conducted mixed multinomial regressions. Because the visual analysis showed that participants were most inclined to ignore any engagement behavior, we used “nothing” as the baseline category of the dependent variable, which we compared with the decisions to “react” and to “block/report.” For the factor congruency, “congruent” served as the baseline level; for the factor humanness, the “ambiguous” (medium human-likeness) level served as the baseline. Control variables were included in each model. Coefficients and standard errors of all models can be found in Table 2.
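The coding scheme behind these models can be sketched as follows. This is an illustrative reconstruction with hypothetical names; the actual analyses were mixed multinomial regressions with participant-level random effects, which this sketch only prepares data for, not fits.

```python
# Baseline coding for the mixed multinomial regressions:
# the modal reaction "nothing" is the reference outcome; "congruent"
# and the "ambiguous" (medium-humanness) level are the reference
# levels of the two predictors.
REACTIONS = ["nothing", "react", "block/report"]  # index 0 = baseline

def dummy_code(trial):
    """Map one trial to (outcome index, predictor dummies).

    trial: dict with keys 'reaction', 'congruency', 'humanness'.
    Returns the outcome category (0 = nothing, 1 = react,
    2 = block/report) and dummies contrasting each non-reference
    predictor level against its baseline.
    """
    outcome = REACTIONS.index(trial["reaction"])
    dummies = {
        "incongruent": int(trial["congruency"] == "incongruent"),
        "human_low": int(trial["humanness"] == "low"),
        "human_high": int(trial["humanness"] == "high"),
    }
    return outcome, dummies
```

With this coding, each regression coefficient expresses how a predictor shifts the log-odds of reacting (or blocking/reporting) relative to doing nothing.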
Results of the mixed multinomial regression models were similar for all initiating activities and support the visual analysis. Compared with medium human-like (ambiguous) Twitter accounts, low human-like accounts were more likely to be blocked/reported but equally (un)likely to be reacted to. Compared with ambiguous accounts, highly human-like accounts were more likely to be reacted to and less likely to be blocked/reported. Moreover, congruent profiles were more likely to be reacted to and less likely to be blocked than incongruent profiles.
Profile perception as the driver of engagement decisions
As suggested by motivated reasoning theory, we assumed that matching the participants’ partisanship and the displayed partisanship of the account (i.e. congruent) would drive the decision to engage with this account. We found this hypothesis supported (see “User-initiated engagement with social bots” section). In addition, we wanted to know whether we can explain this effect through the participants’ perception of the account. Hence, we hypothesized that partisan congruency indirectly affected the engagement likelihood through a biased perception of the account (H5).
To test H5, we employed mediation analyses, using the PROCESS macro for SPSS by Hayes (2017). We initially preregistered mediation analyses only for the endorsement activities (following and retweeting) because we expected these activities to be affected by motivated reasoning. However, the analysis in section “User-initiated engagement with social bots” indicated that commenting and quoted retweeting were also affected by partisanship congruency. Hence, we also conducted mediation analyses for commenting and quoted retweeting, assuming the same mediating role of profile perception.
Overall, for engagement activities, the mediation analyses supported H5 only for low human-like and highly human-like accounts but not for medium human-like (ambiguous) accounts. For both low and highly human-like accounts, incongruent profiles were less likely to be engaged with (significant negative c-paths). They were also perceived as less likely to be human (significant negative a-paths). In turn, being perceived as less human decreased the likelihood of engagement with an account (significant negative b-paths). For both low human-like and highly human-like accounts, including the perception of an account partially explained the effect of congruency on engagement (significant indirect effects). All path coefficients and confidence intervals are reported in Table S5 in the Supplementary Material.
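The reported analyses were run with the PROCESS macro in SPSS. As an illustration of the underlying logic, a minimal percentile-bootstrap estimate of the indirect effect (the product of the a- and b-paths) might look like this in Python; the function name, defaults, and variable names are our own assumptions, not the authors' implementation.

```python
import numpy as np

def indirect_effect_ci(x, m, y, n_boot=5000, seed=0):
    """Percentile-bootstrap CI for the indirect effect a*b in a simple
    mediation model x -> m -> y, estimating each path with OLS."""
    rng = np.random.default_rng(seed)
    x, m, y = (np.asarray(v, dtype=float) for v in (x, m, y))
    n = len(x)

    def ab(idx):
        xs, ms, ys = x[idx], m[idx], y[idx]
        # a-path: slope of the mediator regressed on the predictor
        a = np.polyfit(xs, ms, 1)[0]
        # b-path: coefficient of the mediator when the outcome is
        # regressed on intercept, predictor, and mediator
        design = np.column_stack([np.ones(len(idx)), xs, ms])
        b = np.linalg.lstsq(design, ys, rcond=None)[0][2]
        return a * b

    point = ab(np.arange(n))                                   # full-sample estimate
    boots = np.array([ab(rng.integers(0, n, n))                # resample with replacement
                      for _ in range(n_boot)])
    lo, hi = np.percentile(boots, [2.5, 97.5])
    return point, (lo, hi)
```

Here x would be the (in)congruency coding, m the perceived humanness of the profile, and y the engagement likelihood; an indirect effect whose 95% CI excludes zero corresponds to the "significant indirect effects" reported above.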
Similar to low and highly human-like accounts, we found for medium human-like (ambiguous) accounts that congruency increased both engagement likelihoods and human-like profile perceptions (significant negative c′- and a-paths for incongruent profiles). However, we did not find that the profile perception affected the likelihood of engagement (non-significant b-paths). Trying to understand this null effect, we first inspected the means and standard deviations of profile perceptions for ambiguous profiles:
Engagement motivations
As a follow-up to the findings above, we also wanted to know
To understand whether engagement motivations depended on the level of humanness of an account and/or the congruency of an account, we ran repeated measures ANOVAs with the two within-subject factors, humanness and partisanship congruency, and the previously mentioned control variables. Across all motivations, we consistently found main effects of humanness and partisanship congruency and an interaction effect of both. Planned contrasts revealed that the pattern of motivations mirrored the engagement and reaction results. For incongruent accounts, the level of humanness did not matter: Most engagement motivations were equally low, independent of an account’s humanness. In contrast, for congruent accounts, most motivations increased with increasing human-likeness. For a detailed report of the
Discussion
Drawing on insights from previous research on human–social bot interaction and motivated reasoning theory, we hypothesized that the likelihood that users engage with social bot accounts depends on two factors: the humanness of an account and the partisanship displayed by the account. In doing so, we differentiated between users initiating engagement with social bots and users reacting to engagement initiated by social bots. The behaviors of interest were the engagement activities following, retweeting, commenting, and quoted retweeting. In addition to the direct effect of the partisanship displayed by an account, we also hypothesized that partisanship indirectly affects engagement likelihoods by altering how profiles are perceived (mediation hypothesis). Finally, to better understand why users chose to engage or not, we also explored users’ engagement motivations.
Through repeated measures ANOVAs, we found for all four engagement activities of interest (following, retweeting, commenting, and quoted retweeting) the expected effects of humanness and partisanship congruency as well as an interaction of both. These results indicated (1) that highly human-like accounts were more likely than medium and low human-like accounts to receive engagement from participants and (2) that this was only true for congruent accounts. In contrast, accounts that did not share participants’ partisanship were highly unlikely to receive engagement from participants. Hence, our study highlights that Twitter users are more willing to engage with human-like accounts, especially when they share the same political partisanship.
Similarly, when investigating how likely participants would react to accounts, only human-like, congruent accounts were likely to receive any engagement. However, it was most likely that participants would not react at all when an account initiated engagement. Independent of the level of humanness, incongruent accounts were most likely to be either blocked/reported or ignored. This implies that only very sophisticated social bots, which can successfully disguise their automated nature, are likely to engage with or receive engagement from users.
Moreover, we found that the impact of partisanship congruency was dependent on the level of humanness. The effect of partisanship congruency was largest for highly human-like accounts, smaller for medium human-like accounts, and smallest for low human-like accounts. This implies that users do not “blindly” engage with any account which shares their political partisanship but incorporate their perception of the humanness of the account into their engagement decision.
Results for the mediation hypothesis support this. Here, we revealed that the effect of congruency was partly due to biased humanness perceptions, indicating that congruent profiles were perceived as more human-like, which, in turn, led to an increased likelihood of user engagement. Similar to the engagement results, this suggests that users are unlikely to react to accounts that are clearly social bots and most likely ignore or block/report these accounts. However, we did not find this effect for profiles that fell neither into the clearly social bot category (low human-like accounts) nor into the clearly human category (highly human-like accounts). We assume that, due to the ambiguous nature of these accounts, participants needed to engage in more deliberative processing, which has previously been found to reduce the effect of motivated reasoning (Pennycook and Rand, 2019).
The results concerning partisanship congruency, in particular, confirm previous findings on motivated reasoning and homophilic patterns in social networks (Colleoni et al., 2014; Garz et al., 2020; Mosleh et al., 2021). Our results add that, in the context of social bot accounts, this pattern is partly due to biased perceptions of profiles, with partisan-congruent profiles being perceived as more human-like (see also Wischnewski et al., 2021; Yan et al., 2020). However, our results also show limitations of this effect: Partisanship congruency mattered the least for bot-like accounts. Given that previous studies indicate that human-like social bot accounts are likely to be rare (Assenmacher et al., 2020), we conclude that the impact of social bots is likely to be overestimated. In fact, our results suggest that most accounts showing low to medium levels of humanness are likely to be ignored if they are congruent or blocked/reported if they are incongruent. However, this also implies that, as soon as social bots are developed well enough to successfully disguise their automated nature, users become increasingly susceptible, especially if accounts are tailored to support specific partisan views.
These results for humanness extend previous findings on user–social bot engagement, which conventionally do not differentiate between levels of social bot humanness (Cardoso et al., 2019; Wagner et al., 2012). For example, Cardoso et al. (2019) found that an increasing number of users interact with social bots and share content from social bot accounts. However, the authors did not differentiate between levels of humanness of these accounts. Our findings indicate that, besides the strong impact of partisan congruency, user engagement with social bots is most likely driven by highly human-like bots.
Finally, by investigating different engagement motivations, we could also show that the effects of humanness and partisanship congruency are reflected in users’ motivations to engage. If accounts were incongruent, participants were generally not motivated to engage. If accounts were congruent, participants were most motivated to engage when the accounts were also human-like. In addition, we found that the different levels of humanness and partisanship (in)congruency of a profile affected all engagement motivations equally, except for the motivation to share a quoted retweet. Here, we found that the motivation “I want to argue by adding my own opinion to a retweet” was not affected by the displayed partisanship of the profile. This supports our initial hypothesis that different engagement activities are affected differently by partisanship congruency: Activities that do not imply endorsement, such as commenting or sharing a quoted retweet, should be less affected by the displayed partisanship.
Limitations and future work
The discussed results come with methodological and theoretical limitations. Because we chose Twitter as the social media platform under study, our conclusions are restricted to it. Similarly, we can only make claims about political partisanship in the US context. Different dynamics might occur when transferring our experimental setup to other cultures or to other polarizing contexts. For example, while partisans in the US are less likely to engage with each other (Finkel et al., 2020), other contexts might show different engagement patterns.
Moreover, participants in our study were immediately confronted with Twitter profiles. Consequently, suspicious behavior such as repetitive retweeting was immediately evident. In their everyday social media browsing, however, participants are more likely to come across single posts of accounts. In addition, we measured participants’ engagement intentions but not actual behavior. While previous research suggests that intentions are generally a good indicator of behavior, research on the intention–behavior gap suggests that, under certain circumstances, behavior deviates from intention (see Sheeran and Webb, 2016, for a review). Furthermore, the introductory definition of social bots might have primed participants to be more aware of a possible bot presence than they would be in a real scenario. To overcome these limitations and increase ecological validity, field experiments similar to Mosleh et al. (2021) are necessary to strengthen our findings.
Theoretically, our argumentation relies on the assumption that social bots exert influence through communication with users. In doing so, we imply that users actively engage with or react to social bots. This assumption bears at least two limitations. First, active engagement is not a necessary but only a sufficient requirement for influence. Especially findings on
Besides these methodological and theoretical limitations, our work also holds important social implications. In particular, our results suggest that, with the increased sophistication of social bots (in other words, an increased robotization of social media users), the line between “real” human users and “automated” users becomes increasingly blurred, which can lead to feelings of alienation and to the dehumanization of human users (Fortunati et al., 2019).
Conclusion
In this article, we wanted to know under which conditions Twitter users engage with and react to social bots. We found that highly human-like social bots were most likely to receive user engagement and were also most likely to receive reactions when initiating engagement with users. We also found that this was only true for accounts that shared the same partisanship as the user. Thus, users prefer to engage with and react to highly human-like accounts that share their political opinion. Moreover, the effect of partisanship congruency decreased for accounts displaying medium or low levels of humanness, indicating that users do not blindly engage with any account that shares their political partisanship. We conclude that the impact of social bots on individual users is nuanced and most likely overestimated: Social bots only receive engagement if they succeed in disguising their automated nature.
Acknowledgements
We would like to thank Carolina Alves de Lima Salge and Björn Ross as well as both anonymous reviewers for their helpful comments.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the German Research Foundation (DFG) under Grant No. GRK 2167, Research Training Group “User-Centred Social Media.”
