Abstract
The aim of this study was to examine whether the use of affective language by politicians impacts the social evaluation of their image, as measured by the semantic differential method developed by Cwalina et al. In the study, Polish participants (N = 958) evaluated the profiles of three well-known Polish politicians from different parties: Bosak (right-wing), Trzaskowski (center), and Biedroń (left-wing). Participants made evaluations before and after reading hateful, kind or neutral tweets about refugees from Ukraine. We controlled for participants’ political views and demographic variables. The results confirmed changes in evaluations of the image of politicians depending on the language they used. The ‘positivity bias’ identified in the study encourages the use of kindness speech in public discourse, showing that this type of communication can improve a politician's ratings.
In the view of many researchers (Farkas & Bene, 2021; Grofman, 2005; Wattenberg, 1991), we are living in an era of candidate-centered politics. This means that the electorate's attention shifts from political parties to specific candidates running for various political offices. The shift is accompanied by the growing importance of a candidate's individual characteristics, which make up his or her image: a particular type of representation that can impart additional values to an object by evoking emotional associations (Cwalina et al., 2015). The reception of an image is affected both by its core features, the most significant ones for its evaluation, and by peripheral features related to the politician's additional activity, lifestyle, and manner of communicating with the public. Contemporary social discourse promotes a negative image of reality, and the statements of public figures reinforce and legitimize the use of language marked by hateful emotions (Bilewicz & Soral, 2020). Much rarer in public debate is the so-called ‘kindness speech’, which expresses positive emotions and a friendly attitude towards one's interlocutor (Cavazza, 2017). The language used by politicians influences the assessment of their credibility (Frimer & Skitka, 2018) and the affective polarization of social attitudes (Druckman et al., 2019), and can also build (or lower) trust in the political system and institutions (Van't Riet & Van Stekelenburg, 2022).
The aim of this study was to examine whether the social evaluation of a politician's image is affected by their use of hate speech and kindness speech in political posts on X (Twitter). We chose Twitter as the platform to analyze because it is often the primary political platform in Poland and because, unlike other social networks (such as Facebook or Instagram), it facilitates short and direct messages that are comparatively impulsive and affective. Twitter's character limit constrains discourse complexity and inhibits reflexivity, and the platform tends to encourage uncivil behavior due to its impersonal nature (Ott, 2017; Salimi & Mortazavi, 2024).
Hate Speech and Kindness Speech in Politics
Hate speech, an extreme form of derogatory language (Cervone et al., 2021), is used by journalists, commentators and internet users, but also (and perhaps especially) by politicians, whether in debates in parliament, in the media or in their election campaigns (Cwalina & Koniak, 2023). Its emotional basis is contempt, which triggers anger and revulsion (Winiewski et al., 2017). Hate speech plays an enormous role in present-day social life and is a tool in political struggle and in public debate, excluding the possibility of dialogue and understanding (Obrębska, 2020). The number of hateful messages is steadily rising, especially on the Internet, which offers a sense of impunity and anonymity (Cervone et al., 2021). In the digital environment, hate speech emphasizes group identity and denigrates people in the ‘out-group’ (Papcunová et al., 2023). It carries a number of negative consequences on many levels: it not only leaves trauma in people who are discriminated against (Leets & Giles, 1999; Mullen & Smyth, 2004; Swim et al., 2009), but also changes the way in which minority groups are perceived by the rest of society (Cervone et al., 2021). After contact with verbal aggression, previously tolerant people may change their attitude towards discriminated groups, assuming that if these groups are evaluated in this way by the majority (or by political leaders), they must be to some extent responsible for the unfriendliness that they encounter. Studies on desensitization (Allen et al., 2018; Fasoli et al., 2015; Soral et al., 2018) show that mere exposure to hate speech strengthens the inclination towards dehumanization and that, after some time, such speech is no longer considered offensive, shocking or a violation of social standards.
The use of aggressive language in politics can be beneficial. When talking about social categories disliked by listeners, politically incorrect speakers are perceived as more authentic (Rosenblum et al., 2019); negative messages decrease the perceived warmth of the politician while simultaneously increasing perceived competence (Carraro & Castelli, 2010); and the use of vulgar language and a sarcastic tone when referring to political opponents in an election context signals a politician's strength and determination to win (Ntontis et al., 2024). At the same time, Rossini et al. (2020), in an analysis of comments on immigration posted on Facebook during the 2016 US presidential primaries, found that when candidates used uncivil language, the number of uncivil comments posted by other users on their profiles also increased. The use of uncivil language can therefore result in an increase in affective polarization, which may in turn lead to more extreme views.
Much more rarely encountered in public debate is what may be called ‘kindness speech’, expressing positive emotions and a kindly attitude to one's interlocutor. Yet research shows many positive consequences of this type of communication, including improved interpersonal sensitivity (Williams et al., 2018), well-being (Cotney & Banerjee, 2019), and even psychological welfare (Cregg & Cheavens, 2022; Shin et al., 2020). In relation to minority groups who were described in a kind manner, researchers observed increased sympathy and more positive assessments (Cervone et al., 2021). Much research has been devoted to the role of flattery in political campaigns. Cavazza (2017) showed experimentally that praising a political opponent evokes positive emotions in the audience, which in turn positively affects source reliability and ultimately enhances the likelihood of voting for that source. Thus, kindness speech fosters positive evaluation of the persons using it, but, in line with the negativity bias effect (Baumeister et al., 2001; Holleman et al., 2021; Rozin et al., 2010), its impact is weaker than that of negative language.
Demographic variables, such as the gender of the politician or the audience (Schultz & Pancer, 1997), race and religion (McLaughlin & Thompson, 2016), ethnicity and age (Flannelly, 2002), sexual orientation (Beyerlein & Klocek, 2020), and similarity of political views (Frimer & Skitka, 2018), also play an important role in assessing the language used by politicians. Schultz and Pancer's (1997) study examined perceptions of political candidates (male and female) making negative statements about the personality and integrity of their political opponents. Participants who read a speech by a candidate of their own gender were more likely to rate the candidate as having greater integrity when the candidate attacked his or her opponent than when he or she did not. When judging a candidate of the opposite sex, participants tended to rate the candidate who attacked his or her opponent as having less integrity than a candidate who did not. The authors discuss these results in terms of the impact that aggressive campaign tactics may have on voters’ perceptions, and how similarity between voters and candidates may influence perceptions of such tactics.
Affective messages may thus play a complex, interactive role in various social behaviors, as is well explained by the classical Affect Infusion Model (AIM) of Joseph Forgas (2001, 2002) and its more recent validations (Fehrenbacher, 2017; Ling et al., 2023; Pérez-Fernández et al., 2022; Winkielman et al., 2007). The phenomenon of affect infusion occurs as a result of affect priming, a process in which affectively marked information has an impact on an individual's cognitive and behavioral processes, becoming a part of them and coloring them with a specific affective valence. Research has shown that affect infusion occurs most frequently during the active processing of new, ambiguous, and complex information, because broader, more complex processing increases the probability of subconscious inclusion of affectively colored information in the process of planning one's behaviors. Social evaluations are an example of complex substantive processing which is often ambiguous and requires personal engagement, and which is susceptible to affect infusion and affect priming effects.
The effect of emotionally marked messages on the social evaluation of the speaker can also be explained by two other mechanisms: spontaneous trait transference (STT; Carlston & Skowronski, 2005) and transfer of attitudes recursively (TAR; Gawronski & Walther, 2008). Spontaneous trait transference occurs when communicators are perceived as possessing the very traits they describe in others. It reflects simple associative processes that occur even when there are no logical bases for making inferences. The transfer of attitudes recursively, on the other hand, refers to the feedback effect of another person's pronounced evaluation on the formation of a similar attitude towards the speaker. Gawronski and Walther (2008) consider this to be an attributional process. When people hear a person (e.g., a politician) make repeated insulting statements about others, they infer that a person who has a negative and arrogant attitude towards others deserves such treatment themselves. And vice versa: people who like others acquire a positive valence.
Based on the psychological mechanisms presented, it can be assumed that a positively marked message (kindness speech) will lead to the formulation of more positive social evaluations, and a negative message (hate speech) to more negative ones. This suggests the following research hypotheses:
Hypothesis 1 (H1): Evaluations of the image of individual politicians become more negative following their use of hate speech.
Hypothesis 2 (H2): Evaluations of the image of individual politicians become more positive following their use of kindness speech.
Hypothesis 3 (H3): Evaluations of politicians made before and after the use of hate speech or kindness speech depend on the political views, gender and age of the persons making the evaluations.
The Present Research
The research was experimental and used a 3 (images of right-wing, left-wing, and centrist politicians) × 3 (messages with negative, positive, and neutral emotional marking) design. The dependent variable in the analysis was a politician's image. The primary independent variable was the emotional marking of the tweets (negative and positive), while secondary independent variables were the gender, age, and political views of the study participants. The emotionally neutral message served as a control condition, making it possible to examine whether a change in the evaluation of a politician's image was really due to the emotional marking of the tweets used in the study.
This study was carried out under a research project funded by the Polish National Science Center titled ComPathos: towards the new model of pathos for computational rhetoric (2020/39/D/HS1/00488), and received a positive opinion regarding the ethical problems described in the submitted research project, issued by the Ethics Committee for Research Projects functioning at the Faculty of Psychology and Cognitive Science of Adam Mickiewicz University in Poznań.
Method
Pilot Study
The main study was preceded by a pilot study carried out online, on a group of 105 persons, in April 2021. The purpose of the pilot study was to test the research procedure. The results were published in the article Use of hate speech and social evaluation of a politician's image (Dobrowolska & Obrębska, 2023). As a consequence of the pilot study, the following changes were made to the project: (1) so-called ‘kindness speech’ was included in the project; (2) messages with emotionally neutral marking were included in the project as controls; (3) the list of politicians was updated; (4) it was planned to present the politicians to subjects in random order rather than in a fixed order; (5) the tweets selected for the study were made uniform in terms of their affective valence—positive for kindness speech and negative for hate speech—using methods of computational linguistics.
Participants
A group of 958 participants (53.4% women; Mage = 46, SDage = 16) was recruited for the study. There was an equal number of participants for each of the experimental and control conditions. The largest number of participants was from rural areas (36.64%), followed by participants from medium-sized metropolitan areas (20.46%), large metropolitan areas (17.22%), small metropolitan areas (13.05%), and very large metropolitan areas (12.63%). Almost half of the group had completed higher education (45.72%), followed by those with a high school diploma (including undergraduate students; 39.35%), vocational education (12.42%), and a primary school diploma (2.51%). All subjects were of Polish nationality.
Participants were recruited via the Ariadna research panel (https://panelariadna.pl/), an online platform where users can register and complete questionnaires in exchange for points redeemable for material rewards. Users on this platform may choose to participate in any number of available surveys. Identification is ensured by delivering these rewards to physical addresses provided by users, confirming their identities.
Materials
The Polish politicians selected for study were ascertained by competent judges to be most representative of particular political groupings: Krzysztof Bosak (the right), Rafał Trzaskowski (the center), and Robert Biedroń (the left). The selected politicians are also similar to each other in terms of such demographic variables as gender, age, and level of education. None of the politicians selected is known to use or spread direct hate speech. To confirm if the politicians are comparable in terms of the emotional valence of their usual Twitter (X) activity, we collected a sample of approximately 2,000 tweets for each politician and performed an analysis of valence using the method established in Konat et al. (2024), who applied an automated method based on psychologically established lexicons of emotional words (mainly Wierzba et al., 2015; Wierzba et al., 2022). We found that Rafał Trzaskowski's tweets were predominantly positive (68.7%), with a smaller proportion being neutral (13.0%) or negative (18.3%). Similarly, Robert Biedroń's tweets were largely positive (52.8%), followed by neutral (17.1%) and negative (30.1%). In this context, Krzysztof Bosak's tweets were slightly different, with 41.6% positive, 18.3% neutral, and 40.1% negative tweets. These findings suggest that even though Biedroń and Trzaskowski are in general more positive in their online posting than Bosak, still, for all three candidates the dominant category is positive tweets. This empirical evidence supports our claim that the three selected politicians are comparable and none is associated with abnormal negative activity in the online setting.
The study included specific items to assess participants’ familiarity and political alignment with the politician. Familiarity was measured with a binary question: “Is the presented politician known to you?,” with response options “yes” and “no.” To assess political alignment, participants were asked to rate their agreement with the views of a politician on a five-point Likert scale, from 1 (strongly disagree) to 5 (strongly agree). A midpoint option, “I don't know,” was provided to capture ambivalence or unfamiliarity with the politician's stance. Respondents unfamiliar with the politician were instructed to skip this question. These items provided a basic but direct measure of familiarity and political alignment as relevant to our analysis.
The image of the selected politicians was evaluated using the Polish adaptation of the semantic differential scale developed by Cwalina et al. (2000). The scale comprises 14 bipolar seven-point dimensions formed by pairs of opposite adjectives: “qualified–unqualified,” “cosmopolitan–provincial,” “honest–dishonest,” “believable–unbelievable,” “successful–unsuccessful,” “attractive–unattractive,” “sincere–insincere,” “calm–excitable,” “unaggressive–aggressive,” “strong–weak,” “active–inactive,” “believer–nonbeliever,” “sophisticated–unsophisticated,” and “friendly–unfriendly.”
The material employed in the study consisted of tweets on the topic of Ukrainian refugees in Poland. The topic was selected in view of its current relevance in social discourse and its strong emotional appeal, polarizing Polish society (Wypych & Bilewicz, 2024). We designed three categories of tweets for each politician—one expressing positive sentiment, one negative, and one neutral. The tweets used in the study were inspired by real tweets written by politicians; however, they were fabricated by the researchers for the purpose of the study. We decided to rely on fabricated posts rather than real ones due to several challenges encountered with natural language data. Initially, we collected a sample of over 1,000 tweets for each politician and conducted an automated valence analysis to identify posts with the most emotional content. Based on this analysis, we selected candidate tweets for use in the experiment. However, we found it challenging to control for key variables such as length, concise topic focus, and sentiment consistency. Additionally, it was particularly difficult to identify truly neutral tweets on the topic, as it is uncommon for politicians to produce neutral content on such polarizing issues. To address these challenges, we created fabricated posts inspired by the selected material and informed by our understanding of each politician's typical style. To ensure intersubjectivity and validate the emotional tone of the fabricated posts, we tested their valence using computational linguistic methods.
The validation methodology included two stages—the first using automatic transformer-based classifiers available in the Transformers Python library (Wolf et al., 2020), and the second following lexicon-based sentiment analysis. The first method allows a text sample to be assigned to one of three categories of sentiment based on the full content of tweets. The second involves the extraction of emotion-laden words based on emotional word lists created by psychologists (Imbir, 2016; Riegel et al., 2015; Wierzba et al., 2022). Here, if the number of positive-valenced words was higher than the number of negative-valenced words, the tweet was regarded as positive, and vice versa. In case of an equal number of positive and negative words, we checked the intensity of sentiment available in emotional word lists and chose the higher value as the final category of expressed sentiment. Neutral tweets were required to have no emotional words. Both methods confirmed the category of sentiment expressed in the tweets; that is, tweets designated as positive were confirmed by both methods to express positive sentiment, negative tweets yielded negative labels from both methods, and neutral tweets were given neutral labels by both methods. The choice of the method for validating the valence of tweets is dictated by the nature of the task—judgments of emotional tone of a message are highly subjective, thus the consensus is rarely achieved even with large samples of raters (Gajewska & Konat, 2023). On top of that, the choice of the topic in the material is highly polarizing for Poles (Ukrainian refugees in Poland), therefore, it is highly probable that the judgments of competent judges could have been biased. For that reason, we decided to use more objective and intersubjective methods from computational linguistics, which were thoroughly tested and normalized by psychologists to ensure their high reliability. The tweets used in the study are available in the appendix.
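The lexicon-based decision rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `classify_valence`, the toy English lexicon, and its intensity values are hypothetical (the study used published Polish word lists), and comparing the strongest word on each side is only one possible reading of "chose the higher value" when the counts are tied.

```python
from typing import Dict, Tuple

# Hypothetical toy lexicon: word -> (polarity, intensity).
# These English entries and scores are placeholders for illustration only.
TOY_LEXICON: Dict[str, Tuple[str, float]] = {
    "welcome": ("positive", 0.8),
    "support": ("positive", 0.6),
    "grateful": ("positive", 0.9),
    "threat": ("negative", 0.9),
    "burden": ("negative", 0.7),
    "chaos": ("negative", 0.8),
}


def classify_valence(text: str, lexicon: Dict[str, Tuple[str, float]] = TOY_LEXICON) -> str:
    """Label a text as positive, negative, or neutral from emotion-word counts."""
    # Crude whitespace tokenization with punctuation stripping (an assumption).
    tokens = [tok.strip(".,!?;:\"'()").lower() for tok in text.split()]
    hits = [lexicon[tok] for tok in tokens if tok in lexicon]

    # Neutral texts are required to contain no emotional words at all.
    if not hits:
        return "neutral"

    pos = [score for polarity, score in hits if polarity == "positive"]
    neg = [score for polarity, score in hits if polarity == "negative"]

    # The majority of emotion-laden words decides the category.
    if len(pos) > len(neg):
        return "positive"
    if len(neg) > len(pos):
        return "negative"

    # Equal counts: fall back to intensity (assumed: strongest word per side).
    if max(pos) > max(neg):
        return "positive"
    if max(neg) > max(pos):
        return "negative"
    return "mixed"  # exact tie; the source does not say how this case was handled
```

A text passes validation in the sense described above only when this label agrees with the label returned by the transformer-based classifier.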
In addition, we collected the profiles of the chosen politicians from their X (Twitter) accounts, in order to ask participants about their political alignment with the politician they were evaluating. As distractor material we employed a short animated video from the YouTube platform. The video material had been tested positively as a distractor in the pilot study.
Procedure
The survey was conducted online in April and May 2023. Participants were first informed about the aims of the study and made familiar with the tasks. They were then asked to provide demographic information relating to their sex, age, place of residence, education, and nationality. Each participant was randomly assigned to one of the three politicians in the study. They were then presented with the politician's profile associated with his Twitter account, and asked about their familiarity and political alignment with the politician. Next, participants evaluated the image of the selected politician on the semantic differential scale (Cwalina et al., 2000). A short video was then displayed as a distractor material prior to the second part of the study. Participants answered control questions relating to the content of the video. Analysis of the answers to the control questions also made it possible to determine the level of attentiveness of the participants during the testing procedure.
Each participant was then randomly assigned to one out of three stimulus conditions—a tweet expressing positive, negative, or neutral sentiment. Finally, participants were asked again to evaluate the image of the selected politician on the semantic differential scale. After the study, participants were informed that the tweets presented in the study were fake, not authored by the politicians, and designed only for the purpose of the study.
Statistical Analysis
We used a one-way MANOVA to test the role of tweet valence (positive, negative, neutral) in the ratings of politicians’ image. We then performed two-way MANOVAs to test (a) the influence of tweet valence and the political orientation of the presented politician (right-wing, left-wing, or centrist) on the ratings of politicians’ image, and (b) the influence of political orientation and participants’ gender on those ratings. Post hoc Tukey's HSD tests were performed to identify group differences in such cases. Posttest–pretest comparisons were performed with the paired-samples t-test for each stimulus condition (positive, negative, and neutral). Differences between genders and between politicians on the semantic differential scale were tested with the independent-samples t-test, with Bonferroni correction for multiple comparisons where applicable. Statistical analysis was run in Python, using the Statsmodels library for MANOVA, ANOVA and post hoc Tukey's HSD analysis, and the SciPy library implementations of the t-test, Spearman's rho coefficient, and Kendall's tau-b (τb) rank correlation coefficient. Effect sizes were calculated as Cohen's d.
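To make the posttest–pretest comparison concrete, the sketch below computes a paired-samples t statistic and Cohen's d for one dimension using only the Python standard library. The function names and the toy ratings are hypothetical. The text does not state which variant of Cohen's d was used for paired data, so the pooled-standard-deviation form shown here is an assumption. In practice, `scipy.stats.ttest_rel` returns the same t statistic together with a p-value.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(pre: Sequence[float], post: Sequence[float]) -> Tuple[float, int]:
    """Return the paired-samples t statistic (post minus pre) and its degrees of freedom."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    # t = mean difference divided by its standard error.
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


def cohens_d_pooled(pre: Sequence[float], post: Sequence[float]) -> float:
    """Cohen's d using the average of the two sample variances (an assumed variant)."""
    pooled_sd = math.sqrt((stdev(pre) ** 2 + stdev(post) ** 2) / 2)
    return (mean(post) - mean(pre)) / pooled_sd


# Toy seven-point ratings on one dimension for six hypothetical participants.
pre_ratings = [4, 5, 3, 6, 4, 5]
post_ratings = [3, 4, 3, 4, 3, 4]

t_stat, df = paired_t(pre_ratings, post_ratings)
d = cohens_d_pooled(pre_ratings, post_ratings)
print(f"t({df}) = {t_stat:.2f}, |d| = {abs(d):.2f}")
```

A negative t here means that posttest ratings fell below pretest ratings, which is the direction reported for the negative stimulus condition. The effect size is printed as an absolute value, matching the unsigned d values in the Results.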
Results
Posttest–Pretest Comparisons for the Negative Stimulus Condition (Hate Speech)
Posttest–pretest comparisons for the negative stimulus condition showed statistically significant differences in seven dimensions of the scale. Posttest ratings were significantly lower for provincial–cosmopolitan (t = –5.79, p < .01, d = 0.29), dishonest–honest (t = –2.76, p < .01, d = 0.12), insincere–sincere (t = –3.21, p < .01, d = 0.15), excitable–calm (t = –6.19, p < .01, d = 0.30), aggressive–unaggressive (t = –6.67, p < .01, d = 0.37), nonbeliever–believer (t = –2.50, p < .01, d = 0.11), and unfriendly–friendly (t = –4.99, p < .01, d = 0.25). The described relationships are illustrated in Figure 1.

Figure 1. Pretest–posttest differences in ratings for the negative stimulus condition.
Posttest–pretest comparisons for individual politicians under the negative stimulus condition reveal several statistically significant results. First, Krzysztof Bosak (right) was seen as more provincial (t = –3.86, p < .01, d = 0.37), dishonest (t = –1.75, p = .04, d = 0.16), insincere (t = –2.39, p < .01, d = 0.22), excitable (t = –3.89, p < .01, d = 0.41), aggressive (t = –4.14, p < .01, d = 0.42), nonbelieving (t = –4.56, p < .01, d = 0.4), and unfriendly (t = –3.78, p < .01, d = 0.39) than in the pretest. Second, Robert Biedroń (left) was viewed as more provincial (t = –3.64, p < .01, d = 0.37), dishonest (t = –2.72, p < .01, d = 0.22), unattractive (t = –1.66, p = .04, d = 0.22), insincere (t = –2.27, p = .01, d = 0.2), excitable (t = –4.49, p < .01, d = 0.39), aggressive (t = –4.65, p < .01, d = 0.48), and unfriendly (t = –3.13, p < .01, d = 0.31). Lastly, Rafał Trzaskowski (center) was seen as more provincial (t = –2.56, p < .01, d = 0.19), excitable (t = –2.08, p = .02, d = 0.14), and aggressive (t = –2.62, p < .01, d = 0.22) by participants after reading a negative tweet than in the pretest.
Posttest–Pretest Comparisons for the Positive Stimulus Condition (Kindness Speech)
Posttest–pretest comparisons for the positive stimulus condition show statistically significant differences for all dimensions of the scale—participants gave significantly higher ratings after reading a positive tweet. The results are given in Table 1.
Table 1. Pretest–Posttest Differences in Ratings Under a Positive Stimulus Condition.
The results are further illustrated in Figure 2.

Figure 2. Pretest–posttest differences in ratings under a positive stimulus condition.
Posttest-pretest comparisons performed separately for each politician under the positive stimulus condition show two statistically significant differences for right-wing Krzysztof Bosak (in the provincial–cosmopolitan and unfriendly–friendly dimensions), 11 statistically significant results for Robert Biedroń (left), and eight such differences for Rafał Trzaskowski (center). Ratings on all dimensions of the scale here are significantly higher in the posttest than in the pretest. Detailed results for individual politicians are given in the appendix.
Posttest–Pretest Comparisons for the Neutral Stimulus Condition
Analysis of differences in posttest–pretest ratings under a neutral stimulus condition reveals two statistically significant results—higher ratings on the unattractive–attractive (t = 2.75, p < .01, d = 0.11) and aggressive–unaggressive (t = 3.03, p < .01, d = 0.13) dimensions—although the effect sizes are marginal. The results are illustrated in Figure 3.

Figure 3. Pretest–posttest differences in ratings under a neutral stimulus condition.
Posttest–pretest comparisons performed separately for each politician under the neutral stimulus condition show four statistically significant differences for Krzysztof Bosak (right), on the unsuccessful–successful (t = 2.13, p = .03, d = 0.17), unattractive–attractive (t = 2.42, p = .02, d = 0.18), aggressive–unaggressive (t = 2.11, p = .04, d = 0.16) and nonbeliever–believer (t = –1.99, p = .048, d = 0.16) dimensions; one statistically significant result for Robert Biedroń (left), on the excitable–calm dimension (t = 2.3, p = .02, d = 0.16); and one such difference for Rafał Trzaskowski (center), on the unfriendly–friendly dimension (t = 2.1, p = .04, d = 0.15).
Stimulus
The results indicate a significant difference in the posttest ratings of politicians’ image between stimulus conditions, F(28, 1884) = 3.53, p < .001. A post hoc Tukey test reveals differences between positive and negative conditions for all but two dimensions of the image: active and sophisticated. Differences between negative and neutral conditions are significant for all but four ratings: believable, sincere, active and sophisticated. We find no significant differences between positive and neutral conditions. Results of one-way MANOVAs performed separately for each politician are also significant: Bosak (right), F(28, 614) = 2.56, p < .001; Trzaskowski (center), F(28, 606) = 1.52, p = .04; Biedroń (left), F(28, 600) = 2.58, p < .001.
Comparisons Between Politicians
In the case of the negative stimulus condition, statistically significant differences were recorded for two dimensions. Participants who had read a negative tweet gave significantly lower ratings in the nonbeliever–believer dimension to Krzysztof Bosak (right) than to both Rafał Trzaskowski (center) and Robert Biedroń (left), with medium effect sizes (t = –4.65, p < .01, d = 0.64; and t = –2.98, p = .01, d = 0.41). The change in ratings was −0.64 for Bosak (right), −0.06 for Biedroń, and 0.13 for Trzaskowski (center). Thus, Bosak (right) was seen as more of a nonbeliever by participants who had read a negative tweet than Biedroń (left) and Trzaskowski (center). We also observe a statistically significant difference for the excitable–calm dimension, where participants rate Biedroń (left) more negatively after reading a negative tweet than in the case of Trzaskowski (t = 2.44, p = .05, d = 0.33). The change in this dimension was larger for Biedroń than for Trzaskowski (–0.69 vs. −0.23). Among participants who had read a positive tweet, Bosak (right) was rated significantly lower than Trzaskowski (center) on the inactive–active dimension (t = –2.68, p = .02, d = 0.37). Here, on average, Trzaskowski was rated as 0.32 more active in the posttest than in the pretest, while Bosak (right) was seen as 0.26 less active in the posttest. Finally, in the neutral stimulus case, the change in semantic differential ratings was significantly higher for Bosak (right) than for Trzaskowski (center) and Biedroń (left) on the unfriendly–friendly (t = –2.41, p = .05, d = 0.33) and nonbeliever–believer (t = –2.74, p = .02, d = 0.37) dimensions, respectively.
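Because each dimension involves several pairwise comparisons between politicians, the Bonferroni correction mentioned in the Statistical Analysis section can be illustrated as follows. This is a minimal sketch: the function name, the toy p-values, and the assumption that the family consists of the three pairwise comparisons on one dimension are illustrative and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def bonferroni(p_values: Sequence[float], alpha: float = 0.05) -> List[Tuple[float, bool]]:
    """Return (adjusted p-value, significant?) for each raw p-value in one family."""
    m = len(p_values)
    results = []
    for p in p_values:
        # Multiply each raw p-value by the number of comparisons, capped at 1.
        p_adj = min(1.0, p * m)
        results.append((p_adj, p_adj < alpha))
    return results


# Toy raw p-values for three hypothetical pairwise comparisons on one dimension.
for p_adj, significant in bonferroni([0.01, 0.02, 0.50]):
    print(f"adjusted p = {p_adj:.2f}, significant: {significant}")
```

With three comparisons, a raw p-value must fall below roughly .017 to remain significant at the .05 level after correction.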
Congruence of Political Views
We found that congruence of participants’ political views was correlated with ratings for all dimensions of the semantic differential scale in pretest and posttest with negative and neutral stimuli, and in all but one dimension in posttest with a positive stimulus. Correlation coefficients are calculated with the use of Kendall's tau-b (τb) rank correlation. Detailed results are given in Table 2.
Table 2. Kendall's tau-b (τb) Correlation Coefficient Between Congruence of Political Views and Dimensions of the Semantic Differential in Pretest and Posttest.
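Kendall's tau-b suits this analysis because both variables are ordinal with few distinct values, so tied pairs are common and the coefficient adjusts for them. The sketch below is a plain-Python illustration of the standard tau-b formula. The function name and the toy data are hypothetical, and the quadratic pair loop is meant for clarity rather than speed. `scipy.stats.kendalltau` computes the tau-b variant by default and also returns a p-value.

```python
import math
from itertools import combinations
from typing import Sequence


def kendall_tau_b(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty))."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from every term
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + ties_x_only) * (untied + ties_y_only))
    return (concordant - discordant) / denom if denom else float("nan")


# Toy data: agreement with a politician's views (1-5) and a rating on one
# seven-point dimension, for five hypothetical participants.
agreement = [1, 2, 2, 4, 5]
rating = [2, 3, 3, 5, 6]
print(round(kendall_tau_b(agreement, rating), 3))
```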
The differences between groups were tested using the independent sample t-test with Bonferroni correction for multiple comparisons. The results of this analysis are given in Table 4 in the appendix. Effect sizes for these results vary from small to large. Small effects are observed for the nonbeliever–believer dimension, and medium or large effect sizes for the other 13 dimensions. The results show that participants who agree with a politician's political views tend to rate the image of that politician more positively in the pretest than do neutral/undecided participants and those with opposite political views. Similarly, neutral/undecided participants rate the image more positively than opponents of the politician. Moreover, the effect of these differences is larger between aligned and neutral participants than between the neutral and opposing groups. Therefore, the positivity of supporters is larger than the negativity of opponents in evaluating a politician's image. These relationships are illustrated in Figure 4.

Differences in semantic differential ratings in the pretest between groups with different congruence of political views with the selected politician.
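The pairwise group comparisons reported in this section combine three quantities: a t statistic, Cohen's d as the effect size, and a Bonferroni adjustment of the p values. The sketch below shows how they relate, assuming the equal-variance (Student) form of the independent-samples t-test; the source does not state which variant was used. The function names and the example numbers are hypothetical, and the p values are taken as given rather than computed.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def student_t(a, b):
    """Equal-variance t statistic; equals d / sqrt(1/na + 1/nb)."""
    return cohens_d(a, b) / sqrt(1 / len(a) + 1 / len(b))


def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Hypothetical pretest ratings for two congruence groups.
    aligned = [5, 6, 6, 7, 5]
    opposing = [2, 3, 4, 3, 2]
    print(round(cohens_d(aligned, opposing), 2), round(student_t(aligned, opposing), 2))
    # Three pairwise comparisons (aligned/neutral/opposing), hypothetical raw p values.
    print(bonferroni([0.01, 0.04, 0.5]))
```

With three groups there are three pairwise comparisons per dimension, so under this correction a raw p value must fall below .05 / 3 to remain significant at the .05 level.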
A two-way ANOVA (with stimulus type and congruence of political views as factors) showed that the interaction between stimulus type and participants’ congruence of political views had a statistically significant effect on the change in ratings (pretest ratings subtracted from posttest ratings) for one dimension only (unsophisticated–sophisticated: F = 2.7, p = .03). The post hoc Tukey's HSD test, however, showed that none of the group comparisons on the unsophisticated–sophisticated dimension was statistically significant.
Gender Differences
We observed statistically significant differences in pretest judgements on the semantic differential scale between women and men using a one-way MANOVA, F(14, 942) = 1.74, p = .04. Group differences were tested with the t-test for independent samples. On average, women gave higher ratings on all six dimensions that showed a significant difference: unqualified–qualified (t = 1.97, p = .048, d = 0.13), dishonest–honest (t = 2.01, p = .044, d = 0.13), unsuccessful–successful (t = 1.99, p = .046, d = 0.13), unattractive–attractive (t = 2.41, p = .016, d = 0.16), aggressive–unaggressive (t = 2.74, p = .006, d = 0.18), and unfriendly–friendly (t = 3.76, p < .01, d = 0.24).
We also found a statistically significant interaction effect between participants’ gender and the evaluated politician on semantic differential ratings in the pretest, F(42, 2789) = 10.89, p < .001. Tests of group differences revealed that statistically significant results were obtained only for Robert Biedroń (left). On average, women rated that politician higher on the following dimensions: unqualified–qualified (t = 3.38, p < .01, d = 0.38), dishonest–honest (t = 3.77, p < .01, d = 0.43), unbelievable–believable (t = 2.48, p = .013, d = 0.28), unsuccessful–successful (t = 3.26, p < .01, d = 0.37), unattractive–attractive (t = 3.86, p < .01, d = 0.44), insincere–sincere (t = 2.81, p < .01, d = 0.32), aggressive–unaggressive (t = 2.05, p = .04, d = 0.23), weak–strong (t = 2.89, p < .01, d = 0.33), inactive–active (t = 2.77, p < .01, d = 0.31), and unfriendly–friendly (t = 3.78, p < .01, d = 0.43). Women, therefore, rated Biedroń more positively in the pretest than men did. No statistically significant differences between women and men were obtained for the other two politicians.
Differences between genders in the evaluation of politicians on the semantic differential scale were also found in the posttest analyses, F(42, 2789) = 8.95, p < .001. Following a positive stimulus, women viewed politicians as more qualified (t = 2.69, p < .01, d = 0.31), cosmopolitan (t = 3.1, p < .01, d = 0.35), believable (t = 2.55, p = .01, d = 0.29), successful (t = 2.0, p = .046, d = 0.23), attractive (t = 2.3, p = .02, d = 0.26), calm (t = 2.07, p = .039, d = 0.23), believing (t = 2.25, p = .025, d = 0.25), and friendly (t = 2.76, p < .01, d = 0.31). Following a negative stimulus, no differences were found between the genders.
Analyses performed for each politician individually revealed that women viewed Krzysztof Bosak (right) as more friendly (t = 2.27, p = .025, d = 0.44) after reading a positive tweet compared with men under the same condition. They also rated Robert Biedroń (left) as more qualified (t = 2.01, p = .046, d = 0.4), attractive (t = 2.28, p = .025, d = 0.46), sincere (t = 2.66, p < .01, d = 0.53), believing (t = 2.33, p = .021, d = 0.47) and friendly (t = 2.01, p = .047, d = 0.4) compared with men in the posttest. Rafał Trzaskowski (center) was rated as more cosmopolitan (t = 2.48, p = .015, d = 0.48) by women than by men after reading a positive tweet. No statistically significant differences between women and men in the posttest with a negative stimulus were obtained for Bosak (right) or Trzaskowski (center). One significant difference was obtained for Biedroń (left): men viewed him as less honest (t = 2.06, p = .042, d = 0.39) than women did in the posttest.
The influence of stimulus type and participants’ gender on the change in ratings (pretest ratings subtracted from posttest ratings) was tested using two-way ANOVA (stimulus and gender as factors). No statistically significant results were found. We may conclude that the influence of the stimulus on the change in ratings given by participants in the posttest compared with the pretest is the same in the case of women and men.
Age
The association between ratings on the semantic differential scale and participants’ age was tested using Spearman's correlation coefficient as implemented in the statsmodels Python library. Statistically significant results were found for unqualified–qualified (a weak negative correlation) and inactive–active (a weak positive correlation) in the pretest. In the posttest analyses, several statistically significant coefficients were obtained: seven positive correlations under the positive stimulus condition, and one negative and one positive correlation in the posttest with a negative stimulus. No statistically significant coefficients were recorded in the posttest with a neutral stimulus. Detailed results are given in the appendix.
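For readers less familiar with the statistic, Spearman's coefficient is the Pearson correlation of the two variables after each has been converted to ranks, with tied values sharing their average rank. The sketch below is a library-free illustration of that definition and does not reproduce the library call used in the study; the function names and the age and rating values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation as Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical participant ages and ratings on one dimension.
    age = [19, 23, 23, 35, 41, 58]
    rating = [3, 4, 5, 4, 6, 6]
    print(round(spearman_rho(age, rating), 3))
```

Because semantic differential ratings take only a few discrete values, ties are frequent, which is why the average-rank treatment matters here.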
Factor Analysis
Lastly, we conducted an exploratory factor analysis in search of key dimensions of the politicians’ image that change after exposure to experimental stimuli. We tested several variants with different numbers of factors (n = 3, 4, 5, 6, 7), with and without a rotation (varimax, promax, quartimax, oblimin). The data entered into the factor analysis model comprised all variables from the pretest and posttest. We used the dedicated FactorAnalyzer Python library to conduct the factor analysis. The best-suited model groups items into three factors using a promax rotation (similar results are also achieved with the varimax rotation) and the MINRES fitting method for all politicians. Factor 1 (“leader's abilities”) comprises the items qualified, cosmopolitan, believable, successful, attractive, and strong, all with loadings between 0.52 and 0.87. Factor 2 (“morality”) comprises the items honest, sincere, and active, with loadings between 0.65 and 0.71. Factor 3 (“sociability”) comprises the items calm, unaggressive, and friendly, with loadings around 0.60. The items believer and sophisticated were discarded because of the high percentage of their variability left unexplained by the three factors (>70%). Similar results from factor analysis were obtained by Cwalina et al. (2005), who distinguished a two-factor model (“leader's abilities” and “sociability/morality”) on a Polish and American sample. Running a MANOVA multivariate linear model with stimulus type and politician as independent variables, we observed a statistically significant interaction effect between the two variables [F(12, 2503) = 2.09, p = .02]. Regarding Factor 1, significant changes between stimulus conditions were found only for Robert Biedroń: between the negative and neutral conditions (t = −3.47, p = .002), and between the negative and positive conditions (t = −3.01, p = .008). The difference in Factor 2 between stimulus conditions was again found only for Robert Biedroń: between the negative and neutral conditions (t = −2.90, p = .01), and between the negative and positive conditions (t = −3.19, p = .004).
Differences in Factor 3 are significant for Robert Biedroń and Krzysztof Bosak, for both politicians between the negative and neutral conditions (t = −5.71, p < .001, and t = −3.24, p = .004) and between the negative and positive conditions (t = −5.41, p < .001, and t = −4.59, p < .001). The results of the factor analysis indicate that the perception of the leader's abilities and morality is rather stable. The perception of a politician's sociability and friendliness may, however, change depending on the language they use in their media appearances. Although factor analysis is a suitable approach employed by many researchers, it may cause the results to lose much of their application value (Cwalina et al., 2000). The results of factor analysis can be useful for “preparing an election campaign, because they suggest how to control, to some extent, the image of a candidate. But the hints on what to remove from the image and what to add to it are somewhat general and equivocal” (Cwalina et al., 2000, p. 122). Therefore, we conducted factor analysis as a supplementary analysis rather than the central one, following previous studies on this topic (Cwalina et al., 2000).
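One way to read the criterion by which the items believer and sophisticated were discarded is through item uniqueness, that is, the share of an item's variance that the common factors do not reproduce. The sketch below assumes standardized items, so that an item's communality is h² = λᵀΦλ (λ being the item's pattern loadings and Φ the factor correlation matrix, which matters because promax is an oblique rotation), and it assumes that ">70% unexplained" corresponds to a uniqueness above 0.70. The loadings, the default identity Φ, and all function names are hypothetical; none of this is output from the FactorAnalyzer library or from the study's data.

```python
def communality(loadings, factor_corr=None):
    """h^2 = sum_ij l_i * phi_ij * l_j; identity phi means uncorrelated factors."""
    k = len(loadings)
    if factor_corr is None:
        factor_corr = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    return sum(
        loadings[i] * factor_corr[i][j] * loadings[j]
        for i in range(k)
        for j in range(k)
    )


def uniqueness(loadings, factor_corr=None):
    """Share of a standardized item's variance not explained by the factors."""
    return 1.0 - communality(loadings, factor_corr)


def items_to_discard(loading_table, threshold=0.70, factor_corr=None):
    """Names of items whose uniqueness exceeds the threshold."""
    return [
        item
        for item, loadings in loading_table.items()
        if uniqueness(loadings, factor_corr) > threshold
    ]


if __name__ == "__main__":
    # Hypothetical loadings on three factors (not the values from the study).
    loading_table = {
        "qualified": [0.87, 0.05, 0.02],
        "friendly": [0.10, 0.15, 0.60],
        "believer": [0.20, 0.10, 0.15],
    }
    for item, loadings in loading_table.items():
        print(item, round(uniqueness(loadings), 3))
    print(items_to_discard(loading_table))
```

Under these assumptions, an item that loads weakly on every factor is flagged for removal, while an item with one moderate loading of about 0.60 is retained.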
Discussion
The results confirmed the significance of the emotional marking of an utterance in terms of social evaluations. For both negative and positive messages, a change in evaluations of a politician's image was observed in a direction consistent with the affective valence of the message. The change particularly applied to evaluations of the politicians’ friendliness and their attitudes to other people (sociability), whereas the perception of their abilities and morality was rather stable. In the control condition, using neutral stimuli, changes were recorded on only two dimensions, with small effect sizes. This result is in accordance with the assumptions of the Affect Infusion Model (AIM) proposed by Forgas (2002), which indicate that words with strong emotional valence often activate evaluative reactions in a rapid, automatic way. In our study, the influence of affect priming proved stronger in the case of positive stimuli; this contradicts the theory of negativity bias (Baumeister et al., 2001; Holleman et al., 2021; Rozin et al., 2010), which suggests that automatic emotion activation is stronger in response to negative words than to positive words. Negative words are generally more marked and show a more extreme deviation from neutrality in their semantic meanings than their positive counterparts. The ‘positivity bias’ obtained in our study may result from the dominance in public political debate in Poland of a negative and aggressive form of communication, which means that positively marked political tweets featuring kindness speech are viewed as atypical, and thus more distinctive and noticeable. This observation is in line with the research of Skytte (2021), who found that civil communication between political opponents is something surprising and unique to citizens, and therefore influences the attitudes of respondents more than the style of discourse to which they are accustomed.
In other words, people assume that politicians behave aggressively, so when they see something different, they are pleasantly surprised.
The interplay between positivity bias and participants’ expectations is complex. Political communication often reflects a “Pollyanna effect” (Boucher & Osgood, 1969), where positive emotional appeals dominate. This is visible in Konat et al. (2024), who found that both right- and left-wing politicians heavily use emotional language, much of it positive, which aligns with trends in computational linguistics and self-reported preference scales. Positivity bias is also present in self-reports, as in our own study where participants claim to prefer positive messaging. Research using the semantic differential method is also subject to the risk of this effect (Boucher & Osgood, 1969). However, this evidence from language use and self-reported scales contrasts with evidence from online studies on the spread of hate speech, as described in the introduction of this paper. Studies on hate speech online, including our own, show that negative messages receive more engagement. Similar findings can be seen in voting patterns and the amplification of negative content online. This highlights a fundamental limitation of self-reported data: preferences often diverge from behavior, a problem as old as LaPiere's (1934) observation of differences between declarations and actions. While we cannot fully resolve this discrepancy within the scope of this paper, it remains a crucial area for further research.
The results also confirmed the significance of the congruence of participants’ political views in their evaluations of politicians’ images. Participants who agree with the political views of a particular politician tend to rate that politician's image more positively than do neutral/undecided participants and those with opposite political views. Similarly, neutral/undecided participants rate the image more positively than do opponents of the politician. Interestingly, the positivity of supporters was greater than the negativity of opponents in evaluating the image of a politician; this may result from the greater degree of engagement of the first group, or may be due to the “Pollyanna effect.” Weak correlations were also observed between participants’ age and their evaluations of politicians, and women were found to make more favorable assessments than men, particularly in the case of the left-wing politician Robert Biedroń. A significant factor here may be that Biedroń is a politician who strongly affirms his homosexual identity, and many studies have shown (cf. Kite et al., 2021; Vieira de Figueiredo & Pereira, 2021) that the attitudes of heterosexual men toward gay men are more negative than those of heterosexual women. Other studies (e.g., Peresie, 2004) show that women tend to be softer in their judgments than men, which may be due to their greater agreeableness and different patterns of socialization. However, this hypothesis would require further study.
In terms of limitations, this study's focus on a single issue limits generalizability across different topics, such as climate change or vaccination. Further research should extend this investigation cross-platform and cross-issue to fully distinguish the impact of hate and kindness speech across diverse settings. We acknowledge that platforms vary in content regulation and user culture, which may influence the nature of interactions and outcomes. For example, Theocharis et al. (2024) found that Twitter, compared to other platforms, is associated with lower conspiracy belief rates, suggesting a different user experience. Facebook, on the other hand, has faced criticism for delayed enforcement against hate speech and harassment, though recent changes aim to address these issues (Dubois & Reepschlager, 2024). While limited in scope, our study contributes by demonstrating that the impacts of affective language are identifiable even with numerous confounding factors common in social psychology. In future research, it would be worthwhile to include additional variables such as gender of politicians, their ethnicity, age or cultural context, as these may serve as mediators of differences in preferences towards politicians and the evaluation of their statements. Another limitation of this study is the reliance on fabricated posts instead of real tweets. This decision, while addressing certain methodological challenges, may influence the perceived authenticity of the posts and thus the participants’ responses.
Despite these limitations, the results of our study may serve as important input to the discussion on the consequences of the language used in public debates. Emotionally marked statements may improve or worsen social evaluations of the person who makes them, a fact which is particularly significant in the case of persons holding high government positions. The “positivity bias” identified in the study provides encouragement to use kindness speech in public discourse, showing that this type of communication can improve a politician's ratings. This result is consistent with the “Montagu Principle”: people generally evaluate civil people more favorably than uncivil people. This is confirmed by Frimer and Skitka's (2018) study, which included real-life exchanges between President Trump and his opponents, as well as a speech by a fictional politician. Civility helped or did not affect—but never harmed—the reputation of the speaker, supporting the Montagu Principle. Even self-identified supporters of President Trump evaluated the president more favorably after he responded with civility to a personal attack. Incivility made the speaker seem less warm and did less to affect perceptions of dominance or honesty. It is therefore recommended that efforts be made to promote civility in political debates.
Supplemental Material
Supplemental material, sj-docx-1-jls-10.1177_0261927X251321504 for Affective Valence in Posts on X (Formerly Twitter) and Evaluation of a Politician’s Image by Monika Obrębska, Barbara Konat, Ewelina Gajewska, Nadia Dembska and Marcelina Dobrowolska in Journal of Language and Social Psychology
Footnotes
Acknowledgements
The authors would like to thank the reviewers for their constructive and helpful feedback.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded in part by the National Science Centre, grant number 2020/39/D/HS1/00488 awarded to Barbara Konat, and in part by the Faculty of Psychology and Cognitive Sciences, Adam Mickiewicz University in Poznań internal research grant awarded to Monika Obrębska.