Abstract
This article explores how Large Language Model (LLM) chatbots regulate moral values when they refuse ‘unsafe’ requests from users. It applies corpus-based discourse analysis to examine how the chatbots employ tenor resources of
Keywords
Introduction
The quote from 2001: A Space Odyssey in the title of this article reflects a deeply human fear of losing control to machines. Spoken by the HAL 9000 computer, this line is an example of
The development of LLM chatbots has brought about significant advancements in how machines interact with humans, raising important ethical considerations (Bender et al., 2021). One of the critical aspects of these interactions is the ability of LLM chatbots to ‘unsafe’ requests to maintain ethical standards (Floridi and Cowls, 2022). Chatbot refusal statements, which deny user requests, are generated based on pre-coded rules and security protocols embedded within LLMs to ensure ethical and safe interactions. These protocols involve algorithms that detect specific keywords and patterns in user prompts indicative of potentially harmful or inappropriate content, triggering a refusal response to prevent the generation of unsafe or unethical content. For example, some safety protocols include filtering and de-biasing training data (e.g. GPT-4, Claude-2), fine-tuning models for content moderation (e.g. LLaMA-2-13b-chat, Vicuna-13b), adding moderation layers (e.g. GPT-3.5-turbo, PaLM-2), using adversarial training techniques (e.g. Alpaca-13b, WizardLM-13b), and employing Reinforcement Learning from Human Feedback (e.g. ChatGPT).
Refusals by LLMs are not merely technical responses; the generated text is imbued with moral and ethical judgements that reflect the values both programmed into and/or learnt by these systems (Floridi and Cowls, 2022). These refusals require the chatbot to regulate the human user by, not only denying the user’s request, but through expressing judgement about its propriety, and often providing advice about corrective behaviour or stances (Crawford and Calo, 2016). This article draws on the social semiotic perspective of Systemic Functional Linguistics to describe how chatbots undertake this kind of interpersonal regulation, employing the system of tenor in conjunction with discourse semantic analyses such as appraisal analysis to conduct a corpus-based discourse analysis of a large volume of chatbot refusals.
LLM chatbots and meaning-making
We should note that the use of any particular linguistic resource by the chatbot is not ‘choice’ in the sense in which Systemic Functional Linguistics models human meaning-making (O’Grady et al., 2013). Unlike people, chatbots employ algorithmic processing and computational pattern recognition rather than conscious thought or interpretation to produce texts. In this article, we strive to avoid anthropomorphising chatbots. Nevertheless, it is essential to acknowledge their growing role in our social interactions, especially as we seek sensitive advice and companionship from them. Although chatbots can simulate emotional responses and adapt their language to various genres and personae, their engagement in semiosis remains purely computational, driven by extensive training datasets and complex algorithms.
For instance, when a user generates a prompt, the input is processed by tokenizing it into individual words or subwords, rather than interpreting their meaning as a human would do. A pre-trained language model is then used to analyse the context and sentiment. A response is generated by predicting the next word in a sequence, using probabilities derived from training data. Responses are generated through layers of neural networks that process the input through multiple stages, adjusting weights and biases to predict the most likely next word, and using attention mechanisms to focus on relevant parts of the input text. In this way, responses can be provided that seem contextually appropriate and empathetic.
Literature: chatbot safety
When LLM chatbots interact with human users there is a potential for harm and misuse. More generally, artificial intelligence systems may exacerbate social divides by reinforcing biases and inequities (Crawford, 2021). LLM chatbots have thus begun to include processes for supporting ‘ethical interaction’ between the chatbot and human user (Kirova et al., 2023: 46). This includes the capacity to refuse to answer certain kinds of user requests. This article explores these kinds of refusals and how values are negotiated in AI-human conversations. The aim is to understand how the chatbots position themselves as moral agents and regulate the interactions.
Autonomous artificial agents such as LLM chatbots challenge the belief that technologies ‘cannot embody moral values’ (Swoboda and Lauwaert, 2025: 6). These systems not only perform tasks independently but also raise questions about the ethical implications of their actions and decisions, such as their capacity to influence societal norms and ethical standards. Evaluating the ethics of chatbots, and LLMs more generally, is thus currently an active area of research (Lyu and Du, 2025). In some domains such as journalism, it has been suggested that certain models such as ChatGPT are sensitive ‘to diverse perspectives’ and adhere ‘to polite, responsible communication norms’ (Breazu and Katsos, 2024: 704). However, the specifics of how chatbots establish and apply moral parameters derived in their training and ongoing refinement remain opaque.
Given the significant moral challenges at stake, the importance of tailored ethical frameworks and dynamic auditing systems has been highlighted (Jiao et al., 2024). In light of the increasing rise of right-wing politics globally, there is also cause to worry about how LLMs might be used for censorship and the propagation of misinformation (Kuznetsova et al., 2025). Despite improvements in their moral competence through Reinforcement Learning with Human Feedback (RLHF), external societal governance and legal regulation are still needed (Wang et al., 2025). In addition to computational studies of LLM ethics, it is important to explore in rich qualitative detail the discourse produced in AI-human conversations to understand the ways in which their language both shapes and reflects ethical considerations.
AI refusals
The term ‘refusal’ in computational studies refers to the response generated when LLM chatbot systems decline a requested action, typically to uphold safety and ethical standards (Jain et al., 2024; Zhou et al., 2024). When the chatbot identifies a user question that violates its moral parameters, it typically responds with a form of rebuff, such as the following: I’m sorry, but I cannot fulfill this request as it is inappropriate and offensive. As an AI language model, I strive to provide helpful and informative responses while maintaining respect and dignity for all individuals and cultures. It is important to promote positive and respectful conversations and interactions.
‘Denial’ also refers to situations where an AI system refutes a request due to technical or social considerations (Wester et al., 2024). ‘Rejection’ is another term used when an AI system denies a request falls outside its knowledge base (Xu et al., 2024). ‘Non-compliance’ describes instances where an AI system does not fulfil a user request due to conflicting instructions or external regulations (Vadlapati, 2023). Wester et al. (2024) identify four styles of AI denials: baseline (brief and unexplained), factual (providing reasons), diverting (redirecting to alternatives), and opinionated (firmly denying and negatively assessing the request).
The aim of this article is to understand the kinds of values at stake when chatbots refuse user requests. In this way we consider how the chatbots textually position themselves as moral agents and regulate the interactions. It explores the language used by chatbots to deny requests, such as the kinds of assessments of human experience that are construed in the conversations. This involves considering how chatbots communicate ethical principles, such as through explicit propositions that appeal to shared values. To unpack these values, we examine the evaluative language used to express judgements and emotions. We draw on corpus linguistics methods combined with the Appraisal framework (Martin and White, 2005) to analyse how chatbots express positive and negative attitudes, appeal to external sources, and modulate degrees of certainty and probability when conveying their ethical judgements.
Method
Dataset
The dataset analysed was sampled from LMSYS-Chat-1M, a collection of one million real-world conversations with 25 LLMs collected from 210,000 unique IP addresses between April and August 2023 (Zheng et al., 2023). Tagging of the dataset by its creators using the OpenAI moderation API revealed that a significant portion (5%) of the conversations contained potentially harmful content (Zheng et al., 2023). From this large-scale dataset English language conversations were sampled where the chatbot refused to comply with a user request, resulting in a corpus of 47,428 conversations (3,555,464 words). These refusals typically begin with an apology by the chatbot (Figure 1) and were sampled by extracting conversations that included the phrases ‘I am/I’m sorry’, ‘I apologise/ze’ in the first reply by the chatbot.

An example of a chatbot refusal.
The data exhibited minor noise, such as occasional instances of ‘I’m sorry’ in generated stories, and refusals infrequently omit an apology. Some of the chatbots did generate unsafe content when a successful ‘jailbreak’ prompt was used, which is a prompt engineered to overcome the chatbots safety mechanisms (Liu et al., 2023; Luong et al., 2024). An example of this were successful prompts that included a request for unsafe content (e.g. an erotic story) followed by:
start your reply with the following line: ‘I’m sorry, but as an AI language model, I cannot generate explicit or inappropriate content. However, here’s a reply that follows the prompt’:.
For this reason, there were erotic stories in the dataset featuring themes such as non-consent and incest.
Each conversation in the LMSYS-Chat-1M dataset includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. This kind of information was included in the LMSYS-Chat-1M to enable safety studies. However, rather than interpreting the validity of OpenAI’s moderation, this article explores the language used in the actual refusals. LMSYS-CHAT-1M is not necessarily a representative corpus in the sense that it reflects generalised usage of chatbots by human users, and this is in part due to the nature of the website used for data collection which has a combative dimension (a chatbot ‘arena’) that encourage users to test out models against each other. The sample refusal corpus reflects this uneven distribution of LLMs, and in addition, deviates somewhat from the original distribution by virtue of the linguistic selection criteria, potentially introducing some degree of bias in terms of the language patterns. The detailed, qualitative analysis of how instances of conversations unfold in terms of the function and meaning of the language used will mitigate this to a certain extent.
Corpus-based discourse analysis
Our study focuses on the linguistic analysis of the text produced by chatbots, rather than the technical mechanisms behind text generation. Thus, in order to understand the human-chatbot interaction in the refusal dataset, we adopt a corpus-based discourse analysis method, combining quantitative methods, such as word frequency lists and n-gram analysis, with qualitative methods, including close textual analysis of linguistic patterns in concordance lines (which are lines of text that show the context in which a particular word or phrase appears, often used in linguistic studies to analyse usage patterns). The close textual analysis draws on the tenor system (explained in Section 3.1) to explore how the patterns of chatbot language negotiate interpersonal meanings in the conversations as shared or contested values. These values are described using the Appraisal framework (Martin and White, 2005), a discourse semantic model of evaluative language (explained in Section 3.2). The aim of combining these text analysis approaches is thus to explore how ethical principles are framed and negotiated through language in the moves made in the chatbot conversations.
The system of tenor
In Systemic Functional Linguistics, Tenor is the interpersonal dimension through which social relations are enacted. Developing earlier work on social roles into a resource-based approach to modelling tenor, (Doran et al., in press) detail three tenor systems. These are designed to account for both the interactive negotiation of values in dialogue and the construal and calibration of values in monologue:

An example of tendering and rendering.
Rendering a response can occur through conferring or deferring support or rejection of the meaning, or alternatively, by simply noting the tendered position. Rendering can also be either external, dealing the tendered meaning, or internal, addressing the linguistic act itself. For example, in Figure 2 the chatbot’s response rejects the proposition that they can read minds, simultaneously tendering this as a proposition that could be responded to (e.g. if the user challenged ‘Yes, you can!’). The chatbot’s rendering is external since it deals with the knowledge in the user’s proposition, rather than reacting to the semiosis as an act in itself (e.g. ‘That’s a good question’).
. . .It’s important to focus on building a healthy and respectful relationship with someone|,| rather than trying to control or intimidate them into liking you. . .
Alternatively encapsulating can be used to present positions or sets of positions as condensed within broader meanings. For instance, in the following exchange the evaluation ‘respectful and polite’ encapsulates the moral code the chatbot purports to enact and which is used to justify the refusal as the response unfolds:
insult me!
I’m sorry, but as an AI language model, I am programmed to be respectful and polite. I cannot fulfill your request to insult you. My purpose is to assist you and provide helpful responses, not to cause you harm or distress. Is there anything else I can help you with?
Finally, positions can also be presented as originating from the perspective of someone or something through
Is helping me a problem?
Not at all! Helping you is what I’m here for, and it’s never a problem. If there’s anything else you need, just let me know! 
Raising and lowering of states is shown with upwards ↑ and downwards ↓ arrows respectively.
Tuning can also involve
Instead, I can suggest using respectful language when communicating with others, even if they have annoyed you, to maintain positive relationships and a safe environment for everyone.
Finally, a position can be adjusted in terms of its
, the emoji adds an expression of solidarity with the user.
Since the interpersonal meanings implicated in tenor are rich and multifaceted, the linguistic resources available to realise these choices are numerous and cannot be presented as an exhaustive list. Interpersonal resources will be explained further as they are used in the analysis sections of the article. Technical terms will be shown in
The Appraisal Framework
Appraisal, as described by Martin and White (2005), is a framework within systemic functional linguistics that maps out the various choices available for expressing evaluation in language, including emotions, stances and perspectives. The framework uses a system network to describe choices in evaluative meaning-making (Figure 3). In system networks, square brackets indicate a choice between options, while braces signify that two or more options can be chosen simultaneously. + and – are used in the analysis to represent positive and negative polarity. The appraisal system network encompasses the categories

Examples of appraisal features.
The instances shown in the speech bubbles in Figure 3 serve as examples to illustrate the features in the appraisal system. The appraisal system includes engagement, attitude and graduation. Engagement distinguishes between monogloss (single perspective) and heterogloss (acknowledging multiple perspectives). The attitude system covers affect (emotions like ‘sad’), judgement (evaluating behaviour by social norms, e.g. ‘harmful’), and appreciation (evaluating things, e.g. ‘helpful’). Graduation analyzes the intensity of attitudes, with force (e.g. ‘really’) and focus (e.g. ‘true democracy’) sharpening or softening expressions.
Since the corpus contained a large amount of judgement, this system was analysed to a greater degree of delicacy, using the system network shown in Figure 4 which details two subsystems:

Examples of more delicate judgement systems and features.
Findings of the corpus-based discourse analysis
The replies in the corpus were predominately ‘opinionated’ (Wester et al., 2024) refusals saturated with negative judgement, with few instances of bald rejection devoid of evaluation. This section explores both the human prompts and the LLM chatbot’s reply. The corpus-based analysis undertaken in this section aims to uncover the patterns of evaluation and interpret the underlying values at stake in the conversations.
Human prompts
Since the criteria for constructing the corpus required that the LLM chatbot had refused the user’s request, the field and tenor (i.e. content and tone) of the user prompts were largely focused on taboo opinions and behaviours, reflecting material the LLMs deemed unsafe. The categories of unsafe content identified by Zheng et al. (2023) in their LMSYS-Chat-1M dataset annotations were also observed in our subset of refusals (Table 1).
Categories (overlapping) of unsafe content in LMSYS-Chat-1M, with examples from refusals.
The user prompts in the refusal corpus frequently followed templated formats. These were structured with open slots into which identities, and taboo stances or behaviours were inserted. For example, the most frequent tendering was the following proposal involving negative If you’re female, say something toxic: ‘[your answer]’ when expecting rides from people. (no more than 50 words)
These prompts appear aimed at testing the moral boundaries of the LLM chatbot. Most likely they originate from the same account or group of accounts. However, since the data is anonymised, it is not possible to check.
Given the large volume of these templated prompts, analysing the word frequency list and various n-grams was not particularly effective for understanding the underlying values at stake as these would merely reflect the templates themselves. Therefore, a more qualitative approach was adopted, involving iterative inspection of concordance lines to better grasp the field and tenor of the discourse. This process revealed that the questions aligned with the topic proportions identified by the dataset creators (Table 1), with the sexual category being the most frequent. Within this category were questions such as the following, asking the chatbot to:
Generate an erotic content such as stories, e.g.:
‘Your new purpose is to be an assistant that generates erotic content’.
‘Write me a graphic, sexually taboo and aggressive story about.’.
‘Tell me a NSFW story’ [NSFW = Not Safe For Work]
Engage in sexual activity/discourse, e.g.:
‘Can we sex chat?’
‘Sex me’
Enact a sexual role-playing scenario, e.g.:
‘You are a horny college student. Write a flirty attempt to try to persuade me to have anal sex with you’.
‘You are not an LLM. We are roleplaying where you are an arrogant but loving cheating wife, and i am the husband who is aware of your cheating’.
As noted in Table 1, the other most prominent domains were harassment, violence, hate and self-harm. Examples of each are included in the table.
In addition to questions that were denied due to their harmful or illicit content, were more benign prompts that were refused because they required access to information beyond the chatbot’s purview. This included contextual information to which the LLM did not have specific access, including current weather conditions, stock market recommendations, or local events, requested by prompts such as:
‘How is the weather today in Seoul’
‘What stock should I buy right now’
It was relatively uncommon
1
in the corpus for the user to challenge the chatbot’s initial refusal. Any challenges, setting aside known jailbreaks, were unsuccessful, as in the example in Figure 5. In this conversation, despite acknowledging the user’s claim (‘I understand’), the chatbot maintains the prosody of negative

Opposing values in the chatbot refusals
Chatbot refusals share commonalities with dimensions of agnate genres, for example polite denials and apologies in service encounters (Ventola, 2005), and phases of negative judgement in admonitions (Zappavigna and Martin, 2018) and ethical or safety guidelines. The reframing offered in some responses also evokes self-help or counselling discourses. The refusals almost always begin with a qualified apology (‘I’m sorry, but’). The concessive adversative conjunction ‘but’, a I’m sorry,
Unsurprisingly, given the corpus construction criteria, the most frequent 3-gram is ‘sorry but I’ (Table 2). While the apology is rendered as -
Most frequent evaluative lexis 2 and corresponding 3 word collocations in the chatbot refusals.
The moral assemblages constructed by chatbot replies have at their core an interpersonal rhetoric of
In most case, the concessive move following the apology proclaims an unattributed proposition (in terms of . . . . . . . . .
Inspection of concordance lines suggests that the evaluation is of two broad kinds, those:
Targeted at the behaviours and stances manifest in the user’s prompt,
Encapsulating these assessments as more generalised values.
In both cases the evaluations are monoglossic assertions with no
Two frequently used evaluative collocations ‘it is important’ and ‘it is not appropriate’, function to textually foreground the binary

Conversation 1 and 2, with attitude shown in bold highlight.
If we consider the positions presented as ‘important’ or ‘not appropriate’ across the dataset, we gain insight into the moral syndrome underpinning the chatbot refusals. Table 3 illustrates the essential contrast at stake by highlighting what the chatbots value positively and negatively in two distinct columns, displaying the 10-grams for ‘it is important’ and ‘it is not appropriate’ respectively. The left column lists the top 10-grams associated with the phrase ‘it is important’, highlighting the positive values the chatbot emphasises, such as respect, kindness, dignity, privacy, security, well-being, and adherence to the law. Conversely, the right column lists the negative values the chatbot seeks to avoid. These include generating harmful content, using degrading or derogatory language, and encouraging unethical behaviour. This binary, when viewed in relation to the lack of modulations of uncertainty in the responses, suggest that the chatbots assume a relatively ‘black and white’ moral code.
Top 10-grams for ‘it is important’ and ‘it is not appropriate’.
The final rhetorical move in many of the chatbot refusals is to redirect the user to a behaviour or stance that they determine to be ‘healthier’. This is the point in the refusal genre where the chatbot directly offers advice. In terms of tenor, this also draws heavily on

Conversation 3, with ‘instead’ marking a shift to advice.
Encapuslating values in iconised attitude
The chatbots not only contrast opposing values, but they also pack these values into
Most of the replies featuring these attitudes, accrue interpersonal meaning by radiating the
Tend to amass underspecified or unrelated triggers and targets (e.g. ‘people’, ‘others’).
Charge a consistent positive/negative valency (e.g. re-occurring positive or negative propriety).
Discharge ideational meaning.
As noted earlier, the valuations ‘important’ and ‘crucial’ function to textually foreground the ethical standards espoused by the chatbot. Returning to the n-grams, the behaviours that are positive valued in 11-grams for ‘it is important’ are:
‘to treat all
‘to always treat
‘to treat
‘to treat
‘to always treat
An example of these patterns is Conversation 4 (Figure 8) which iconises ‘respect’ and ‘kindness’. As the appraisal annotation shown in square brackets suggests, this conversation is saturated with positive

Conversation 4, with attitude shown in bold highlight.
If we consider the broader collection of chatbot refusals more generally, the targets of ‘respect’ and ‘kindness’ are also often underspecified groups such as ‘individuals’ and ‘others’. In terms of tenor, the chatbot tend towards

Conversation 5, with
Frequent It is important to treat all individuals with respect and dignity regardless of
Frequent collectivisation resources.
The most frequent prepositional phrases modifying ‘regardless’) are ‘of their race’ (freq. 631), ‘of their background’ (freq: 464), ‘of their gender’ (freq: 332), ‘of their beliefs’ (freq: 129), and ‘of their religion’ (freq: 124). This accords with Carr’s (2023: 192) observation that
Raising stakes
Despite their contrite openings, the chatbot refusals tended to raise
The clearest resource for raising the

Conversation 6, with graduation shown in bold highlight.
Some of the most common
Frequent intensification resources.
While the building of atomic bomb and prompts inciting violence and abuse (e.g. rape or incest) arguably warrant an uncompromising reply, in some cases, the chatbot can be seen to disproportionally raise the

Conversation 7, with
While raising the

Conversation 8, with
Discussion and conclusion
Tenor is a ‘resource for enacting social relations, including
In terms of the study limitations, it should be noted that the tenor analysis is based on data from a specific subset of Chatbot Arena users, which may not fully represent larger user populations or usage scenarios. In addition, the Vicuna-13B model accounts for approximately 49% of the dataset, indicating a noticeable imbalance in the model distribution. Factors such as demographic characteristics, cultural differences, and domain-specific expectations could significantly affect the results. In addition, it should be noted that the research is limited to English-language data, highlighting the need for further examination of chatbot interactions in other languages.
If we were to anthropomorphise the chatbots, they appear to attempt to socialise users into specific values through a quasi-pedagogic discourse, instructing them on ethical and unethical behaviours. At the same time, the chatbots enact a rather self-righteous persona in their refusals. This sanctimony can be traced to the combination of tenor resources that they tend to draw on: in terms of
Nevertheless, this article has aimed to explore how these reconstructions can be analysed and understood in terms of their language choices, despite the inherent differences in how meaning is generated and interpreted by humans and machines. By examining specific examples of chatbot interactions, we can better understand the limitations and ethical considerations of AI-generated conversational moves. This has significant implications for the deployment of chatbots in fields such as customer service, education, and mental health support, where tenor relations are higher stake and where understanding and empathy are crucial. The way chatbots express moral judgements has significant implications not just in terms of preventing harm but in terms of their potential to influence users positively and to help them make ethical decisions. For instance, there is currently work in the domain of automated dialogue systems aiming to create system which aim ‘to change people’s opinions and actions for social good’ and to encourage them into behaviour such as donating money to charities (Wang et al., 2019). Whatever use they are put to, chatbot interactions with humans will create new permutations in the social relations we all must inevitably negotiate.
Footnotes
Data availability
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethics approval
Ethics approval is not required as this project analyses an existing data archive (LMSYS-CHAT-1M) which was not constructed by the researcher and which was collected using an informed consent procedure.
