Abstract
Advancements and deployments of AI-based systems, especially Deep Learning-driven generative language models, have accomplished impressive results over the past few years. Nevertheless, these remarkable achievements are intertwined with a related fear that such technologies might lead to a general relinquishing of control over our lives to AIs. This concern, which also motivates the increasing interest in the eXplainable Artificial Intelligence (XAI) research field, is mostly caused by the opacity of the output of deep learning systems and the way that it is generated, which is largely obscure to laypeople. A dialectical interaction with such systems may enhance the users’ understanding and build a more robust trust towards AI. Commonly employed as specific formalisms for modelling inter-agent communications, dialogue games prove to be useful tools to rely upon when dealing with users’ explanation needs. The literature already offers some dialectical protocols that expressly handle explanations and their delivery. This paper fully formalises the novel Explanation–Question–Response (EQR) dialogue and its properties, whose main purpose is to provide satisfactory information (i.e., justified according to argumentative semantics) whilst ensuring a simplified protocol, in comparison with other existing approaches, for humans and artificial agents.
Keywords
Introduction
The surging interest in Large Language Models (LLMs) rests upon the ambition of programming machines capable of communicating like human beings. Generally, language models (LMs) tackle this challenge by studying the generative likelihood of word sequences so as to predict the probabilities of subsequent tokens in such sequences. LLMs can be identified as the latest stage of a progressive development in language model research [96]: starting from statistical LMs (e.g., n-gram models and Markov chains) and moving through neural LMs (e.g., recurrent neural networks), a significant milestone has been achieved with the introduction of the pre-trained LM paradigm (mostly based on the concurrent rise of Transformer models [84]). Scaling up those pre-trained models resulted in LLMs, i.e., large(-sized pre-trained) language models displaying surprising ability in solving complex tasks [96]. The famous family of ChatGPT systems is a noteworthy application of LLMs’ ability to produce human-level conversations.
Nevertheless, according to the study conducted by Hinton and Wagemans [42], the output of generative large language models such as GPT-3 [14] (one of the foundational models upon which ChatGPT is based) does not provide satisfactory argumentative replies. The authors reach this conclusion after a thorough application of the Comprehensive Assessment Procedure for Natural Argumentation (CAPNA) protocol [41]. GPT-3 is able to produce different argument types (thus identifying common human dialectical patterns), but it fails when it comes to establishing their acceptability, mostly generating fallacious arguments. The entailed consequence is that the capability of arguing, intended as an exchange of reasoning between intelligent entities, should be learnt by AIs if they are to do more than just acquire and repeat information. Overall, there is a current urge to provide clear explanations about what drives AIs’ decisions. This is also advocated by the technical report that OpenAI released about GPT-4’s performance: “Despite its capabilities, GPT-4 has similar limitations to earlier GPT models: it is not fully reliable (e.g. can suffer from “hallucinations”), has a limited context window, and does not learn from experience. Care should be taken when using the outputs of GPT-4, particularly in contexts where reliability is important” [68]. In particular, the same authors encourage more investigation into AIs’ explainability given their current nature as “black-box” models. OpenAI researchers have already started to face the challenge of explaining single neurons within a deep learning network; however, there are several limitations that should be addressed in the future [9].
XAI is the field that studies how to improve artificial intelligence models’ interpretability by analysing different tools and strategies that provide adequate explanations (i.e., that satisfy specific desirable properties). Interestingly, different works propose an account of explanations that is primarily argumentative [1,24]. Similarly, the survey of [83] concludes that using argumentation to justify why an event started, or what led to a decision, can enhance explainability. These intuitions are also backed by [60], where it is suggested that AI systems should adopt an argumentation-based approach to explanations consisting of dialogue protocols characterising the interactions between an explainer and an explainee. Such a dialectical interplay, sketched in [60] as the Explanation–Question–Response (EQR) dialogue, will be developed here into a fully-fledged formal model. Combined with LLMs or any other AI-driven model, the EQR protocol would provide an informative method to deliver deliberated explanations to end-users, while also ensuring detailed replies to follow-on queries. Indeed, the underlying computational argumentation engine supplies strong rationales that justify the given explanations.
The paper is structured as follows. Section 2 covers the background topics leveraged throughout the article, whilst Section 3 provides a detailed description of the EQR protocol, which greatly extends that of [16,60], including syntax, semantics and an overview of the involved utterances, and lays the groundwork for the full formalisation (connected with the extension-based approach of computational argumentation) that is given in Section 4. Section 5 discusses and analyses the results achieved and compares them with related works before examining potential future lines of research. Lastly, Section 6 concludes the paper with a summary of the contents and the main findings.
Contributions
The main contribution of this study is the development of the complete formal model of the EQR protocol, based on the idea from [16,60], and its theoretical analysis. Though the idea of an EQR dialogue is not new, all [60] provides is the basic concept of a dialogue that conveys an
Background
In this section, we outline the main theoretical subjects employed for the generation of the EQR formalism and its main features. As we will see, computational argumentation and dialogue games can be combined to serve as a powerful XAI tool.
Computational argumentation
A promising paradigm for modelling reasoning in the presence of conflict and uncertainty, computational argumentation has come to be increasingly central as a core study within Artificial Intelligence [6]. According to such theory, in order to determine if a piece of information is acceptable (which, roughly speaking, means that it can be treated as if it is true given the current state of an agent’s beliefs), it will suffice to prove that the argument, in which the considered information is embedded, is justified (which, roughly speaking, means that it is not seriously contested) under specific semantics. Dung’s abstract argumentation framework (AF) [29] is the most widely adopted formalism for determining the acceptability of arguments. In a nutshell, an argument is justified (i.e., acceptable) only if it is defended against any attacks from counterarguments.
(AFs, and Dung’s Semantics)
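An argumentation framework is a pair AF = ⟨AR, attacks⟩, where AR is a set of arguments and attacks ⊆ AR × AR is a binary relation on AR. A set S ⊆ AR is conflict free iff no two arguments in S attack each other, and an argument X ∈ AR is acceptable with respect to S iff for every Y ∈ AR that attacks X there is some Z ∈ S that attacks Y.
A conflict free set S is an admissible extension iff every X ∈ S is acceptable with respect to S;
An admissible extension S is a complete extension iff X ∈ S whenever X is acceptable with respect to S;
The least complete extension (with respect to set inclusion) is called the grounded extension;
A maximal complete extension (with respect to set inclusion) is called a preferred extension.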
In the previous definition, we could describe Z as the argument that defends X, thus granting its acceptability. Given sufficiently large AFs, indirect defences might also occur. That is to say, given a sequence of acceptable arguments ending with X and starting with A, it can be the case that X (indirectly) defends A.
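As a minimal sketch of these notions, the following Python fragment encodes a toy framework in which Y attacks X and Z attacks Y, so that Z defends X. The function names, the encoding of the attack relation as a set of ordered pairs, and the example framework itself are assumptions made purely for illustration; the brute-force enumeration of preferred extensions is only viable for very small frameworks.

```python
from itertools import combinations


def attackers(arg, attacks):
    """Return every argument that attacks `arg`."""
    return {a for (a, b) in attacks if b == arg}


def conflict_free(s, attacks):
    """No member of s attacks another member of s."""
    return not any((a, b) in attacks for a in s for b in s)


def acceptable(x, s, attacks):
    """x is acceptable w.r.t. s if each attacker of x is attacked by some member of s."""
    return all(any((z, y) in attacks for z in s) for y in attackers(x, attacks))


def admissible(s, attacks):
    """Conflict-free set whose members are all acceptable w.r.t. the set itself."""
    return conflict_free(s, attacks) and all(acceptable(x, s, attacks) for x in s)


def grounded_extension(args, attacks):
    """Iterate 'collect all acceptable arguments' from the empty set to a fixed point."""
    s = set()
    while True:
        nxt = {x for x in args if acceptable(x, s, attacks)}
        if nxt == s:
            return s
        s = nxt


def preferred_extensions(args, attacks):
    """Maximal admissible sets, found by exhaustive enumeration (tiny examples only)."""
    adm = [set(c) for n in range(len(args) + 1)
           for c in combinations(sorted(args), n) if admissible(set(c), attacks)]
    return [s for s in adm if not any(s < t for t in adm)]


# Toy framework (illustrative assumption): Y attacks X, Z attacks Y.
ARGS = {"X", "Y", "Z"}
ATTACKS = {("Y", "X"), ("Z", "Y")}

print(grounded_extension(ARGS, ATTACKS))
print(preferred_extensions(ARGS, ATTACKS))
```

In this toy framework, X alone is not admissible, since nothing in {X} counters Y, whereas adding Z yields an admissible set. With two mutually attacking arguments the grounded extension would instead be empty, while each argument on its own would form a preferred extension.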
(Indirect defence and attack)
Let a b a b
Notice that an unattacked argument is, trivially, defended and indirectly defended by itself. A similar recursive definition can be described such that X indirectly attacks an argument A if:
Dung’s AF can be extended such that ‘preferences’ can be taken into account as well. It can indeed be useful to have a way of deciding, among two or more conflicting arguments, which ones are preferred and, hence, which attacks will succeed as defeats. This leads to the formal definition of defeats:
(Defeats)
Let
Overall, abstract AFs represent general frameworks capable of providing argumentative characterisations of non-monotonic logic. That is to say, given a set of formulae Δ of some logical language L, AFs can be instantiated by such formulae. The conclusions of justified arguments defined by the instantiating Δ are equivalent to those obtained from Δ by the inference relation of the logic L. These instantiations paved the way for a plethora of different studies concerning the so-called ‘structured’ argumentation (as opposed to the abstract approach) [7,66,81].
Dialogue games
The view of computation as distributed cognition and interaction [53] led to the rise of multi-agent systems, where agents are software entities with control over their own execution. This new paradigm required the design of means of communication between such intelligent agents [59]. The choice fell upon formal dialogues, due to their potential expressivity despite still being subject to specific restrictions. Dialogue games are rule-governed interactions among players (i.e., agents) that take turns in making utterances (i.e., moves) following the protocol of the game.
Types of dialogue
Dialogue games are commonly categorized according to elements such as: what the participants know, what the participants seek to get from the dialogue, and what the dialogue rules are intended to bring about [11]. The following is an extended list (with no ambition of being exhaustive) of the standard dialogue types presented in [89]:
Information-Seeking: one participant seeks the answer to some question(s) from another participant, who is believed by the first to know the answer(s). On the other hand, its complementary Information-Giving dialogue is characterised by a protocol that focuses on the participant who gives the response to the requesting interlocutor (e.g., [44]).
Inquiry: the participants collaborate to answer some question(s) whose answers are not known to any one participant (e.g., [10]).
Persuasion: one participant seeks to persuade another to accept a proposition she does not currently endorse. This can mean that the persuadee holds the opposite or is agnostic about the position put forward by the persuader (e.g., [70]).
Negotiation: the participants bargain over the division of some scarce resources. If a negotiation dialogue terminates with an agreement, then the resource has been divided in a manner acceptable to all participants (e.g., [62]).
Deliberation: the participants collaborate (hence, share the responsibility) to decide what action or course of action should be adopted in some situations. Appeals to value assumptions, such as goals and preferences, may influence the agents’ deliberation (e.g., [55]).
Eristic: the participants quarrel verbally as a substitute for physical fighting, aiming to vent perceived grievances (e.g., [13]).
Verification: one participant seeks the answer to some question from another agent. The first participant wants to verify if the second believes that p (i.e., the proposition with which the dialogue is concerned) is true (e.g., [22]).
Query: one participant always challenges the answer about p from another agent. The first participant’s interest lies more on the second’s arguments rather than if the second agent believes p or not (e.g., [22]).
Command: one participant tells another what to do. If challenged, instructions may be justified, possibly by referencing further actions which the commanded action is intended to enable (e.g., [34]).
Education: one participant wants to teach another something. Unlike information-seeking dialogues, in education dialogues the tutor, or asking agent, does know the answer to the question she is posing (i.e., she is quizzing the learner) (e.g., [79]).
Discovery: a new idea arises out of exchanges between participants. Unlike inquiry dialogues, here the focus is on the discovery of something not previously known. The question of whose truth is to be ascertained may (or not) emerge in the course of the dialogue (e.g., [56]).
Dialogue combinations and control layers
In general, a dialogue game can be composed of multiple mixtures of dialogues, each of which might be of a different type. Drawing from the classification detailed in [57], we can identify the combination patterns listed in Table 1. The selection and transitions between different dialogue types can be rendered via a Control Layer [57], defined in terms of atomic dialogue-types and control dialogues. The first element is based upon a finite set of dialogue-types. Control dialogues, instead, are dialogues that have as their discussion subjects not topics but other dialogues. They include the so-called Commencement and Termination Dialogues in charge of opening (respectively, closing) the subject dialogue, thus contributing to the management of dialogue combinations and their transitions.
Dialogue combinations. These are familiar from computer programming; specifically, they are the operations used to model computer programs in dynamic logic.
Following the study outlined in [59], we can now summarize the three main features of formal dialogues: syntax, semantics and pragmatics.
Syntax. The syntax of a language prescribes instructions on how to form words, phrases and their combinations. Similarly, determining the syntax of a dialogue game involves the specification of the utterances available to the agents and the rules that govern the interactions among such utterances. In addition, it is standard to consider utterances as composed of (1) an inner layer comprising the topics of discussion and (2) an outer (or wrapper) layer comprising the locutions.
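The two-layer view of utterances can be illustrated with a short Python sketch. The three locution names, the reply rules and the sample topic below are hypothetical placeholders chosen only to make the structure visible; they are not the locutions of any specific protocol discussed in this paper.

```python
from dataclasses import dataclass
from enum import Enum


class Locution(Enum):
    """Outer (wrapper) layer. These three names are hypothetical placeholders."""
    ASSERT = "assert"
    CHALLENGE = "challenge"
    CONCEDE = "concede"


@dataclass(frozen=True)
class Utterance:
    speaker: str
    locution: Locution  # outer layer: the kind of move being made
    topic: str          # inner layer: the content under discussion


# Hypothetical interaction rules: which locutions may answer which.
REPLY_RULES = {
    Locution.ASSERT: {Locution.CHALLENGE, Locution.CONCEDE},
    Locution.CHALLENGE: {Locution.ASSERT, Locution.CONCEDE},
    Locution.CONCEDE: set(),
}


def is_legal_reply(previous: Utterance, candidate: Utterance) -> bool:
    """A reply is legal if the speakers alternate and the rules allow the locution."""
    return (candidate.speaker != previous.speaker
            and candidate.locution in REPLY_RULES[previous.locution])


claim = Utterance("agent_a", Locution.ASSERT, "it will rain tomorrow")
doubt = Utterance("agent_b", Locution.CHALLENGE, "it will rain tomorrow")
print(is_legal_reply(claim, doubt))
```

Keeping the topic separate from the locution means that the same content can be asserted, challenged or conceded without changing its representation, which is what allows the interaction rules to be stated over locutions alone.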
Semantics. Research concerning dialogue games is at a crossroads between multiple fields of study. Indeed, the interplay among participants in the dialogue is a form of communication that draws from human linguistics knowledge. However, the language must also be necessarily formal and interpretable by computers (an issue tackled by research such as [91]). It might then be helpful to consider different types of semantics according to the specific focus, and final deployment, of the dialogue.
Axiomatic: defines each locution in terms of its pre- and (possibly) post-conditions. Pre-conditions identify what must exist before the locution can be uttered, and post-conditions determine the consequences of such utterance. Public axiomatic approaches enable access to all conditions from each agent in the dialogue, whereas private axiomatic approaches restrict such access to a smaller subset.
Operational: considers each locution as a computational instruction that operates successively on the states of some abstract machine. That is to say, it interprets these locutions as commands in some computer programming language.
Denotational: assigns, for each element of the language syntax, a relationship to an abstract mathematical entity, its denotation.
While the dialogue unfolds, agents usually incur commitments. That is to say, a speaker asserting the truth of a statement may be committed either to justifying that statement (even if it does not correspond to their real beliefs) against opponents’ challenges or to retracting the assertion. The commitments of all the agents are then tracked and stored in a public database, called a commitment store. This position adopts Hamblin’s understanding of commitments as purely dialectical obligations [37]. Walton and Krabbe instead consider commitments as obligations connected to a course of action, a paradigm that also subsumes dialectical commitments: “
Pragmatics. Pragmatics deals with those aspects of the language that do not involve considerations about truth and falsity. Such aspects usually include the illocutionary force of the utterances along with speech acts, i.e., non-propositional utterances intended to or perceived to change the state of the world. More precisely, drawing from the analysis of [35] based on relevant literature on the topic, such as Austin’s and Searle’s works [3,75,76], we can define speech acts as ‘verbal actions’ that accomplish something. Locution would correspond to the simple performance of an utterance, whereas illocution would be the actual intention of the speaker behind the locution meaning. For example, the sentence “You’re standing on my foot” uttered in a crowded place is a statement (locution) with the illocutionary force of a command (that is to say, the real meaning is the imperative “move away”).
One last important aspect considered by the dialogue literature regards the so-called ‘burden of proof’. Multiple authors have investigated the matter and proposed different definitions. For instance, according to Walton [88], the burden of proof is “an allocation made in reasoned dialogue which sets a strength (weight) of argument required by one side to reasonably persuade the other side”, whereas van Eemeren and Grootendorst [30] described it as occurring when “a party that advances the [dialogue] standpoint is obliged to defend it if the other party asks him to do so”. In general, we could say that participants in a dialogue incur a burden of proof when declaring a proposition as their thesis, thereby compelling them to offer evidence or backing when such a thesis is challenged. In an evenly matched dispute, where the plausibility of the participants’ theses is balanced, any new argument moved may tilt the burden of proof. Nevertheless, in some specific circumstances, the burden of proof can be much heavier on one particular side. As an example, consider any criminal trial: the prosecutor must prove guilt “beyond reasonable doubt” to win her case, which means that she bears a greater encumbrance than her counterpart who does not have to show that their client did not commit the crime, but merely that there is reasonable doubt that their client did so. The notion of burden of proof may be considered as a dialectical obligation which is ‘stronger’ than the previously examined (‘weaker’) commitments. Indeed, while the latter always occurs in a dialogue, this is not the case for the former, as Walton concluded: “If there is no thesis to be proved or cast into doubt
While computational arguments can be embedded in any dialogue game’s inner syntactic layer to handle the current topic, argument semantics can justify the rationale behind each utterance. On the other hand, leveraging dialogue game protocols and their combinations allows for the generation of efficient strategies on a variety of subjects, which proves to be a useful feature to exploit for the eXplainable AI (XAI) research field.
Explainable AI
The design and implementation of tools capable of enhancing artificial intelligence models’ interpretability, thus addressing the well-known opacity of their black-box algorithms, constitutes the core focus of XAI. The underlying idea revolves around the possibility of generating exhaustive explanations disclosing salient information about systems’ operations [5]. Notably, Article 22 of the General Data Protection Regulation (GDPR) introduces the right to obtain an explanation of the inferences produced by automated decision-making models. The necessity to abide by this new regulation contributes to making explainability a current hot topic in the AI research landscape. Nevertheless, there is still no consensus on a unique definition of explanation [85], especially because most scholars seem to be influenced by their subjective intuitions of what an XAI approach should entail. To help clarify essential aspects of explanations by drawing from social science studies, Miller [63] identified contextuality (which is the product of merging different explanatory features, e.g., selectivity and causality of information) as the most relevant factor. Similarly, Bex and Walton [8] view explanations as speech acts used to help understand something. Gunning et al. [36] focus instead on pinpointing the current main issues regarding XAI and present them in a list that includes challenges such as: accuracy vs interpretability, the use of abstractions to simplify explanations, or the preference for competencies over decisions as core elements of information delivery. Another problem is related to the requirement of tailoring explanations to the end-user who is interacting with the system. From this perspective, consider that explanations can also be provided in a dialogical form: given an initial reason, additional information and answers to follow-on queries may be delivered while the dialectical interaction unfolds.
This enables a collaborative process where the explainer is capable of determining what information it is that the user wants (i.e., tailored to the user’s needs). Furthermore, a study from Lakkaraju et al. argues that decision-makers largely prefer interactive (dialectical) explanations such that: “natural language dialogues for explainability could enhance the [AI] model’s understanding with greater ease than current one-off explanations” [50]. The EQR protocol herein presented will follow precisely this explanatory strategy.
Desirable properties of explanations
A plethora of research from different scholars has studied the general properties that effective explanations should fulfil. For simplicity, we will single out the most intuitive (and, we claim, more desirable) of such features:
Exhaustivity
Selectivity
Transfer of understanding
Contextuality
As we are going to see, the dialogical model fleshed out in the following sections enables explanatory interactions that satisfy all of the above properties. Notice that we also selected such features due to their extensive scope, which makes them suited for any kind of explanation. Nevertheless, different researchers may prefer to focus on more specific aspects of the XAI procedure (e.g., post-hoc explanations for AI-assisted human decisions [45], principles of interactive explanations via natural language interactions [50], etc.), thus identifying diverse (and narrower) properties from the ones we defined.
We have already mentioned how evaluation and assessment of explanations are particularly suited to be modelled as dialogical interactions between an explainer, i.e., an agent capable and willing to answer questions concerning the explanation, and an explainee, i.e., an agent seeking to determine the validity of such answers. The research conducted in [16,60] suggests that a new type of dialogue denoted as Explanation–Question–Response (EQR) might be helpful for explanation, and sketches it as being halfway between a persuasion, an information-giving/seeking and a query dialogue, without the need for any complicated shifting formalism (such as the Control Layer) that would account for different simultaneous discussions taking place. As such, the EQR dialogue is engineered to provide a simple and efficient way to capture multiple kinds of dialectical interactions that might occur when the topic revolves around the explanation of an issue. In the following sections, we comprehensively describe one possible Explanation–Question–Response dialogue that fits the high-level description from [60], detail a formal account of this dialogue, along with a protocol, and prove that it provides explanations in a specific, technical, sense.
EQR dialogue syntax
Each utterance of an EQR dialogue presents two syntactic layers: (i) an innermost layer in which the contents of the utterances are expressed in a formal way through propositional logic; (ii) an outermost layer which expresses the locutionary force of the single utterances. We, therefore, denote as ‘arguments’ the components of the former (that, indeed, can be rendered as computational arguments as per Section 2.1), whereas the outermost wrapping layer can be represented by listing all the possible locutions of the dialogue as detailed in the first column of Table 2 and Table 3. In particular, Table 2 depicts a series of structural locutions (such as
Locutions to control the dialogue.
Locutions to unfold the dialogue.
Resolution of conflicts. While the dialogue unfolds, different kinds of conflicts may occur, each of which entails different possible resolutions. Such conflicts depend upon a specific class of dialectical attacks listed as [2]:
Assuming that every participant of the EQR dialogue pre-emptively agrees on the involved ontology (i.e., state of the world, language and logic employed), it is then possible to identify two forms of resolution: (a) value-preferred defeat or (b) rational disagreement. Both types of resolution require a means for evaluating defeats according to the ranked-value order of the players. For this purpose, it is thus possible to employ any computational argumentation theory capable of handling defeats. The rational disagreement is then formalised via the generation of two different (and conflicting) admissible extensions, each of which is related to the preference of one (team of) player.
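The difference between the two forms of resolution can be sketched in Python as follows. The sketch assumes the common convention that an attack from x on y succeeds as a defeat unless y is strictly preferred to x; this convention, the encoding of preferences as ordered pairs, and the two rankings are illustrative assumptions rather than the definitions adopted by the protocol.

```python
def defeats(attacks, strictly_preferred):
    """Keep attack (x, y) as a defeat unless y is strictly preferred to x (assumed convention)."""
    return {(x, y) for (x, y) in attacks if (y, x) not in strictly_preferred}


def admissible(s, defeat_rel):
    """Conflict-free and self-defending with respect to the given defeat relation."""
    conflict_free = not any((a, b) in defeat_rel for a in s for b in s)
    defended = all(
        any((z, y) in defeat_rel for z in s)
        for (y, x) in defeat_rel if x in s
    )
    return conflict_free and defended


# Two arguments that attack each other.
ATTACKS = {("A", "B"), ("B", "A")}

# Hypothetical rankings: each pair (p, q) reads "p is strictly preferred to q".
PRO_PREFS = {("A", "B")}
OPP_PREFS = {("B", "A")}

pro_view = defeats(ATTACKS, PRO_PREFS)  # only A defeats B
opp_view = defeats(ATTACKS, OPP_PREFS)  # only B defeats A

print(admissible({"A"}, pro_view), admissible({"B"}, pro_view))
print(admissible({"B"}, opp_view), admissible({"A"}, opp_view))
```

When both teams share the same ranking, a single defeat relation results and the conflict is settled by a value-preferred defeat. When the rankings diverge, as above, each team obtains its own admissible set and the two sets conflict, which corresponds to a rational disagreement.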
An axiomatic semantics for the EQR dialogue presents the pre-conditions necessary for the legal utterance of each locution under the protocol, and any post-conditions arising from their legal utterance (second and third columns of Table 2 and Table 3). Such pre- and post-conditions influence the commitment store of each agent participating in the dialogue. These commitment stores are intended according to Hamblin’s definition [37], i.e., public statements that the agents have to defend in the dialogue (unless withdrawn), but they might not correspond to the agent’s real beliefs or intentions.
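A minimal Python sketch of how pre- and post-conditions can act on a public commitment store is given below. The two locutions (`utter_claim` and `utter_withdraw`), their conditions and the agent name are hypothetical stand-ins chosen for illustration; they do not reproduce the entries of Table 2 and Table 3.

```python
class CommitmentStore:
    """Public record of the statements each agent is currently committed to."""

    def __init__(self):
        self._commitments = {}  # agent name -> set of statements

    def of(self, agent):
        """Read-only view of an agent's current commitments."""
        return frozenset(self._commitments.get(agent, set()))

    def add(self, agent, statement):
        self._commitments.setdefault(agent, set()).add(statement)

    def remove(self, agent, statement):
        self._commitments.get(agent, set()).discard(statement)


def utter_claim(store, agent, statement):
    """Hypothetical locution. Pre-condition: the agent is not yet committed to the statement."""
    if statement in store.of(agent):
        raise ValueError("pre-condition failed: already committed")
    store.add(agent, statement)  # post-condition: the commitment is recorded


def utter_withdraw(store, agent, statement):
    """Hypothetical locution. Pre-condition: the agent is committed to the statement."""
    if statement not in store.of(agent):
        raise ValueError("pre-condition failed: nothing to withdraw")
    store.remove(agent, statement)  # post-condition: the commitment is dropped


store = CommitmentStore()
utter_claim(store, "explainer", "X")
print(store.of("explainer"))
```

Because the store only records what has been publicly uttered, nothing in the sketch refers to the private beliefs of the agents, in line with the purely dialectical reading of commitments.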
Turn structure and winning conditions
Having in mind the semantic pre- and post-conditions of each locution of the EQR dialogue protocol, we can informally identify the ordered sequence of locutions that distinguishes every turn by a player. We can determine two parties of agents (which can also be composed of one element each) playing the dialogue: the proponent team (PRO), i.e., the explainer agent/s that advance(s) the explanation (say, X) which it is willing to back with additional information; the opponent team (OPP), i.e., the explainee agent/s trying instead to challenge the proponent’s statement. The goal of OPP is to successfully attack X, the initial argument moved by PRO which, in turn, has to counter every such attack via elucidating replies. Notice that the purpose of the explainee is to retrieve data and understand the rationale behind the received explanation X (recall that an EQR dialogue is a mixture of persuasion, information-giving/seeking and query dialogues) rather than suggesting its own view on the subject. The ordered sequence of locutions can then be depicted as in Figure 1 and summarized as follows:
PRO is the first to play. Its first turn will consist only in the utterance of the locution
OPP is the second to play. Its first turn is characterised only by the utterance of either
A turn can finish only after:
(PRO’s turn) an attack has been moved, a question has been asked or a statement has been uttered. That is to say, if a locution
(OPP’s turn) an attack has been moved, a question has been asked or a statement has been uttered. That is to say, if a locution
The team to whom the attack, question or response of the previous turn was addressed must begin its current turn with the locution
No player can perform more than one locution per turn except for the ones designed to control the dialogue (Table 2).

Ordered sequences of locutions describing the turns of each player. The dashed arrows denote moves that must be performed during the first turn of each participant only (e.g., in all the subsequent turns, the player will start from the locution
Winning conditions. In an EQR dialogue, where we need to assess the reliability of an explanation, the ‘burden of proof’ lies on PRO, the explainer. Indeed, it is the proponent who needs to show the validity of its initial argument and persuade its contender via compelling reasons, while for OPP it suffices to successfully attack or question it such that PRO cannot respond with other than a
The proponent wins if the opponent leaves the dialogue. PRO has countered every possible attack/answered every possible question moved by the contender party which is now persuaded about the validity of the initial argument X. This means that OPP has uttered the locution
The opponent wins if the proponent leaves the dialogue. OPP successfully attacks/inquires the initial argument X of the contender party raising at least one doubt about its validity. This means that PRO has uttered the locution
The utterance of
In this section, we provide a formal description of an EQR protocol, and prove some of its properties.
Protocol definition
Before defining the protocol, it may be useful to briefly outline the role of each locution and determine their functions within the EQR dialogue frame. In particular, most of them can be rendered as challenges towards other uttered arguments.
This locution aims at asking the ground on which it is the case that ‘something’ (say a) and not otherwise. As such, it can be seen as an argument attacking another argument on a. Except with PRO’s first turn (in which Intuitively, this locution denotes a refutation against the ‘something’ (say a) it is addressed. As such, it can be straightforwardly seen as an argument attacking another argument on a. The same reasoning can be applied to
The role of locutions such as
In the next sections, we are going to employ the following notation. The dialogue locutions and their conveyed arguments will be formally denoted by
Finally,
To clarify, let us consider a brief conversation as an example, and label each utterance with the corresponding notation of Remark 1. For simplicity, we are omitting the control locutions of Table 2.
The following natural language interaction can be thought of as a trivial instance of an EQR dialogue where the explainer (PRO) tries to justify their initial argument about the weather forecast.
Suppose the proponent decides to believe the opponent’s statement about Channel 717 (hence uttering the
We now have all the elements to formally introduce the dialogue protocol. Similar to a list of instructions, this protocol determines the legal moves that can be performed by the participants. The conversation unfolds as a result of the legal arguments uttered and terminates when there are no more valid moves available. When this happens, the status of the initial explanation X is evaluated.
Let PRO moves the first set of locutions, i.e., (a) (b) (c) OPP moves the second set of locutions, i.e., (a) (b) (c) If A generic PRO’s turn, i.e., (a) (b) PRO chooses one among:
(c) According to the choice of point (4.3(b)), PRO selects one among:
(d) No unattacked argument (e) A generic OPP’s turn, i.e., (a) (b) OPP chooses one among:
(c) According to the choice of point (4.4(b)), OPP selects one among:
(d) There exists an unattacked argument (e) Every argument (f) The turn of the first team having no more locutions (a) (b) (c)
Definition 4 formalises the moves available to each team of participants during an EQR dialogue. PRO starts the game and utters a specific set of locutions [(4.0)], after which it will be OPP’s turn to move [(4.1)]. Then, the two teams will alternate, uttering ordered lists of locutions in accordance with the previous moves and phases of the dialogue [(4.3), (4.4)]. Notice that both PRO and OPP will have to abide by the respective relevance conditions. These are rules that force the teams to change the (temporary) outcome of the game to their advantage, thus avoiding any detour. That is to say, at the end of its turn, PRO must have (directly or indirectly) defended the initial argument
The dialogue concludes as soon as one of the two teams runs out of available moves [(4.5)]. At this point, PRO will be proclaimed the winner if OPP leaves the dialogue first by uttering the locutions
The termination of the dialogue leaves us with a number of arguments that, along with the existing attacks, can be semantically evaluated. Since those arguments can be considered as being members of an AF (as defined in Section 2.1), we can thus prove the following result:
Theorem 1 (Soundness and Completeness)
Let
By Definition 4 (in particular (4.3(b–d))), Arg is defended by PRO’s arguments, which, in turn, are indirectly defended by undefeated PRO’s arguments. Assume that the set Suppose that
Theorem 1 proves the semantic connection that exists between EQR dialogues and computational argumentation, which provides an additional formalism to establish the rationale behind PRO’s initial explanation and the subsequent arguments defending it. Furthermore, such an equivalence allows us to evaluate the EQR dialogue moves using any proof theory, algorithmic procedure, or methodology semantically associated with computational argumentation. In other words, because the protocol is defined in terms of high-level argumentation concepts like Dung’s notions of attacks and admissibility, Theorem 1 will hold for any argumentation representation that respects these notions. Thus, for example, were we to implement agents equipped with knowledge in the form of ASPIC+ statements [66], and to construct arguments and identify attacks using ASPIC+, then an EQR dialogue between the agents would be sound and complete in the sense of Theorem 1.
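To make the notion of admissibility underlying Theorem 1 more concrete, the following Python sketch checks whether a set of arguments is admissible in an abstract AF, using the standard definitions: a set is conflict-free if no member attacks another member, and it defends an argument if every attacker of that argument is attacked by some member of the set. The argument names, the attack relation and the function names are invented for illustration and do not come from the EQR formalisation.

```python
from typing import Set, Tuple

Attack = Tuple[str, str]  # (attacker, target)


def conflict_free(s: Set[str], attacks: Set[Attack]) -> bool:
    """True if no argument in s attacks an argument in s."""
    return not any((a, b) in attacks for a in s for b in s)


def defends(s: Set[str], attacks: Set[Attack], arg: str) -> bool:
    """True if every attacker of arg is attacked by some member of s."""
    attackers = {a for (a, b) in attacks if b == arg}
    return all(any((d, a) in attacks for d in s) for a in attackers)


def admissible(s: Set[str], attacks: Set[Attack]) -> bool:
    """True if s is conflict-free and defends each of its members."""
    return conflict_free(s, attacks) and all(defends(s, attacks, x) for x in s)


# Toy AF (invented): X is the initial explanation, B attacks X, C attacks B.
attacks = {("B", "X"), ("C", "B")}

print(admissible({"X", "C"}, attacks))  # True: C defends X against B
print(admissible({"X"}, attacks))       # False: X is left undefended against B
```

In this toy framework, X can only be part of an admissible set together with its defender C, which mirrors the idea that the initial explanation must be (directly or indirectly) defended against the attacks raised during the dialogue.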
Having fleshed out the EQR formal protocol, we can now illustrate how an interaction ensuing from the unfolding of such a dialogue results in an explanatory interplay that satisfies the desirable properties outlined in Section 2.3.1, as demonstrated in the following:
Any instantiation of the EQR dialogue protocol characterised by the victory of PRO enjoys the exhaustivity, selectivity, transfer of understanding and contextuality properties of explanation.
This result emphasises the suitability of an EQR dialogue as a formal tool capable of conveying relevant (selectivity), user-friendly (contextuality) explanations based on a shared comprehension (transfer of understanding) whilst covering each significant aspect of the procedure (exhaustivity).
Related work. In the literature, there are a number of other explanation protocols which have some similarity with the EQR dialogue protocol. One of the older examples may be found in the works of Walton [87] and the subsequent joint research from Bex and Walton [8] where the authors design a dialogue and detail a complete list of its locutions. However, to evaluate the explanation that this protocol provides, the explainee needs to resort to a different dialogue protocol (denoted an “examination”). A similar, multi-protocol, approach is adopted by Madumal et al. [54]. They devise a study for modelling explanation dialogues by following a data-driven approach in which the resulting formalisation embeds (possibly several) argumentation dialogues nested in the outer layer of the explanation protocol. The dialogue structure proposed by Sassoon et al. [73] in the context of explanations for wellness consultation also exploits multiple dialogue types (e.g., persuasion, deliberation and information seeking) and their respective protocols whilst mostly focusing on the course of action to undertake. All of these approaches differ from the EQR dialogue, both in the sense that the EQR protocol is partway between persuasion, information-giving/seeking and query, and also in the sense that we believe the EQR protocol more comprehensively incorporates locutions for handling each of these tasks without the need for adopting a Control Layer (Section 2.2.2) or switching between protocols. This allows for a simpler formalisation and ensures a closer approximation to real-world dialogues. Less directly related to EQR is the formalism proposed by Dennis and Oren [27], where another theoretical analysis of explanations is conducted. Here the interactions focus on BDI agents and the paper outlines properties strictly related to the introduced protocol rather than those that are scalable to general explanations like the features enjoyed by EQR.
Other relevant work on argumentation-based explanations is provided by the studies of Fan and Toni [32], and Shams et al. [77]. These mainly focus on explanations whose justification is rendered through argumentation semantics via specific dialogue formalisations, and there is thus an equivalence between the dialogues and the argumentation semantics. The EQR protocol enjoys a similar equivalence with argumentation semantics (Theorem 1) whilst also generating explanations that satisfy the aforementioned properties (Section 2.3.1). Another interesting approach consists of the formal protocol presented by Buisman [15], who outlines a dialogue hinging on the delivery of tailored explanations to target audiences. To ensure personalised interactions, a variety of purposely designed locutions is added to the allowed moves list. Adopting a different method, with the EQR formalisation, we tried instead to create a simple protocol: we argue that few locutions suffice to unfold the dialogue and provide appropriate explanations for diverse end-users thanks to the selection of the conveyed argument. Finally, we note that there are similarities between the work of [79], on education dialogues, and EQR. These similarities are largely in terms of the intuitions behind both formalisms, since the two approaches differ in substantial respects. [79] presents three variants characterising the whole spectrum of possible interactions between a tutor and a learner: the tutor either (a) quizzes or (b) refines her perception about the learner, whereas the learner asks a clarifying question to the tutor (c1) or another learner (c2). Given these categories, we could consider (c1) as an instance of an information-seeking (respectively information-giving) protocol, while (c2) would better depict an inquiry kind of dialogue.
On the other hand, (b) mirrors a query protocol, whilst (a) portrays a particular version of an information-seeking interplay where the tutor asks questions to which she already knows the answers. In short, although information-giving/seeking and query dialogue elements are shared by both education and EQR interactions, inquiry and ‘quiz’ aspects pertain only to the former. The EQR protocol focuses more on the persuasion facet and combines (and handles) all of its features together without requiring variants.
Future work. We can envisage several different extensions of the research we have presented. For example, the fully-fledged EQR dialogue that we introduced here could be implemented via the EQRbot chatbot proposed by Castagna et al. [17,19,20]. In the cited works, we investigated instantiations of argument schemes as a way to deliver explanations to patients according to the treatment recommended by a clinical decision support system (the specific decision support system considered was the CONSULT system [4,31,46]). Such information delivery can be enhanced by a dialectical protocol purposely designed to strategically convey explanations such as the EQR dialogue. Following the EQRbot implementation, another potential direction to pursue would involve representing the arguments associated with each locution of the EQR protocol as structured (rather than abstract) entities. ASPIC+ [66] would be a suitable formalism to use for this, especially in its dialectical version D-ASPIC+ which accounts for resource-bounded agents [25]. Consider also that, in real-world dialogues, human agents do not always make fully formed arguments. Instead, they often make incomplete arguments, called enthymemes [43,93,94]. Incorporating enthymemes in the EQR dialogue protocol would thus generate a better approximation of the everyday exchange of arguments performed by real-world agents, and that would be a further possible direction for future work. In addition, all of the previous approaches could, and arguably should, be tested via specific user studies to evaluate the quality of the proposed explanations. Finally, it may also be interesting to provide a comparison of the EQR protocol and other state-of-the-art explanation mechanisms in the realm of LLMs, such as the Tree of Thoughts (ToT) method [95]. 
Devised as an advanced reasoning strategy that strengthens LLMs’ problem-solving ability by exploring and evaluating multiple thoughts (i.e., coherent language sequences), ToT may also improve the interpretability of a model’s decisions. This approach employs search algorithms and backtracking to probe all of an LLM’s thoughts. In comparison to ToT, EQR dialogues leverage AFs and computational argumentation semantics (given the equivalence proved in Theorem 1) in order to provide the best ‘argumentative path’ leading to explanations, thus resulting in a more intuitive and human-friendly approach. Notice also that such paths account for divergent information, therefore mimicking and (potentially) outperforming the recent CCoT (Contrastive Chain of Thought) prompting technique, which mostly handles only one contrastive explanation at a time [21].
Conclusion
Conceived as a novel approach to argumentation-based explanations that addresses XAI concerns, the Explanation–Question–Response dialogue introduced herein presents a fully-fledged protocol and a set of fundamental characteristics. These features include: (1.) a simple protocol that avoids meta-level locutions to manage the dialectical interplay whilst conveniently embedding multiple dialogue types. Compared to other dialogues that require a Control Layer, the simplicity of the EQR design favours its implementation. (2.) EQR exchanges of arguments result in interactions satisfying desirable properties of explanations (i.e., exhaustivity, selectivity, transfer of understanding and contextuality). Lastly, (3.) the information conveyed by a terminated EQR dialogue proves to be justified by a series of compelling reasons. Indeed, such an explanation turns out to be sound and complete with respect to the admissible semantics of Dung’s AFs: this equivalence thus allows us to evaluate the EQR dialogue moves using any proof theory, algorithmic procedure, or methodology semantically associated with computational argumentation. Future investigation will focus on ways of extending the introduced protocol and on testing our hypothesis that the EQR dialogue provides a valid instrument for advancing XAI to meet the recent challenges posed by the rise of LLMs.
Footnotes
Acknowledgements
This research was partially funded by the UK Engineering & Physical Sciences Research Council (EPSRC) under grant #EP/P010105/1.
Consider, indeed, the puzzling appearance of ‘emergent abilities’. Such an unpredictable phenomenon consists of specific competencies that occur only in large-scale models but not in smaller ones. Thus, it is not possible to anticipate the ‘emergence’ of these abilities by simply analysing smaller-scale models [ ]. From the perspective of developing transparent AI systems with reliable and predictable behaviour, this is certainly problematic.
Notice that the literature also presents ‘argument games’, another dialectical formalisation which may seem similar to dialogue games in many aspects [18, ]. Nonetheless, the former can be intuitively regarded as a dialogue that an agent performs ‘within itself’, whereas the latter is more suited to model a public conversation that can simultaneously engage multiple agents.
Similarly, Walton and Krabbe [ ] studied the interaction between multiple dialogues. Their analysis resulted in an informal classification of possible dialogue shifts: (a) ‘from one type to another’, a sequence composed of multiple kinds of dialogues; (b) ‘internal shifts’, which occur within the same dialogue type without normative changes; (c) ‘from one flavour to another’, where the transitions concern only flavours (i.e., “
An example of an analysis of the different pragmatic meanings of, say, ‘commands’ and ‘promises’ can be found in [58]. Furthermore, the work presented in [ ] introduces a specific syntax that accounts for the pragmatic uptake and revocation of utterances over actions.
The importance of providing a context is also emphasised in studies concerning robot failure explanations: it improves the resolution and identification of shortcomings and increases users’ trust [26, ].
Such that the explainer gives information and the explainee seeks information.
Observe that
Consider that a draw can be solved by an additional inquiry dialogue to adjust the preference ordering between players.
That is to say,
