Abstract
This work presents a requirement analysis for collaborative dialogues among medical experts and an inquiry dialogue game based on this analysis for incorporating explainability into multiagent system design. The game allows experts with different knowledge bases to collaboratively make recommendations while generating rich traces of the reasoning process through combining explanation-based illocutionary forces in an inquiry dialogue. The dialogue game was implemented as a prototype web-application and evaluated against the specification through a formative user study. The user study confirms that the dialogue game meets the needs for collaboration among medical experts. It also provides insights on the real-life value of dialogue-based communication tools for the medical community.
Introduction
As human society has become more digitally connected, it has developed an increased appreciation for interdisciplinary collaboration [23]. Healthcare is one such domain, with a long tradition of interdisciplinary collaboration amongst different medical experts [67]. Imagine a distributed health recommendation system where different experts come together to find the best possible diagnosis for a patient. These experts could be human agents, artificial agents or a combination of human and artificial agents. The goal of the collaboration would be to integrate multiple perspectives through knowledge transfer and conflict resolution in order to recommend the best possible diagnosis. Additionally, the system should offer explanations for its recommendation in order to build trust between the users and the system [44].
As an example scenario, consider the dialogue given in Table 1 between three medical experts represented by α, β and γ. The first column of Table 1 shows the identifier for the statement while the second column indicates the expert name followed by the statement they are making. The experts are participating in a semi-structured formal discussion similar to the ones that take place in oncology [43]. They are already aware of the objectives and format of the meeting before the dialogue starts so there is no build up on the objectives of the discussion as is the case in oncology meetings. We cover this in detail in Section 3. In this example, the meeting is not about a cancer diagnosis, but rather a more general diagnosis for a patient denoted by A. This is a contrived example, aimed at illustrating the conversation flow among healthcare experts who are participating in a semi-structured discussion to collaboratively make a recommendation for a patient. The goal of this example is not to provide a comprehensive medical discussion on the patient’s diagnosis as this is not the objective of this work. Moreover, note that the dialogue only shows the knowledge and reasoning of the experts to the extent that they choose to reveal through their conversation. It does not show the complete knowledge base or the reasoning process of the expert agents (who can be human or artificial).
Example dialogue between three medical experts α, β, γ
In this example, we assume that agent α loosely represents a clinical psychologist, agent β a general practitioner and agent γ an endocrinologist (a specialist in hormone-related disorders). Agent α first presents the facts of the case and offers their own diagnosis (depression). This is challenged by agent β. α then justifies their stance, which is rejected by γ. γ then proposes their own diagnosis. β asks γ to explain their diagnosis. Subsequently, both α and β accept γ’s explanation. Then α proposes some tests, to which β agrees and proposes an additional test. γ agrees and proposes yet another test. However, this suggestion is ignored by the other two, who propose to close the discussion. γ does not consent and asks the other agents to respond to the suggestion. They do respond and subsequently γ agrees to end the discussion. Throughout the discussion, the agents try to collaborate in a cooperative manner. We use this dialogue as a running example throughout the rest of the paper to illustrate our approach.
As a first step towards realising a hybrid human-artificial multiagent system capable of such a dialogue, we propose a novel interaction protocol between expert agents (whether human or artificial). We call this protocol the Experts’ Dialogue Game (EDG). Figure 1 visualises how EDG would fit into the pipeline for building such a system. First, real-life consultations among medical experts are formalised as a requirement specification. Building on this specification, EDG is defined as a dialogue game among the participants in a hybrid human-artificial multiagent system (MAS). The output of EDG is a recommendation from the system along with a sequence of explanations justifying the recommendation. These explanations can then be plugged into a system-user dialogue to justify the system’s recommendation to the user. The user in this case could be the patient or a physician in charge of the patient. While most of the current literature [3,17,36,65,76] focuses on explaining the working of a system to a human user through a system-user dialogue, EDG investigates how an explanation dialogue can be used within a multiagent system as part of agent reasoning.

Workflow diagram of an explainable multiagent recommendation system employing EDG.
Dialogue games are dialectical systems [22] in the tradition of informal logic [77] and are formally defined as verbal interactions between two or more players according to some pre-defined rules for the dialogue [39]. Each interaction is specified with the use of locutions which represent speech acts permitted in a given dialogue [62]. Dialogue games require that each participant maintains consistency across its statements, also called commitments, at any point in the dialogue [77]. Characteristics of these verbal interactions are typically defined in multiagent communication according to the popular typology introduced by Walton and Krabbe [77]. Dialogue games have a long tradition of being used to solve formal problems as well as to model natural language communication in real-life settings. In the first case, they have been employed to search for formal logical proofs in the tradition of Lorenzen [1] and leading to the field of dialogical logic, as well as in the prescriptive approach such as Hamblin’s system [22], to disallow logical fallacies during natural language argumentation. In the second case, they are used to study the communication dynamics in real life settings, giving rise to the descriptive approach to dialogue games [11,27,36,40,53,64]. This approach can also be used to inform computational models of interactions between agents in a multiagent system.
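To make the notions of locution, dialogue rule and commitment store concrete, the following minimal Python sketch models a generic dialogue game. It is an illustration only: the locution names (assert, question, retract), the reply rules in LEGAL_REPLIES, the player names and the encoding of negation with a leading "~" are all assumptions made for this example and are not the locutions or rules of EDG or of any system cited here.

```python
from dataclasses import dataclass, field

# Hypothetical reply rules: which locutions may follow the previous one
# (None marks the opening move of the dialogue).
LEGAL_REPLIES = {
    None: {"assert", "question"},
    "assert": {"assert", "question", "retract"},
    "question": {"assert"},
    "retract": {"assert", "question"},
}


def negate(proposition: str) -> str:
    """Toy negation: '~p' is the negation of 'p' and vice versa."""
    return proposition[1:] if proposition.startswith("~") else "~" + proposition


@dataclass
class DialogueGame:
    players: list
    commitments: dict = field(default_factory=dict)  # player -> set of propositions
    trace: list = field(default_factory=list)        # ordered (player, locution, content)

    def __post_init__(self):
        for player in self.players:
            self.commitments.setdefault(player, set())

    def move(self, player, locution, content):
        """Apply one move, enforcing the reply rules and commitment consistency."""
        previous = self.trace[-1][1] if self.trace else None
        if locution not in LEGAL_REPLIES[previous]:
            raise ValueError(f"'{locution}' is not a legal reply to '{previous}'")
        store = self.commitments[player]
        if locution == "assert":
            # A player may not commit to both a proposition and its negation.
            if negate(content) in store:
                raise ValueError(f"{player} is already committed to {negate(content)}")
            store.add(content)
        elif locution == "retract":
            store.discard(content)
        self.trace.append((player, locution, content))


game = DialogueGame(["alpha", "beta"])
game.move("alpha", "assert", "depression")
game.move("beta", "question", "depression")
game.move("alpha", "assert", "low_mood")
```

The trace records every move in order, which is the kind of record that a later explanation could be read from, while the commitment stores hold what each player currently stands by.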
The goal of EDG is for the participants to collaboratively find the best recommendation through exchange of knowledge and mutual agreement. In Walton and Krabbe’s typology, this scenario fits an inquiry type of dialogue in which the initial situation is a need to have proof (in our case, to find the best recommendation), the participants’ goal is to find and verify evidence (i.e. to consult an observation), and the goal of the dialogue is to prove or disprove a hypothesis (i.e. to argue for or against a recommendation). This is in contrast to a deliberation dialogue where the initial situation is a need for action, the participants’ goal is to influence the outcome and the goal of the dialogue is to reach a decision on the best possible action [77]. In order to generate explanations of the recommendations, we define agents’ communicative behaviour in the dialogue through explanation-based illocutionary forces which can then be traced back and retrieved in response to a query and presented to the user as an explanation of the recommendation: (1) explanation requests such as
The contribution of this work is threefold. First, we present a requirement specification for collaboration between experts. While the requirement specification is grounded in the medical domain, it focuses on the general communication issues that can come up during expert collaborations. Hence, it can be abstracted to consultations among experts in general. Next, inspired by the tradition of dialogue embedding [77], we combine explanation-based illocutionary forces in an inquiry dialogue to generate richer traces for the inquiry process than what is possible with the assert locution, typically used in inquiry dialogues. Moreover, the dialogue game is unique in the sense that it meets the requirement specification from the domain experts. While the combination of different dialogue types is not groundbreaking formally, the proposal makes an important methodological step towards applying such formal dialogue systems in real-life domains in a way that is empirically grounded in users’ requirements. Furthermore, to the best of our knowledge, no other descriptive dialogue game has focused on interactions among experts. The rich game traces allow the system to be transparent and can be used to explain the recommendations of the system to a human user through human-machine interfaces such as verbal and visual interaction. Finally, we evaluate the dialogue game against the requirement specification and verify the evaluation through a formative user study. Thus, we introduce the methodology of user-centred software engineering practice to dialogue games. While other works [11,27,36,40,53,64] in descriptive dialogue games have used insights from their domain of interest to inform the dialogue games, none of them have provided a requirement specification as far as we are aware.
The rest of this work is structured as follows. Section 2 summarises related work. Section 3 presents the requirement specification for expert collaborations. Section 4 formally presents the Experts’ Dialogue Game (EDG) while Section 5 provides implementation details of a web-based platform implementing the EDG. In Section 6, EDG is evaluated against the requirement specification presented in Section 3. Section 7 highlights user perspectives on EDG and the platform described in Section 5. Finally, Section 8 concludes the paper and provides directions for future work.
This section summarises related work on explanatory and inquiry dialogues, dialogue games in multiagent systems for healthcare, argumentation in healthcare and communication tools for multidisciplinary collaborations in healthcare.
Amgoud et al. [2] present an argumentation system for resolving inconsistencies in an agent’s knowledge base. Subsequently, they show how dialogue game theory can be applied on top of this to realise the different dialogue types in Walton and Krabbe’s typology [77]. A minimal framework for an explanatory dialogue system is presented in Walton [76]. The goal of the dialogue is for an explainer (an entity that explains, usually the system) to fill in the gaps in the knowledge base of the explainee (the target of the explanation, usually a human user) by informing them why something happened. This is considered as a transfer of understanding from the explainer to the explainee. Building on Walton’s minimal framework, Arioua et al. [3] combine an explanatory dialogue with argumentative illocutions. The explanatory illocutionary force is used by the system to explain the behaviour of some phenomena to the user while the argumentative force helps to resolve inconsistency in the knowledge bases of participants. In addition to commitment stores for each participant, they introduce an understanding store for the user which stores the missing links in understanding rather than what is currently understood. The discharging of all issues in the understanding store confirms a successful transfer of understanding.
Madumal et al. [36] analyse human-human and human-agent explanatory dialogues from various domains and propose an explanatory dialogue protocol based on induction. Their protocol also combines an explanatory dialogue with argumentative faculty. The goal of the dialogue is the same as proposed by Walton [76]. They use double acknowledgement for confirming understanding, that is, acknowledgement by the explainee on being satisfied with the explanation and explainer’s return acknowledgement. Dennis et al. present an explanatory dialogue game for explaining the behaviour of a Belief–Desire–Intention System [17]. Their protocol, like EDG, is grounded in dialogue game theory from informal logic using argumentation schemes and critical questions rather than argumentation theoretic dialogue [75]. The goal of the dialogue in this case is to understand system behaviour through comparing traces of the system from different participants. Similar to these works, we use explanation-based illocutionary forces to fill in missing gaps in the knowledge bases of participants.
Ilia et al. [65] extend an information seeking dialogue with explanatory illocutionary forces. The goal of the dialogue is to offer factual and counterfactual explanations of a classifier’s output to a human user. They also evaluate the protocol through a user study and process analytics. Prakken and Ratsma [54] propose a case-based explanation dialogue to explain the outcome of a linear binary classifier. The goal of the dialogue is to provide a model-agnostic local explanation. The explanations in this case do not try to explain the system reasoning but rather to come up with reasons that justify the system result. The dialogue game starts with the proponent of the dialogue presenting a similar example from the training set with the same outcome as the current instance. The opponent can argue against this using two strategies. The first is to use counterexamples from the training set. The second is to highlight the differences between the current instance and the instance being presented as a justification by the proponent. A successful explanation amounts to a winning strategy for the proponent. However, unlike these approaches, we incorporate the explanatory illocutionary forces in the reasoning process itself rather than using them only to explain why the system behaved in a certain way. While these works target a transfer of understanding from the system to a human user, EDG involves explanations amongst the reasoning agents.
Although not as common as persuasion dialogues, a few other works have explored inquiry dialogues as standalone dialogues or in combination with other dialogue types. Bex et al. [7] combine an inquiry dialogue with a persuasion dialogue for discussions between criminal investigators. The goal of the dialogue is to come up with the most robust explanation. They assume an adversarial setting in which each agent advocates for its own preferred explanation. Unlike this work, EDG assumes a cooperative setting where the main goal is to come up not only with the most robust explanations but with decisions as well. Black and Atkinson propose a framework that embeds an inquiry dialogue over beliefs with a persuasion dialogue over actions. The inquiry dialogue allows the participants to collaboratively decide what to believe whereas the goal of the persuasion dialogue is to collaboratively decide what is the best action to take in order to reach the proponent’s goal. Once all the arguments have been given, it is up to the proponent of the dialogue to make the final decision based on their personal preferences [8].
Black and Hunter [9,10] propose two types of inquiry dialogues, which they call argument inquiry and warrant inquiry. The goal of an argument inquiry dialogue [9] is for two agents to jointly construct an argument for a claim by sharing relevant beliefs. The protocol has three moves and allows nesting of argument inquiry dialogues. In addition to a conventional commitment store, they also introduce a question store which keeps track of the premises that need to be proven in order to prove the claim representing the dialogue topic. They prove soundness and completeness for their protocol. The goal of a warrant inquiry dialogue [10] is for two agents to share arguments to jointly construct a dialectical tree in order to determine the acceptability of a particular argument. The main difference between the two types of dialogues is that argument inquiry is not concerned with the acceptability of the argument constructed while warrant inquiry is. A warrant inquiry dialogue allows embedding an argument inquiry dialogue and also involves a question store like the former. They also provide a strategy for selecting the next move for a participant for both types of dialogues. However, to the best of our knowledge, none of these works combine an explanatory dialogue with an inquiry dialogue to make agent reasoning explainable.
Other works have incorporated argumentation and dialogue games in multiagent systems to provide clinical decision support in a distributed environment such as cancer diagnosis and management. Huang, Jennings and Fox [24,25] present a multiagent architecture for medical decision support in an interdisciplinary setting. The architecture has four components: a three-layered knowledge base, a centralised working memory, a communication manager and a human-computer interface. The architecture also covers decision making under uncertainty, task management and agent cooperation. The communication manager works with communication primitives or locutions and a communication protocol. However, the locutions in this case are geared towards managing the tasks in a distributed environment rather than a discussion amongst different experts. For example, the locutions request, accept, reject and alter are used in the task allocation stage. The locution inform is used to report on the allocated task while the locution propose is used to recommend a treatment plan in response to a query. In contrast, EDG is focused on facilitating the discussion amongst the experts rather than a distributed management of responsibilities.
Beveridge and Fox [6] use a dialogue game as an interface between the underlying task structure and ontological knowledge and the spoken dialogue generation system. They implement their approach to provide clinical decision support for breast cancer diagnosis. They use several locutions as initiating locutions. For example, inform is used to present new information, instruct is used to request an action from the user,
Vasileiou et al. [73] present an argumentation-based justification dialogue between two participants. The explainee is a human user who wants to understand the reasoning of the explainer (an artificial intelligent agent). The dialogue game has four locutions, two of which are reserved for each of the participants. The game only allows a single locution per turn. They evaluate their dialogue game through a user study and discuss its properties. Rago et al. [56] present the notion of explanatory dialogue between two participants as an Argument eXchange (AX). They discuss desirable properties of AX for agents equipped with quantitative bipolar argumentation frameworks and gradual semantics. An interactive clinical decision support system, called CONSULT, is proposed in [29,59,60] for patients with multimorbidity to self-manage their treatment. The system integrates four types of data sources: the patient’s electronic health record, data from sensors monitoring the patient’s symptoms, the clinician’s input and treatment guidelines. It uses computational argumentation to aggregate the data and resolve inconsistencies in the data sources. It also provides an argumentation-based dialogue interface for system-patient interaction to interactively deliver the recommendations to the user. The system-patient interaction uses templates for its text-based natural language interface. It is based on three different argumentation schemes and their associated critical questions. These cover deliberation, persuasion and explanation dialogues. Castagna et al. [13] propose an explanation dialogue between two participants that also interfaces with the CONSULT system through a chatbot. They propose an argument scheme based on practical reasoning and use it for the explanatory dialogue. Shaheen et al. [63] propose an explanatory dialogue between two participants to explain the recommended treatment plan for multimorbid patients.
All of these works focus on explanation dialogues between two participants, mainly a system as an explainer and a human as an explainee. In contrast to these works, EDG aims to use different types of explanatory dialogue forces to generate richer traces during inter-agent reasoning processes.
Pancho et al. [41,71] propose an argumentation-based deliberation dialogue between two agents to discuss the viability of transplant organs. The dialogue is implemented as part of Carrel+, a health information system to manage organ transplants in Spain. The dialogue model, called ProCLAIM [70], is based on argument schemes and case-based reasoning. ProCLAIM employs a mediator agent to guide the participant agents on their legal moves, decide the validity of submitted arguments and finalise the recommendation regarding the viability of the proposed transplant organ. The mediator agent uses argument schemes, existing guidelines, case-based reasoning and a component manager to manage the strengths of submitted arguments. It applies abstract argumentation semantics [18] to decide the winning argument.
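As an illustration of how an abstract argumentation semantics can select winning arguments, the following minimal Python sketch computes the grounded extension of an abstract argumentation framework by iterating its characteristic function from the empty set. This is a generic textbook construction and not ProCLAIM's implementation; which semantics ProCLAIM applies is not stated here, and the argument names in the example are hypothetical.

```python
def grounded_extension(arguments, attacks):
    """Return the grounded extension of the framework (arguments, attacks).

    `attacks` is a set of (attacker, target) pairs. The result is the least
    fixed point of the characteristic function, reached by iterating from
    the empty set (this terminates for finite frameworks).
    """
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    extension = set()
    while True:
        # An argument is acceptable with respect to the current extension if
        # every one of its attackers is attacked by some member of the extension.
        defended = {
            a for a in arguments
            if all(any((d, b) in attacks for d in extension) for b in attackers[a])
        }
        if defended == extension:
            return extension
        extension = defended


# Hypothetical example: an organ is argued to be viable, a contraindication
# attacks that claim, and an exception attacks the contraindication.
arguments = {"viable", "contraindication", "exception"}
attacks = {("contraindication", "viable"), ("exception", "contraindication")}
winners = grounded_extension(arguments, attacks)
```

In this example the unattacked exception is accepted first, it defeats the contraindication, and the viability argument is thereby reinstated.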
Xiao et al. [81] present a group decision description language and a consensus protocol for a multiagent system. However, unlike the work presented here, the protocol is not based on speech act theory [62]; it is an agent communication protocol that uses functions like averaging and intersection to generate consensus values. Patkar et al. [49] developed a clinical decision support tool, called MATE (Multidisciplinary meeting Assistant and Treatment sElector), to support multidisciplinary cancer conferences. The tool is responsible for information management and providing treatment recommendations after processing the data. It does not present a dialogue game to support multidisciplinary discussion amongst experts as is presented here. A comprehensive survey describing the use of computational argumentation for explainable artificial intelligence can be found in Vassiliades et al. [74]. Some other works have proposed computational argumentation systems for clinical decision support [15,20]. [] present a negotiation protocol for agents in a Belief Desire Intention (BDI) architecture. However, the protocol is grounded in agent communication language rather than dialogue game theory.
The term Health Information Technology (HIT) refers to the application of information technology to facilitate healthcare. HIT systems fall on a wide spectrum ranging from administrative support, patient information management and retrieval, communication and decision support [14]. Carayon et al. [12] point out that most of the existing HIT systems are focused towards individual tasks rather than teams, even as team-based care is becoming a popular paradigm. Here we review representative HIT applications targeting multidisciplinary communication and collaboration support.
Care Connector [46,68,69] is a communication and collaboration platform implemented in a community hospital in Canada. It is a web application that integrates into the HIT of the hospital to retrieve and update electronic records. The application covers both care planning and monitoring modules in addition to a messaging module between multidisciplinary care providers. The patient information is stored as part of a Care Planner module. The messaging module provides asynchronous communication of non-urgent messages using the information in the Care Planner as a shared knowledge base. The messaging module allows linking each conversation to a patient. It advises participants to post messages following the Situation, Background, Assessment and Recommendation (SBAR) framework which is used in healthcare communication. However, it does not force the participants to frame their messages according to this framework.
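The difference between advising and enforcing a message structure can be sketched as follows. This is not Care Connector's implementation: the function name post_message, the message fields and the in-memory thread are hypothetical, and only the four SBAR section names come from the framework itself. The sketch accepts every message and merely reports which SBAR sections were left empty.

```python
SBAR_FIELDS = ("situation", "background", "assessment", "recommendation")


def post_message(thread, patient_id, author, **sections):
    """Append a message to `thread` and return the SBAR sections left empty.

    The message is always accepted; the returned list can be shown to the
    author as a reminder, which mirrors advising rather than enforcing SBAR.
    """
    missing = [name for name in SBAR_FIELDS if not sections.get(name, "").strip()]
    message = {"patient_id": patient_id, "author": author}
    # Store every SBAR section, using an empty string where none was given.
    message.update({name: sections.get(name, "") for name in SBAR_FIELDS})
    thread.append(message)
    return missing


thread = []
reminders = post_message(
    thread,
    patient_id="A",
    author="beta",
    situation="Persistent fatigue and low mood.",
    assessment="Symptoms may have a physical cause.",
)
```

An enforcing variant would instead reject the message whenever the returned list is non-empty, which is closer in spirit to constraining messages through an underlying dialogue protocol.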
Kurahashi et al. [30] present another communication and collaboration tool called Loop. It allows multidisciplinary collaboration between teams. The teams can include healthcare professionals, caregivers and the patient. Each conversation loop is centred around a patient. The application includes a card with patient information on the left hand side while the right hand side has the messaging thread. The information exchange is secure and sharing information between different subgroups is allowed. For example, messages can be exchanged among professionals only or with all the participants in the loop. It also allows tagging messages with user-defined labels to facilitate later search. The labels represent different themes or issues described in the message.
The work in [32] implements a platform, called the one-stop platform, for multidisciplinary collaboration among healthcare professionals in a Taiwanese hospital. The platform integrates into the existing HIT system of the hospital. It covers administrative and planning aspects in addition to a messaging module. The messaging module provides transparency and accountability for message posting and viewing. It supports exchanging text, audio and video messages. However, there is no specification on the format of the content that is exchanged.
Shared Care Platform (SCP) [38] is yet another collaboration tool implemented in a hospital in Spain that builds on social networking and open source tools. The tool is aimed at helping healthcare professionals manage patients with multimorbidity. It has two components: a social networking component, called the Clinical Wall, and a decision support component. The Clinical Wall provides social networking-like collaboration and communication support amongst healthcare professionals. It is integrated into the electronic health records of the patient. The record has an assessment section, a discussion section and a conclusion section. The assessment section includes information on patient history and assessments. The conversation starts with one clinician posing a question to others. In the discussion section the clinicians exchange messages to arrive at an agreement with regard to the question. In the conclusion section all participants need to sign off on the agreed decision. During the discussion any clinician can be added to the conversation to invite their feedback. The decision support component uses the Clinical Wall and provides clinical guidelines in the form of rules. Other works [42,45,80,82] implement mobile applications to support care and communication amongst healthcare professionals, caregivers and patients. These applications mainly support administrative and information management tasks with simple messaging support for communication.
All these platforms and applications provide secure messaging amongst participants and well integrated interfaces for the existing HIT systems in place. In contrast, the prototype implementation provided for EDG does not provide any of these features since the goal in this work was to evaluate the underlying protocol rather than present a full-fledged web application. However, none of the existing applications provide support for framing the content and type of messages with an underlying dialogue protocol as is proposed in this work. The EDG platform therefore offers a novel approach to supporting collaborative communication amongst healthcare experts, namely one based on an underlying dialogue protocol.
Requirements for expert collaboration
In this section, we propose a requirement specification for successful consultations among experts. Consultations among experts are common in the professional world, especially when critical decisions are concerned such as in medicine, aviation and engineering. We focus on consultations among medical experts as our domain of choice in order to develop a dialogue protocol for consultations among expert agents. This is because it is easier to abstract away from domain specific terminology in this case in order to understand the interaction dynamics. However, the requirement specification we present is abstract enough to be applied to consultations among experts in general since it avoids domain specific scenarios and terminology.
In order to understand collaboration scenarios between experts, we held informal discussions with some medical experts (specifically a gynaecologist, a radiologist, a general physician and a dentist). We identified two main scenarios: informal consultations, such as during hand-off of a patient from an emergency room to a general or specialist ward, and formal consultations, which usually take the form of multidisciplinary cancer conferences or case conferences for short. These conferences are structured discussions between different specialists to finalise diagnosis and treatment options for cancer patients [78]. The conference is attended by specialists from multiple fields such as surgery, oncology and pathology. It starts off with the specialist in charge presenting each case history to the panel of experts. This is followed by a discussion amongst experts as to the best possible diagnosis and treatment options for each patient [21]. During the discussion, knowledge transfer between experts takes place. This happens through the explanatory, inquisitive and cooperative tone of the dialogue. Because of its explanatory value, many specialists consider it to have educational value for trainees [21]. We chose the case conference as our main use case to inform the protocol because of its formal and structured format. Moreover, some of the general communication issues [33,66,72] during informal hand-off also come up in case conferences. Subsequently, we reviewed medical literature on communication in case conferences to identify possible issues. We also included some works on general communication issues during informal collaboration between medical experts. These works were included since they were general enough to be understood by non-medical audiences.
Sutcliffe et al. [66] identify two types of communication failures in the medical profession: systematic and individual. While systematic failures result from a lack of sufficient organisation, individual failures have complex roots such as hierarchical and power dynamics and excessive workload. They suggest establishing communication guidelines to minimise both types of failures. In order to mitigate these failures, we develop an interaction protocol grounded in the interaction dynamics during case conferences [16] as well as general communication dynamics [66] that can come up during informal collaboration between medical experts as a result of organisational subculture [66]. Formally, we used Scopus, PubMed and Google Scholar to look for papers from the medical community that identify communication issues in cancer conferences and in general. We used the keywords ‘communication multidisciplinary cancer conference’, ‘multidisciplinary cancer conference’, ‘communication failure medical experts’ and ‘tumour board decision making’. All open-access, English language articles between 2001 and June 2021 related to medicine were considered. Amongst these, manual filtering was done to narrow down results to works involving reflections on communication issues amongst medical experts in cancer conferences and in general. The included articles reported reflections from user studies [16,31,33,34,55,61,66,72], surveys [35,57,78,79] or best practices [43,55,78] followed by professionals. Works related to communication between patients and healthcare professionals were excluded, as were works that focus on the diagnostic recommendations for different medical conditions. The articles included in the study are given in Table 2. We stopped our search for articles when the same ideas started to recur in different articles and we felt confident that new articles were not adding any new perspectives.
Requirements for effective communication between experts according to medical literature
We identified fourteen basic requirements for consultations among medical experts, which are presented in Table 2. These cover both the systematic and individual needs for effective consultations between experts. A requirement was considered as inferred from a publication if it was explicitly or implicitly mentioned as a standard practice, a desired outcome or as a lack thereof. All the best practices, guidelines and reflections in the papers were taken into account, grouped together and summarised. This systematic collection of data from the domain literature is treated as providing validation of the requirements, which are then further evaluated in the user study on requirements embedded in the dialogue protocol (see Section 6). These requirements adhere to the structural guidelines and best practices for case conferences while at the same time addressing the communication issues that come up during informal consultations. They are abstract enough to be applicable in a formal collaboration setting between experts in different domains. They can be seen as sub-goals that facilitate the collaboration in order for it to be productive.
The requirements were then categorised into four classes depending on the mechanism through which they can be satisfied. Table 2 lists the requirements according to their proposed categorisation. Each row presents the requirement id, description, number of papers in the literature that mentioned this requirement and references to the corresponding works. Each of the categories and their corresponding requirements are described next.
This category represents communication requirements that are directly related to the dialogue participant. This category has only one requirement, labelled as
Cooperation oriented requirements
The cooperation oriented requirements, with identifiers RC1–RC4, stress different aspects of cooperation during the collaborative dialogue.
Protocol oriented requirements
Protocol oriented requirements cover communication and logistic aspects that should be handled at the protocol definition level. Six such requirements were identified. These are given identifiers RP1–RP6. RP1 ensures that patient history (or observations pertaining to the issue at hand in case of non-medical domains) is explicitly stated during the dialogue so that any faulty assumptions can be countered. RP2 brings critical concerns of the participants to the forefront of the collaboration process. By doing this, it ensures that the participants reflect on these issues. Explanations and clarifications can be useful tools for transfer of understanding amongst the participants. This can promote cooperation and help to align the knowledge and thinking of the participants. Hence, RP3 formalises this need and makes it part of the dialogue. Similarly, RP4 ensures that the protocol design incorporates a conflict resolution mechanism. Finally, RP5 and RP6 mitigate against possible power dynamics resulting from the organisational structure that might influence the dialogue participants. RP5 ensures that the protocol design incorporates inclusiveness while RP6 incorporates equality into the design.
Implementation oriented requirements
Implementation oriented requirements express logistic concerns that can only be addressed at the implementation level. There are three such requirements which are given identifiers from RI1–RI3. Since a collaborative dialogue between more than two experts can entail administrative overhead, most studies [31,34,79] found that having a designated role to oversee this greatly improves the collaboration process. Hence, this is captured by RI1. RI2 captures the necessity of recording the dialogue history so that it can be referred back to at a later time if required. Finally, since expert collaborations generally cover confidential topics and data, RI3 ensures that any confidential information exchanged during the collaboration is protected.
We consider all the requirements to form a core part of discussions, although some come up more frequently in the literature than others. For example, requirements RC4, RP2, RI1, RC36 and RC2 are mentioned more frequently, while others such as RP6 and RI3 are not. Nevertheless, these represent fundamental aspects of these exchanges.
A formal dialogue system for expert collaboration
This section formally presents the Experts’ Dialogue Game (EDG) which embeds explanation-based illocutionary forces in an inquiry dialogue in order to emulate the inquisitive, explanatory and cooperative aspects of real-life consultations. This is done so that the dialogue game can generate richer reasoning traces and meet the needs of successful collaboration amongst experts. An Inquiry Dialogue is defined in Walton and Krabbe’s popular typology of dialogue types [77] as a collaborative discussion amongst participants to find out the answer to one or more questions when none of them is presumed to know the correct answer beforehand. Later, Walton [76] introduces an Explanatory Dialogue as a discussion between two or more participants in order to bring about a transfer of understanding from one to another. In this case, the participants already agree on the topic but differ in their understanding of it.
A Dialogue Game (DG) is a tuple
Each of these elements is described next.
The game requires two or more participating agents, each representing an expert in some area, belonging to the set
Initial knowledge bases of agents α, β, γ for the example dialogue
Each agent has its own private knowledge base, represented as Let Let Let Let Let Let Let
Locutions,
Each locution, represented by the letter τ, is of the form
Combination rules for consultation between two expert agents
L1 (informational locutions). There are five locutions in this subset: observation, verdict, advise, concern and assert. These are labelled L1.1–L1.5 respectively. In the commencement stage, L1.1–L1.4 are used to set the context of the dialogue. In the progress stage, all five can be used to introduce new knowledge into the conversation. While L1.1 to L1.4 are used to introduce facts pertaining to a specified topic, L1.5 (assert) is used for introducing inference rules that relate the content of the first four locutions to each other. No distinction is made between strict and defeasible facts and rules.
L2 (requests). We use a simplified version of the typology for different explanation requests (and replies) presented by [11]. This is because while they meet the conversational needs for a specific scenario in the financial domain, our protocol targets a general consultation setting between experts without going into domain specific details. Consequently, three types of requests are included. A request for explanation, represented by
L3 (replies). This subset has five locutions: explain, justify, clarify, agree and retract, which are given identifiers L3.1–L3.5 respectively. The first three are locutions for answering the corresponding wh-requests from the L2 subset, while the last two cover other possible answers, namely agreeing and retracting. The protocol assumes that the agents are always able to clarify and explain, but not always able to justify, in which case they retract.
L4 (management locutions). This subset defines three locutions, prompt, end and pass, which are given identifiers L4.1–L4.3 respectively. These manage the dialogue in different ways. prompt serves two purposes: it allows the speaker to indicate to the other participants that they are awaiting a response on a particular locution, and it can also be used during the termination stage to justify why a participant has declined to end the dialogue. end indicates an acknowledgement by a participant that they are satisfied with the dialogue outcome, thus giving their consent to end the dialogue. If they have an outstanding issue, they can refuse to give their consent. In this case, they are invited to justify this by using prompt to let the other participants know which of their statements have not yet received a response. Finally, since the dialogue game allows participants to make multiple moves, pass is used to manage turn-taking. Whenever a participant has finished what they wanted to say in their turn (they are allowed to use multiple locutions), they signal the end of their turn by using pass. Thus, in the case of more than two agents, the protocol allows everyone to participate in the explanatory dialogue, since the dialogue game allows multiple responses to each locution (a detailed description is provided in Section 4.4, where the dialogue rules are introduced). This means that, in response to locutions from subset L2, agents who were not directly addressed in the preceding wh-request can choose to participate in the information exchange by making an appropriate move.
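The four locution subsets described above can be summarised as a small data structure. The following Python sketch is illustrative only and is not part of the EDG prototype. The identifiers for L1, L3 and L4 follow the text; the names and the identifiers L2.1–L2.3 for the three request types, as well as the helper names `subset` and `MATCHING_REPLY`, are assumptions made for this example.

```python
from enum import Enum


class Locution(Enum):
    """Locutions of the dialogue game, keyed by their identifiers."""
    # L1: informational locutions (identifiers as given in the text)
    OBSERVATION = "L1.1"
    VERDICT = "L1.2"
    ADVISE = "L1.3"
    CONCERN = "L1.4"
    ASSERT = "L1.5"
    # L2: requests (names and numbering are assumptions for this sketch)
    REQUEST_EXPLANATION = "L2.1"
    REQUEST_JUSTIFICATION = "L2.2"
    REQUEST_CLARIFICATION = "L2.3"
    # L3: replies (identifiers as given in the text)
    EXPLAIN = "L3.1"
    JUSTIFY = "L3.2"
    CLARIFY = "L3.3"
    AGREE = "L3.4"
    RETRACT = "L3.5"
    # L4: management locutions (identifiers as given in the text)
    PROMPT = "L4.1"
    END = "L4.2"
    PASS = "L4.3"


def subset(locution: Locution) -> str:
    """Return the subset label (L1, L2, L3 or L4) of a locution."""
    return locution.value.split(".")[0]


# Reply that directly answers each request type. The full set of
# permitted responses is defined by the combination rules (Table 4),
# which this sketch does not reproduce.
MATCHING_REPLY = {
    Locution.REQUEST_EXPLANATION: Locution.EXPLAIN,
    Locution.REQUEST_JUSTIFICATION: Locution.JUSTIFY,
    Locution.REQUEST_CLARIFICATION: Locution.CLARIFY,
}
```

Keeping the identifier inside the enum value means that the subset of a locution can be derived from the identifier itself rather than stored separately.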
The game has three stages: an opening stage governed by Commencement Rules, a progress stage governed by Combination Rules and a termination stage described by Termination Rules [52]. Each of these is described next, followed by commitment, turn-taking and politeness rules.
Commencement rules. The topic of the dialogue game can be one or more subsets of O. The game always starts with the initiator agent presenting the facts of the case (observations), its own conclusions (verdicts), corresponding recommendations (advice) and any critical points (concerns) it deems important. Thus, the first turn is composed of the first four locutions from the locution subset L1. A move, represented by μ, is a tuple
Combination rules. The protocol allows participants to start new threads in the conversation at any time. This is achieved by allowing one or more locutions in the same turn where each locution corresponds to a move. For a move which uses the
Termination rules. The dialogue terminates when all the participants agree to end it. Any participant can start the process of getting consent from the others to end the dialogue. They can do this by using the end locution. This signals the start of the termination stage. Since the participants are assumed to be assertive and cooperative, anything that the participants do not explicitly challenge is taken as an agreement. Hence, when the dialogue ends, all the participants are assumed to have agreed on all the elements of set O under discussion. However, a vote for termination may not always succeed, since any participant can refuse to give their consent. They are then invited to highlight any outstanding issues they have by using the prompt locution as explained in Section 4.3. If this happens, the dialogue moves back into the progress stage. Otherwise, they give their consent to end the dialogue (and to accept all the statements that went unchallenged by them) by using the end locution. A move which uses the end locution also has
Commitment rules. Dialogue games generally require that each participant have their commitments publicly available in the form of a Commitment Store. The commitments are created as a result of particular speech acts and they ensure accountability for the participants. This is useful for making the dialogue coherent and productive. We follow Hamblin’s notion of commitment stores as done by [48] where an agent’s commitment store,
C1: For locutions subset L1 and L3.1 to L3.3, the content of the locution is added only to the individual commitment store of the speaker,
C2: For locution L3.4, the content of the locution is added to the individual commitment store of the speaker agent and also to
C3: For a locution
C4: For all locutions belonging to L4, no changes are made to either the individual commitment store of the speaker or
C5: For L3.5, which represents a retract, the content of the locution is removed from the commitment store of the speaker and from
C6: After the dialogue terminates, the union of all individual commitment stores minus the conflicts is added to
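The effect of the commitment rules on the stores can be illustrated with a short Python sketch. It is a simplified reading of the rules above and not the authors' implementation. Several points are assumptions made for the example: the second store affected by C2, C5 and C6 is taken to be the collective agreement store, contents are plain strings, rule C3 is not modelled, and conflict detection is supplied by the caller through the hypothetical `conflicts` predicate.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Stores:
    """Individual commitment stores plus one collective store (assumed)."""
    individual: Dict[str, Set[str]] = field(default_factory=dict)
    collective: Set[str] = field(default_factory=set)

    def update(self, speaker: str, locution: str, content: str) -> None:
        """Apply rules C1, C2, C4 and C5 for one locution."""
        own = self.individual.setdefault(speaker, set())
        if locution.startswith("L1.") or locution in {"L3.1", "L3.2", "L3.3"}:
            own.add(content)                  # C1: speaker's store only
        elif locution == "L3.4":
            own.add(content)                  # C2: agree updates both stores
            self.collective.add(content)
        elif locution == "L3.5":
            own.discard(content)              # C5: retract removes from both
            self.collective.discard(content)
        # C4: L4 locutions change nothing.
        # C3 is not modelled here, so other locutions are left as no-ops.

    def close(self, conflicts: Callable[[str, Set[str]], bool]) -> None:
        """C6: pool the individual stores, drop conflicts, add the rest."""
        pooled: Set[str] = set().union(*self.individual.values())
        self.collective |= {c for c in pooled if not conflicts(c, pooled)}
```

A caller would invoke `update` once per locution moved and `close` once after the dialogue terminates, passing whatever conflict test suits the content language in use.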
Example of a dialogue game between three medical expert agents,
1 Subscripts of locutions are not mentioned for clarity.
Table 5 exemplifies the commencement, combination, commitment and termination rules for the running example. The first column of Table 5 indicates the turn identifier, the second column lists the identifier of the agent making the move, the third column identifies the locutions moved, the fourth column shows the changes in the commitment store of the speaker and the last column indicates the changes in the multilateral agreement store. In

State transitions between commencement, progress and termination states for the example dialogue in Table 5.
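The transitions between the three stages can also be written as a small state machine. The Python sketch below follows the commencement and termination rules as described in the text, with some assumptions made for illustration: the commencement stage is taken to end when the initiator plays pass, locutions are identified by their plain names, and an extra `CLOSED` state marks the point at which every participant has consented. The class name `StageTracker` is hypothetical.

```python
from enum import Enum, auto
from typing import Iterable, Set


class Stage(Enum):
    COMMENCEMENT = auto()
    PROGRESS = auto()
    TERMINATION = auto()
    CLOSED = auto()  # added for this sketch: all participants consented


class StageTracker:
    """Tracks the dialogue stage as locutions are moved."""

    def __init__(self, participants: Iterable[str]) -> None:
        self.participants: Set[str] = set(participants)
        self.stage = Stage.COMMENCEMENT
        self.consented: Set[str] = set()

    def on_locution(self, speaker: str, locution: str) -> Stage:
        if self.stage is Stage.CLOSED:
            raise ValueError("the dialogue has already terminated")
        if self.stage is Stage.COMMENCEMENT:
            # Assumption: the initiator's pass closes the first turn.
            if locution == "pass":
                self.stage = Stage.PROGRESS
        elif locution == "end":
            # Any participant may open (or join) the vote to terminate.
            self.stage = Stage.TERMINATION
            self.consented.add(speaker)
            if self.consented == self.participants:
                self.stage = Stage.CLOSED
        elif self.stage is Stage.TERMINATION and locution == "prompt":
            # A refusal with an outstanding issue reopens the discussion.
            self.consented.clear()
            self.stage = Stage.PROGRESS
        return self.stage
```

In this reading, a failed vote discards the consents collected so far, so that a later vote has to gather consent from every participant again. The source does not state whether earlier consents persist, so this is a design choice of the sketch.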
EDG promotes making justifications, explanations and clarifications explicit in the discussion by not allowing
Alternative ending for the running example. α, β, γ
Dialogue game between three medical expert agents for the alternative ending in Table 6 (see footnote 2)
2 Subscripts of locutions are not mentioned for clarity.
The collective agreement store serves as the output of the multi-agent system. It allows the most relevant knowledge for the decision making to be pooled together in a systematic way which is more computationally efficient than pooling all the knowledge bases of the agents. In the process, it also preserves the privacy of the agents since only publicly shared information is used. This approach allows for the building of a modular explainable multiagent system in which the multiagent decisions can be made independently of the human-machine interface. For example, it can be used to provide justified decisions made by expert agents to a user using another explanatory protocol for human-machine interaction such as the one proposed by Ilia et al. [65]. In this case, the collective agreement store can serve as the interface between the two modules of the explainable Artificial Intelligence (XAI) system.
EDG relies on the locutions
Promoting elicitation of justifications, explanations and clarifications allows EDG to keep track of collective agreements and resolve discrepancies in the agreement store. An example of this can be seen by comparing the example in Table 5 with the alternative scenario presented in Table 7. In the first case, the
Turn-taking rules. EDG identifies two roles for the participants, initiator and participant. However, the initiator role ends after the first turn, whereby everyone becomes a participant. The initiator provides sufficient context for the dialogue through the locutions in the first turn. The protocol enforces turns but no particular turn-order is enforced. Each agent has to move at least one locution in response to the proponent’s moved locutions. Since multiple locutions are allowed in each turn, each agent has to end its turn with the pass locution to mark that it has finished.
Politeness rules. Structurally, dialogue games can allow participants to respond only once to each move (single-reply) or offer several responses as well (multi-reply), to use only one locution in each move (single-move) or more than one (multi-move) and to transfer the turn as soon as some objective condition is met (immediate-reply) or later (non-immediate-reply) [52]. Based on these definitions, we consider EDG to be multi-reply, multi-move and non-immediate-reply. A brief discussion justifying each of these properties follows next.
Multi-reply. The protocol achieves this in three ways. The first two enable this property for the respondent, while the last one enables the speaker to proactively demand an additional response. First, the protocol allows the respondent to give multiple arguments in one turn by not imposing any restriction on the number of arguments included as the content of each locution. Second, it allows respondents to come back to earlier choice points in the dialogue, since it does not restrict them to addressing the preceding move. They can therefore move several arguments referring to different previous moves if desired. Third, it enables the speaker to direct the conversation back to issues that were not addressed to their satisfaction by using locution L4.1.
Multi-move. The protocol does not limit the number of locutions that can be moved in one turn by each participant (see Section 4.4). Hence, it is by construction multi-move.
Non-immediate-reply. The protocol does not enforce an external condition to shift the turn. Instead, it allows each agent to complete its move uninterrupted and to proactively transfer the turn. It is therefore non-immediate-reply.
All these properties make EDG very flexible and close to natural conversation. However, this flexibility can lead to dialogues that are incoherent or that compromise the explainability and cooperative aspects of the dialogue. Hence, it calls for introducing the same mannerisms that counter these complications in real-life conversation. EDG therefore introduces two such mannerisms into the dialogue as politeness rules, which ensure dialogue progression and conflict resolution. The first is related to
We take a protocol-oriented view of Agent Communication Language (ACL) semantics [50,52]. In this view, the semantics and use of utterances should be defined at the dialogue level rather than at the level of individual locutions [52]. Pitt and Mamdani [50] distinguish between the content and conversational states of the dialogue. The former is dependent on the information state of the agent. The information state of an agent reflects its knowledge base. Semantics at this level define the change in the agent’s information state. The latter is determined by the speech acts exchanged earlier in the dialogue, which are referred to as the conversation state. The conversation state can be described by the set of possible responses for each speech act. Consequently, the commitment rules described in Section 4.4 form the content level semantics for the protocol while the combination rules given in Table 4 define the conversational semantics for the dialogue. Since the protocol treats the commitment store as a subset of the agent’s knowledge base, the commitment rules express post-conditions about the agent’s information state as a result of the speech act. Next we describe the pre-conditions for making the move.
Pre-conditions for managing information state.
P1. For locutions subsets L1 and L3.1 to L3.3, there are no constraints on the content except that it should be relevant to the dialogue topic.
P2. For locution subsets L2, L3.4 and L4.1, the content of the locutions should already be part of the commitment stores of one of the agents.
P3. For L3.5, the content of the locution should belong to the commitment store of the speaker.
P4. For L4.2 and L4.3 no conditions apply as there is no content.
Pre-conditions for managing conversation state.
P5. For locution subsets L1, L2, L3 and L4.1, those imposed by Table 4.
P6. For L4.2, the agent finds no conflicts or objections in the information state of the dialogue as represented by its own commitment store.
P7. For L4.3, the agent making this move must have used at least one other valid locution before this one.
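Some of the pre-conditions listed above can be checked mechanically before a move is accepted. The Python sketch below covers P2, P3, P4 and P7 only and is meant as an illustration, not as the authors' implementation. The relevance test of P1, the Table 4 constraints of P5 and the conflict check of P6 are left out. The function name `preconditions_hold`, the use of plain strings for content, and the reading of P7 as referring to the speaker's current turn are assumptions.

```python
from typing import Dict, List, Set


def preconditions_hold(
    speaker: str,
    locution: str,
    content: str,
    stores: Dict[str, Set[str]],
    turn_so_far: List[str],
) -> bool:
    """Check pre-conditions P2, P3, P4 and P7 for a proposed locution.

    stores maps each agent to its individual commitment store;
    turn_so_far lists the locutions the speaker has already moved in
    the current turn (an assumption about the scope of P7).
    """
    if locution.startswith("L2.") or locution in {"L3.4", "L4.1"}:
        # P2: the content must already be in some agent's store.
        return any(content in store for store in stores.values())
    if locution == "L3.5":
        # P3: only the speaker's own commitments can be retracted.
        return content in stores.get(speaker, set())
    if locution == "L4.3":
        # P7: pass requires at least one earlier locution.
        return len(turn_so_far) > 0
    # P1 (relevance) is not checkable here; P4: L4.2 has no content.
    return True
```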
While pre-conditions can relate either to constraints on the agent’s information state [2,48] or to constraints on the conversational state of the dialogue [76], here we specify pre-conditions to manage the information and conversational states of the dialogue itself. The limits introduced on the content of locutions in Table 4 also form part of the pre-conditions for managing the information state of the dialogue. We require that the agents maintain the dialogue history and do not repeat a locution with the same content.
A platform for expert collaboration
A prototype of EDG was implemented as a web application in order to evaluate it with human experts through a user study. The web application allows the participants to ‘chat’ while enforcing the EDG protocol. The participants do not need to remember the protocol, since the web application enforces it for them. For each ‘message’ in the chat, it shows the possible locutions from Table 4 that can be used in response as a drop-down menu. The participant can select a locution to frame their response and type their text in the corresponding text field. This section provides details on the implementation and the user study design.
Implementation
EDG was implemented as a prototype full-stack web application that allows human participants to engage in discussion regarding the best diagnosis and treatment options for a patient. Figure 3 shows a screenshot of the graphical user interface of the web application with a hypothetical example. The application was implemented using JavaScript frameworks for client and server. The dialogue history is recorded in an SQLite database on the server. The server keeps track of the number of participants in a game and rotates the turns cyclically in the order in which the participants joined the game session. The application enforces politeness rules by using highlights. It alerts the user who was the target of a wh-request by highlighting the request in red and not allowing this user to play any other locution until all the wh-requests addressed to them have been discharged. Similarly, if a user plays the prompt locution, the target locution is highlighted in blue on all participants’ user interfaces to alert them to the request for a response. However, they are not forced to respond to this alert. The web application was used to evaluate the usability of EDG through a user study.

Screenshot of the web application implementing EDG.
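The server-side turn rotation and the blocking of users with open wh-requests, as described in the implementation paragraph above, can be sketched as follows. The prototype itself was written with JavaScript frameworks; the Python version below is only a language-neutral illustration, and the class name `Session` and all of its method names are hypothetical.

```python
from typing import Dict, List, Set


class Session:
    """One game session with cyclic turns in join order."""

    def __init__(self) -> None:
        self.order: List[str] = []            # participants in join order
        self.current = 0                      # index of the current speaker
        self.pending: Dict[str, Set[str]] = {}  # open wh-requests per user

    def join(self, user: str) -> None:
        if user not in self.order:
            self.order.append(user)
            self.pending[user] = set()

    def whose_turn(self) -> str:
        return self.order[self.current]

    def pass_turn(self) -> str:
        """Rotate to the next participant, wrapping around at the end."""
        self.current = (self.current + 1) % len(self.order)
        return self.whose_turn()

    def add_request(self, target: str, request_id: str) -> None:
        """Record a wh-request addressed to target (shown in red)."""
        self.pending[target].add(request_id)

    def discharge(self, target: str, request_id: str) -> None:
        """Mark a wh-request as answered."""
        self.pending[target].discard(request_id)

    def may_play_other_locutions(self, user: str) -> bool:
        """A user with open wh-requests must answer them first."""
        return not self.pending[user]
```

The modulo step gives the cyclic order, and keeping the open requests per user makes the blocking rule a simple emptiness check.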
We conducted a user study to evaluate the usability of the platform and its underlying protocol. According to the International Organization for Standardization (ISO 9241-11:2018), usability measures how effectively, efficiently and satisfactorily a system, product or service can be used by the specified users for achieving their specified goals [26]. We carried out formative usability testing with a total of six participants. Formative usability testing is done during the development process with a relatively small number of participants to identify potential issues [4]. While traditional usability testing requires 30 to 50 test subjects, a type of formative usability testing approach known as discount usability testing has been shown to uncover
Participants. The participants were final year medical students from a medical university in Barcelona, Spain. They were volunteers who responded to a call for participation after reading the advertised information sheet through their University’s human resource department.
Design.

Workflow diagram of the user study design.
For each session of the usability testing, participants were divided into groups of three. They were then tasked with collaboratively deciding on the best possible diagnosis and advice for an anonymous patient with a thyroid disorder. The patient data was taken from the publicly available thyroid dataset from the UCI ML repository (
Goals. The goal of the user study was to elicit perspectives of medical experts on aspects related to effectiveness, efficiency, engagement and ease of learning for the discussion platform and the underlying protocol. These aspects also tie in to the requirement specification from Section 3. However, they allow some additional usability considerations to be explicitly taken into account. Specifically, it aimed to answer the following questions for each of these aspects:
Does the platform add value to professional discussions of medical experts?
Did knowledge transfer take place as a result of the discussion?
Are participants satisfied with the explanations provided?
Are participants satisfied with the final decision and its justification?
Do participants have the moves they need at each step to express themselves?
Does the application impede the participants in some way during the dialogue?
Does the application and the protocol promote discussion?
Do participants find the classification of explanation requests useful or confusing?
The next section provides details on the result of the user study. Additionally it provides a detailed evaluation of EDG according to the requirement specification of Section 3.
In this section we discuss how the protocol and its implementation measure up against the different types of requirements identified in Section 3. We introduce three evaluation criteria, referred to as levels, for this: dialogue, system design and user study. Each of these is introduced next.
Dialogue. This level determines how well the dialogue rules presented in Section 4 contribute towards satisfying each requirement.
System design. This level evaluates each requirement against the protocol implementation since some requirements can only be met at the implementation level.
User study. This level validates the satisfaction of each requirement through the user study described in Section 5.2.
Table 8 presents a summary of the results. The first column shows the id of each requirement, and the second to fourth columns show whether the corresponding requirement is verified through the dialogue, system design or user study. Finally, the last column shows a representative quote from a participant in case the requirement satisfaction is verified through the user study. The checkmark symbol in a table cell shows that the corresponding requirement is satisfied for the level indicated in the column header. The triangle symbol indicates that the corresponding requirement is satisfied indirectly by virtue of implementing the dialogue protocol. We distinguish it from the checkmark to show that the system design level does not make an active contribution to fulfilling the requirement in these cases. So, the evaluation for these requirements is superficial at this level. A blank cell indicates that the corresponding requirement is not satisfied by the level indicated in that column header. The following paragraphs give a detailed discussion of each row of Table 8.
Summary of requirements evaluation
RA1 is the only requirement that is directly derived from the participant’s characteristics. Hence, it can be evaluated through the user study as well as with dialogue rules. By virtue of the fact that the dialogue rules already satisfy RA1, the system design also satisfies the requirement inherently. We use the symbol △ to indicate this.
All locution groups
Evaluation against cooperation requirements
This section describes how the system and the dialogue game measure up against each of the cooperation oriented requirements on the three levels.
At the dialogue level, RC1 is enabled through two mechanisms. The first is absence of any explicit locution to express disagreement such as ‘I disagree’. This allows the protocol to avoid deadlocks amongst participants. Secondly cooperation is enforced through implicit disagreements using locution subsets
RC2 is considered as a cooperation requirement since the dialogue protocol requires cooperation between the participants as a means for quality control. Hence, the protocol ensures RC2 because it requires at least one explicit agreement by another participant in order for the recommendation to be added to the collective commitment store. Even at that stage, they are open to non-monotonic debate. Hence at the end of the dialogue, only decisions that have survived the critical discussion are recommended. The embedding of explanatory illocutionary force, represented by locution
Since RC3 also requires a cooperative setting, it is listed as a cooperation requirement. The dialogue rules enforce RC3 by not forcing the participants to reply only to the preceding move, rather the rules allow participants the flexibility to reply to any number of previous locutions at any time. This allows any participant to either introduce new knowledge into the conversation by providing facts using locution rule subsets
RC4 is also dependent on cooperation of participants since only cooperation can ensure knowledge transfer. The dialogue game protocol enables this through locution classes
Evaluation against protocol oriented requirements
Next we evaluate the dialogue game against each of the six protocol oriented requirements in detail as summarised in Table 8.
RP1 and RP2 are topic-based and lay down specific requirements for the inclusion of these topics. Their satisfaction can be verified through locution rules
The protocol incorporates RP3 through locution rules
The dialogue game provides mechanism to resolve conflicts through the introduction of locution rules
By enforcing turn-taking, the protocol ensures equal opportunity for getting input from all participants, satisfying RP5. Since it is trivial to verify this requirement through the user study, the corresponding cell is left empty in Table 8. However, since the system design enforces a turn-taking mechanism, it satisfies RP5.
Finally, the protocol incorporates RP6 by giving the same validity to all moves by all participants. This requirement concerns the dialogue protocol definition so it does not make much sense to evaluate it through the user study or against system design. Hence, the corresponding cells are marked as empty in Table 8.
Evaluation against implementation oriented requirements
Next we discuss evaluation of each of the implementation oriented requirements according to the three evaluation criteria.
RI1 and RI2 require that the dialogue game be coordinated and that the dialogue be recorded. Both of these are satisfied at the system design level because the system acts as coordinator of the dialogue and performs administrative bookkeeping tasks such as regulating turns, recording the dialogue and informing participants of the moves available at each point in the dialogue. Although the protocol description requires that these two requirements be met, it does not specifically provide a mechanism to enforce this. RI2 in particular can only be enforced through system design. Hence, both these requirements are evaluated and verified at the system design level rather than through dialogue rules or the user study. So, the corresponding columns are marked as blank for the latter.
The dialogue rules do not provide any mechanism to protect patient privacy as specified by RI3. However, this requirement is partially satisfied through system design. This is because the platform limits the scope of information transfer to only the participants of the dialogue. Therefore, it protects patient privacy by design. Moreover, in the case of using anonymised data, it guarantees absolute privacy of the patient. However, the current prototype does not encrypt the information exchanged to secure it from malicious interference and leaves the anonymisation of the data exchanged to the participants’ discretion. Hence, it only partially fulfils RI3. Since it does not make much sense to evaluate it through the user study, the corresponding cell is marked as blank in Table 8.
User perspectives on expert collaboration system
The post-session interviews from the user study were recorded with the consent of participants and transcribed, and the interview data was thematically analysed to discover key insights. Thematic analysis is concerned with identifying patterns in qualitative data. The analysis can identify themes at the surface meaning level, referred to as the semantic level, or go beyond what was said to discover underlying concepts, known as the latent level [37]. Attention was paid to both the semantic and latent meaning implied in the feedback. Specifically, the comments from the participants were organised into related concepts following the bottom-up organisation method of affinity matching. In this technique for analysing data from a user study, relevant findings are grouped and the category labels are inferred from these groupings [5]. The advantage over a top-down approach with pre-defined labels is that it keeps an open mind as to what the data might reveal. We identified six themes from the participants’ feedback on various aspects of the dialogue game. Table 9 shows the representative quotes from participants against each identified theme. Next, we describe the insights from users’ perspectives for each theme.
Representative quotes of participants for each emergent theme
Value in real-life use. The most common theme was the practical value of such a discussion system in the medical community. All participants in group 1 agreed that the platform would facilitate professional communication between medical experts, especially multidisciplinary communication and communication between the public and private sectors. Participants agreed that the platform facilitates knowledge transfer and allows for a productive discussion. One participant was of the opinion that the platform supports clear communication by taking on the burden of politeness. In their view, doctors in general do not ask each other directly for clarifications and justifications because it could be considered rude. Since such requests are part of the platform’s functionality, they no longer seem like a personal affront but simply the way the platform works. This is a significant comment, since providing a means of getting around personal issues that hamper communication was one of the basic requirements that the platform and the underlying protocol aimed to fulfil.
Satisfaction with move options. One of the most important practical aspects of a dialogue game for expert discussion is whether it allows the participants to express themselves completely. This was evaluated by asking participants whether they found the dialogue locutions sufficient for their discussion. All participants in group 1 expressed overall satisfaction with the request-response options that are part of the underlying protocol. They thought that it presented the ideal scenario. Three out of six participants thought that the response "I agree" was not sufficient by itself because they would have liked to express partial agreement as well, for example "I agree but". One participant was of the view that the absence of an explicit "I do not agree" option was good, because doctors in general often reach a deadlock through explicit disagreement, so it was useful to ask for an explanation instead.
Utility of different explanation requests. In order to evaluate whether the classification of different explanation requests is useful in a practical scenario, participants were asked whether they found the categorisation useful. Three out of six participants in group 1 felt that all explanation requests were useful and that they would use them in their communication. The other half argued that a justification request sounded aggressive and unnatural and that they would never use it in a real-world scenario; they would prefer to use the explanation request instead. However, all participants agreed that they were not concerned about the subtle differences between explanation, clarification and justification requests as long as the intention of asking for an explanation was conveyed.
Effect of turn-taking. Since traditional collaboration between experts is real-time, it was an open research question for protocol design whether it should allow synchronous or asynchronous communication. In order to meet requirements RP5 and RP6, the protocol was designed to be synchronous, and participants were asked to give feedback based on their experience. Differing perspectives emerged on the turn-taking mechanism implemented in the platform and the underlying protocol. Two out of six participants in group 1 felt that the turns were good and helped to coordinate the discussion. One participant felt that it would be useful to be able to edit a response outside one's own turn, because one might remember something after the turn had passed. One participant was of the view that ideas slipped their mind while waiting for their turn, while another felt that turn-taking put a lot of pressure on participants to say something during their turn even when they had nothing to say. They suggested that a dynamic turn-taking mechanism would be better for preserving coordination of the discussion while dealing with the last two problems. They thought that an option to queue for the turn token would be best.
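To make the participants' suggestion concrete, the following is a minimal sketch of a queue-based turn token, written in Python. It is not part of the prototype or of the protocol described in this work: the class name `TurnTokenQueue`, its methods and the participant names are hypothetical and chosen only for illustration. The sketch assumes a single token that is granted in first-come, first-served order to participants who explicitly ask for it.

```python
from collections import deque
from typing import Deque, Optional


class TurnTokenQueue:
    """FIFO queue for a single turn token (hypothetical; not the prototype's code)."""

    def __init__(self) -> None:
        self._waiting: Deque[str] = deque()  # participants waiting for the token
        self.holder: Optional[str] = None    # participant currently holding it

    def request(self, participant: str) -> None:
        """Queue a participant for the token; grant it at once if it is free."""
        if participant == self.holder or participant in self._waiting:
            return  # ignore duplicate requests
        if self.holder is None:
            self.holder = participant
        else:
            self._waiting.append(participant)

    def release(self, participant: str) -> Optional[str]:
        """Give up the token and pass it to the next queued participant, if any."""
        if participant != self.holder:
            raise ValueError(f"{participant} does not hold the turn token")
        self.holder = self._waiting.popleft() if self._waiting else None
        return self.holder


if __name__ == "__main__":
    turns = TurnTokenQueue()
    turns.request("alpha")  # token is free, so alpha speaks first
    turns.request("gamma")  # gamma queues while alpha holds the token
    turns.request("beta")   # beta queues behind gamma
    print(turns.release("alpha"))  # gamma
    print(turns.release("gamma"))  # beta
    print(turns.release("beta"))   # None
```

Under this scheme a participant with nothing to add simply does not request the token, and a participant with a new idea can queue immediately rather than wait for a fixed rotation. Whether such a mechanism would still satisfy RP5 and RP6 is not examined here.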
User interface. Although the main aim of the user study was to evaluate the dialogue game protocol, it was anticipated that participants would also evaluate the system at the user interface level. This was confirmed in the user study, where the participants offered several valuable suggestions for user interface improvements. All participants in group 1 thought that the clear separation of the discussion into history, diagnosis, advice and concerns was very useful and efficient. Three out of six participants seemed happy with the user interface. Three out of six participants agreed that having the history on a side panel would be more efficient because it would reduce the scrolling time needed to check it. One participant also felt that displaying the history on separate lines would improve presentation. One participant felt that nesting the messages would make the presentation clearer and more efficient.
User study design. One of the themes that emerged was that the participants were surprised by how the task in the user study was designed. All participants in group 1 found the division of patient history between the participants unnatural compared to real-life professional scenarios, in which all the history is presented upfront. While the division was introduced to check whether transfer of information took place, the participants argued that knowledge and information transfer took place irrespective of this small test. For the second group, all the patient history was therefore provided upfront to participant 1.
This work envisioned a human-artificial agent hybrid collaborative recommendation system. As a first step towards this end, it presented a requirement specification for collaborative interactions between experts and an inquiry dialogue game grounded in the specification. The dialogue game allows multiple expert agents to collaborate on the best recommendations for a user. The game combines explanatory illocutionary forces in an inquiry dialogue. The motivation for doing this is to make the inquiry process, and consequently the output of the multiagent system, explainable by generating richer traces of the reasoning process itself. The game thus presents an approach to incorporating explainability within multiagent systems. This work also presented an evaluation of the dialogue game against the requirement specification through a user study. The user study was also significant in that it highlighted the real-life utility of such dialogue platforms in the medical domain. Such platforms can enable clearer and more systematic communication across multiple healthcare disciplines and sectors. This work proposes to import the methodology of software engineering into the area of formal dialogues. This methodology consists of the following steps: the collection of requirements for a dialogue game in a selected application domain; the design and implementation of a protocol according to these requirements; and the evaluation of a dialogue system against the requirements.
The next step would be to implement and evaluate EDG for a multiagent system. This would involve evaluating formal properties of the system such as freedom from deadlocks and livelocks and termination guarantees. One possible approach for investigating runtime termination guarantees for EDG is to explore multiagent frameworks that can provide this kind of guarantee through an appropriate moderator role. For example, the governor role in Electronic Institutions [19] could be extended to enforce the termination property at runtime. Another research direction could be to close the protocol against disruption by non-cooperative and malicious agents. It might also be interesting to investigate whether the protocol can do away with turn-taking and synchronous communication, since these may not scale well to a system with many participants.
While EDG specifies which locutions can be used in response to others, it does not investigate how an agent can select the best locution in response to another based on its knowledge base. This is a very important aspect of implementing the protocol in a multiagent system. Hence, an important future direction is to develop reasoning strategies for agents participating in the EDG. One possible approach is to explore argumentation-based reasoning for EDG along the lines of [10]. Subsequently, the next step would be to adapt and implement the protocol in a human-artificial agent hybrid system. Another interesting direction is to investigate how well the requirement specification presented in this work generalises to other domains such as engineering or aviation. Conversely, it would also be interesting to evaluate how well existing dialogue game protocols [10,51,58] conform to the requirement specification through a user study. Finally, the prototype of the platform developed as part of this work can be refined and released as open source software for facilitating communication between experts in the healthcare domain.
The politeness rules introduced here show that, in order to make dialogue games more human-centred, we need to introduce the same societal machinery that is employed in real-life conversation to safeguard the integrity of the dialogue in a computational context.
Acknowledgement
The work reported in this paper has been supported in part by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 860621, in part by the project 2021 SGR 00754 of the Catalan Government, in part by Chist-Era under grant 2022/04/Y/ST6/00001, in part by POB CyberDS of Warsaw University of Technology within the Excellence Initiative: Research University (IDUB) programme under grant 1820/1/Z01/POB3/2021, and in part by VW foundation (VolkswagenStiftung) under grant 98 542.
