Abstract
Persuasion is an important and yet complex aspect of human intelligence. When undertaken through dialogue, the deployment of good arguments, and therefore counterarguments, clearly has a significant effect on the ability to be successful in persuasion. A key dimension for determining whether an argument is good is the impact that it has on the concerns of the intended audience of the argument.
Introduction
Persuasion is an activity that involves one party trying to induce another party to believe or disbelieve something, or to do (or not do) something. It is an important and complex human ability. Obviously, it is essential in commerce and politics. But it is equally important in many aspects of daily life. Consider, for example, a child asking a parent for a raise in pocket money, a doctor trying to get a patient to enter a smoking cessation programme, a charity volunteer trying to raise funds for a poverty-stricken area, or a government advisor trying to get people to avoid revealing personal details online that might be exploited by fraudsters.
Arguments are a crucial part of persuasion. They may be explicit, such as in a political debate, or they may be implicit, such as in an advert. In a dialogue involving persuasion, counterarguments also need to be taken into account. Participants may take turns in the dialogue with each of them presenting arguments, some of which may be counterarguments to previously presented arguments. So the aim of the persuader is to persuade the persuadee through this exchange of arguments. Since some arguments may be more effective than others in such a dialogue, it is valuable for the persuader to have an understanding of the persuadee and of what might work better with her.
In this paper, we consider how arguments that relate better to the priorities and concerns of a persuadee can be more effective in a persuasion dialogue. Consider for example a doctor in a university health clinic who is trying to persuade a university student to take up regular exercise, and suppose the student says that she does not want to take up a sport because she finds sports boring. The doctor then needs to find a counterargument to the student’s argument. Suppose the doctor has two options:
Option 1: Doing sport will not only help your physical health, but it will help you study better.
Option 2: Doing sport will get you in shape, and also help you make new friends.
The argument for Option 1 concerns physical health and getting a good degree, whereas the argument for Option 2 concerns physical health and social life. Now suppose the doctor has learnt through the conversation that the student does not prioritize physical health at all, ranks social life somewhat highly, and ranks getting a good degree very highly. In this case, the doctor will regard the argument in Option 1 as being a better counterargument to present to the student, since it appears to have a better chance of convincing the student.
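To make the intuition concrete, the doctor's choice can be sketched as scoring each candidate counterargument by the persuadee's ranking of the concerns it has an impact on. This is a minimal illustration only; the concern names, the numeric ranks, and the max-based scoring rule are our assumptions, not part of the formal framework.

```python
# Persuadee's priorities over concerns (higher = more important to her).
priorities = {"physical health": 0, "good degree": 3, "social life": 2}

# Each option is labelled with the concerns it has an impact on.
options = {
    "Option 1": ["physical health", "good degree"],
    "Option 2": ["physical health", "social life"],
}

def score(concerns):
    # Score an argument by the highest-ranked concern it touches.
    return max(priorities[c] for c in concerns)

best = max(options, key=lambda o: score(options[o]))
print(best)  # → Option 1
```

Under these illustrative ranks, Option 1 wins because it touches the persuadee's top-ranked concern (getting a good degree).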
Whilst this is a simple and intuitive idea, there is a lack of a general framework for using concerns in making strategic choices of move in the way suggested by the above example. The notion of a concern seems to be similar to the notion of a value. Often a value is considered as a moral or ethical principle that is promoted by an argument, though the notion of a value promoted by an argument can be more diverse and used to capture general goals of an agent [3]. Values have been used in a version of abstract argumentation called value-based argumentation frameworks (VAFs) [9–11], and furthermore, they have been considered in models of dialogical argumentation for persuasion [8]. In VAFs, values are used for ignoring attacks by counterarguments where the value of the attacking argument is lower ranked than the attacked one. Thus, the role of values in VAFs is different to the role of concerns as suggested in the example above where the doctor chooses the argument of greater concern to the patient.
In this paper, we investigate how taking into account the concerns of a participant can be used in making strategic choices of move, and show how this can be incorporated in argument-based software for persuasion. We focus on the use of persuasion to change the belief in some goal argument (i.e., an argument that we want the persuadee to believe). A goal argument might reflect an “intention” to do something (as illustrated in the examples above) but this is not mandatory. For example, it could be that we want the persuadee to believe a particular explanation for a past event.
Background on automated persuasion systems
We assume that an automated persuasion system (APS) is a software application running on a desktop or mobile device. It aims to use convincing arguments in a dialogue in order to persuade the persuadee (the

Interface for an asymmetric dialogue move for asking the user’s counterarguments. In this example, the user is asked about two arguments with four possible counterarguments per argument.
The dialogue may be asymmetric since the kinds of moves that the APS can present may be different to the moves that the persuadee may make. For instance, the persuadee might be restricted to only making arguments by selecting them from a menu in order to obviate the need for natural language processing of arguments being entered (as illustrated in Fig. 1). In the extreme, it may be that only the APS can make moves.
Whether an argument is convincing or not depends on the context of the dialogue and on the characteristics of the persuadee. An APS may maintain a model of the persuadee. This may be used to predict what arguments and counterarguments the persuadee knows about and/or believes. This can be harnessed by the strategy of the APS in order to choose good moves to make in the dialogue.
In our previous work on developing APSs, we have primarily focussed on beliefs in arguments as being the key aspect of a user model for making good choices of move in a dialogue. To represent and reason with beliefs in arguments, we can use the epistemic approach to probabilistic argumentation [7,41,48,49,51,63,72] which has been supported by experiments with participants [62]. In applying this approach to modelling a persuadee’s beliefs in arguments, we have developed methods for: (1) updating beliefs during a dialogue [43,45,50]; (2) efficiently representing and reasoning with the probabilistic user model [34]; (3) modelling uncertainty in the modelling of persuadee beliefs [36,46]; (4) harnessing decision rules for optimizing the choice of argument based on the user model [35,37]; (5) crowdsourcing the acquisition of user models [47]; and (6) modelling a domain in a way that supports the use of the epistemic approach [18]. So these developments offer a well-understood theoretical and computationally viable framework for taking belief into account in APSs for applications such as behaviour change.
In our previous work, there is no consideration of how the concerns of the persuadee could be taken into account in an APS. Therefore, our ultimate goal in this paper is to understand the notion of concerns in order to harness them for making better choices of move in asymmetric dialogues in APSs. To this end, we have undertaken empirical studies with participants showing that: there is a common understanding amongst participants as to what is meant by concerns associated with arguments; useful preferences over types of concern can be gathered from participants; participants tend to select counterarguments according to their preferences over types of concern associated with the counterarguments; preferences over types of concern can be deployed by an automated persuasion system in order to improve the persuasiveness of a dialogue.
In addition, the methods that we employed in the empirical studies can be used as a pipeline of methods for acquiring and deploying types of concern, and preferences between them, in automated persuasion systems.
Note that in this paper we focus exclusively on the role of concerns in persuasion, and so we do not consider the role of beliefs here. We leave the combination of beliefs and concerns to future work.
We proceed as follows. In Section 2, we consider how we can capture concerns in argumentation dialogues, then in Section 3, we present our empirical studies for investigating how concerns can be acquired and used in argumentation for persuasion. We situate our work with respect to related literature in Section 4, and finally discuss our work in Section 5.
Argumentation with concerns
In order to investigate the use of concerns in persuasion, we will draw on abstract argumentation for representing arguments and counterarguments arising in persuasion dialogues, and then we will consider how concerns arise in argumentation.
Abstract argumentation
In abstract argumentation, as proposed by Dung [22], each argument is treated as an atom, and so no internal structure of the argument needs to be identified. This can be represented by an argument graph as follows.
An
So an argument graph is a directed graph. Each element
Consider arguments
Given an argument graph, a natural question to ask is which arguments are acceptable (
In our work, we will use the definition of an argument graph. However, as we explain in the next subsection (
We assume that a
In this paper, we will just consider two types of move. For both types, we assume that the arguments appearing in the move come from an argument graph that is known by the system.
A
A
We introduce the menu move as a way for the user to give her input into the discussion. We assume that we are unable to accept input from the user that would involve free text presentation of arguments since natural language processing technology is not yet able to adequately understand such input. Using menu moves leads to an
A feature of the posit and menu moves is that they involve a set of arguments at each step. To illustrate a real-world situation where a discussion can proceed by participants exchanging sets of arguments, consider an email discussion between two colleagues. Here, one participant might start with a persuasion goal, and the second participant might provide some counterarguments to that persuasion goal. The first participant might then reply with a counterargument to each of the counterarguments, and so on.

Example of an argument graph.
A
Note, by indirect attack, we mean that an argument is on a path to the persuasion goal with an odd number of arcs where that number is greater than 1, such as the path from
To formalize our protocol, we require the following subsidiary functions. The first function gives the set of options which are the counterarguments of an argument that have not been used at a previous step in the dialogue. This function is used to ensure that the dialogue only allows for an argument to be presented at most once in the dialogue. The second function gives the arguments that appear in a menu.
The set of
The
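The two subsidiary functions can be sketched in Python. This is our own illustration under assumed names and a toy graph: the argument graph is a dict mapping each argument to the set of arguments that attack it, a dialogue is a list of moves (sets of arguments), and a menu maps each argument to its set of options.

```python
# Toy argument graph: each argument maps to its attackers (counterarguments).
attackers = {
    "goal": {"a1", "a2"},
    "a1": {"b1"},
    "a2": {"b2"},
    "b1": set(),
    "b2": set(),
}

def options(arg, dialogue):
    """Counterarguments to arg that have not been used at a previous step."""
    used = {a for step in dialogue for a in step}
    return attackers.get(arg, set()) - used

def menu_args(menu):
    """All arguments appearing in a menu (a set of options per argument)."""
    return {a for opts in menu.values() for a in opts}

dialogue = [{"goal"}]                       # the system has posited the goal
menu = {"goal": options("goal", dialogue)}  # menu offered to the user
print(sorted(menu_args(menu)))  # → ['a1', 'a2']
```

The subtraction against the set of used arguments is what ensures each argument is presented at most once in the dialogue.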
Before we give the definition of the asymmetric posit dialogue protocol, we explain the conditions in the definition as follows.
For each step
For the first step, the first move is by the system and it is a singleton set containing a persuasion goal (
The participants take turns, so if
For each step
Each argument
For each argument
For each step
A dialogue ends when one of the following conditions holds:
The last move of the dialogue is
The last move of the dialogue is a user move, and the user selects the null argument (
We capture the above informal conditions in the following definition for the asymmetric posit dialogue protocol.
An asymmetric posit dialogue is a sequence of moves satisfying the conditions set out informally above: the first move is a system posit containing the persuasion goal; the participants take turns; each argument presented at a step attacks an argument presented at an earlier step; each menu offers the counterarguments that have not yet been used in the dialogue; and the dialogue terminates under the conditions listed above.
There are three main advantages of this protocol: (1) it supports a form of asymmetric dialogue where the user can only select arguments that are presented to her via the menu; (2) it allows for multiple arguments to be attacked at each step of the dialogue; and (3) it does not force all arguments to be addressed by counterarguments. For the latter point, the system can fail to counter some, though not all, of the user’s arguments at a step without the dialogue terminating at that step. This is because we are not assuming any particular dialectical semantics (such as proposed by Dung [22]). Rather, we are just concerned with what are allowed exchanges between the system and user.
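The turn-taking shape of the protocol can be simulated in a few lines of Python. This is a minimal sketch of the exchange structure over a toy graph, not the formal definition: the graph, the user's selection policy, and the rule that the system replies with all unused counterarguments are our illustrative assumptions.

```python
# Toy argument graph: each argument maps to its attackers.
attackers = {
    "goal": {"u1", "u2"}, "u1": {"s1"}, "u2": {"s2"},
    "s1": set(), "s2": set(),
}

def run_dialogue(goal, user_picks):
    """Alternate system posits and user menu selections until no move remains."""
    used, transcript = set(), []
    current = {goal}
    while current:
        transcript.append(("system", current))
        used |= current
        # Menu: unused counterarguments to the system's arguments.
        menu = {a for s in current for a in attackers[s]} - used
        picked = menu & user_picks           # the user selects from the menu
        if not picked:                       # no selection ends the dialogue
            break
        transcript.append(("user", picked))
        used |= picked
        # System replies with unused counterarguments to the user's picks.
        current = {a for u in picked for a in attackers[u]} - used
    return transcript

t = run_dialogue("goal", {"u1"})
print([m for m, _ in t])  # → ['system', 'user', 'system']
```

Because the `used` set grows monotonically over a finite graph, every dialogue under this sketch terminates.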

An argument graph for Example 2.
For the argument graph in Fig. 3, a dialogue according to the asymmetric posit protocol alternates system and user moves. Note that the system did not select a counterargument to every argument presented by the user.
For the argument graph in Appendix B, a dialogue according to the asymmetric posit protocol proceeds in the same way, alternating system and user moves.
So the above protocol captures the behaviour where the agents exchange arguments from the argument graph until one of the following conditions holds: (1) the system has no further arguments to present to defend the persuasion goal; (2) the user has no further arguments to present to directly or indirectly attack the persuader’s persuasion goal; or (3) the user has a list of counterarguments to choose from, but the user chooses the option that none of the counterarguments are appropriate (
At the end of the dialogue, the system (
This protocol is different to the dialogue protocols for abstract argumentation that are used for determining whether specific arguments are in the grounded extension [64] or preferred extension [15]. It is also different to the dialogue protocols for arguments that are generated from logical knowledgebases (
Note, the discussion of concerns that we consider in the next section does not rely on any specific protocol. We have used the asymmetric posit dialogue protocol in the empirical studies later in this paper (as we explain in Section 3.2). However, we believe that the findings we make from the empirical study are generalizable to a wide variety of dialogue protocols.
A concern is something that is important to an agent. It may be something that she wants to maintain (for example, a student may wish to remain healthy during the exam period), or it may be something that she wants to bring about (for example, a student may wish to do well in the exams). Often arguments can be seen as either raising a concern or addressing a concern as we illustrate in the following example.
Consider the domain “Cycling in the city”. Depending on the participants, the first argument could be addressing the concern of health whereas the second could be raising the concern of health.
When considering a set of agents, there may be a number of similar concerns, and it may be appropriate to group these into types of concern. For example, we could choose to define the type “Fitness” to cover a variety of arguments from those that flag problems associated with lack of exercise through to those that suggest solutions for people training for marathons. The actual types of concern we might consider, and the scope and granularity of them, depend on the application. However, we assume that they are atomic, and that ideally, the set is sufficient to be able to type all the possible arguments that we might want to consider using in the dialogue.
The types of concern may reflect possible motivations, agendas, or plans that the persuadee has in the domain (i.e., subject area) of the persuasion dialogue. They may also reflect worries or issues she might have in the domain. When an argument is labelled with a type of concern, this denotes that the argument has an impact on that concern, irrespective of whether that impact is positive or negative.
Returning to Example 4, consider the concern type “Health” and the domain “Cycling in the city”. Both
In this paper, we do not differentiate between positive and negative impacts. We just assume that when we label an argument with a type of concern, then the argument has an impact on that concern. Nevertheless, being aware that the impact may be positive or negative may help in identifying them in practice. We leave formalization of impacts on concerns for future work.
However, we do require information about preferences over types of concern. Given a set of types of concern for a given domain, an agent may have a preference ordering over those concerns that reflects the relative importance of the concerns to the agent. For instance, in the “Cycling in the city” domain, we may have the types “Health”, “Safety”, and “Personal Economy”, and we may have an agent that regards “Health” as the most important concern for her, “Safety” as the next most important concern for her, and “Personal Economy” as the least important concern for her. Another agent may have a different ordering over these types of concern.
There are some potentially important choices for how we can represent the preferences. For instance, we can consider preferences as pairwise choices or we can assume a partial or even linear ordering. We are agnostic as to how preferences are represented, and assume an appropriate choice can be made for an application. There are also numerous techniques for acquiring preferences from participants (see
We will use the preferences over types of concern to make good choices of move. Since agents may differ in their preferences, we need to find out their preferences during a dialogue. We can do this by querying the user. In practice, we do not want to ask the user too many questions as it is likely to increase the risk of the user terminating the dialogue prematurely. Furthermore, it is normally not necessary to know about all the preferences of the user. To address this, we can acquire comprehensive data on the preferences of a set of participants, and then use this data to train a classifier to predict preferences for other participants based on a subset of questions. We investigate this further in the empirical study (Section 3).
In a dialogue, the system needs to choose which arguments to present. For instance, in the asymmetric posit protocol, the system chooses arguments from amongst those that are counterarguments to the arguments selected by the user. There are various criteria that can be used for selecting arguments. Here, we consider how the system can select arguments based on the concerns that the user has.
For this, we assume that we have a set of arguments, and that each argument is labelled with the type(s) of concern it has an impact on. Furthermore, the system has a model of the user in the form of a preference relation over the types of concern. We do not assume any structure for the preference relation. In particular, we do not assume it is transitive.

Compare types
For each user argument
The algorithm considers each pair of arguments
We can use the algorithm above to select moves in a dialogue, such as in part 4 of the asymmetric posit dialogue (Definition 4). We will harness this in the empirical study of persuasion dialogues in the next section.
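Concern-based selection can be illustrated as follows. This is our own sketch, not the exact algorithm in the figure: the concern labels, the pairwise preference relation (which need not be transitive), and the rule of picking the candidate whose type wins the most pairwise comparisons are assumptions made for illustration.

```python
# Each counterargument is labelled with the type of concern it impacts.
concern_of = {"c1": "Health", "c2": "Time", "c3": "Comfort"}

# prefs[(x, y)] == True means type x is preferred to type y.
# The relation is stored pairwise; no transitivity is assumed.
prefs = {("Health", "Time"): True, ("Time", "Health"): False,
         ("Health", "Comfort"): True, ("Comfort", "Health"): False,
         ("Time", "Comfort"): True, ("Comfort", "Time"): False}

def select(candidates):
    """Pick the candidate whose concern type beats the most other types."""
    def wins(a):
        return sum(1 for b in candidates if b != a
                   and prefs.get((concern_of[a], concern_of[b]), False))
    return max(candidates, key=wins)

print(select(["c1", "c2", "c3"]))  # → c1
```

Counting pairwise wins rather than sorting by a total order is one way to cope with an intransitive or cyclic preference relation.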
In this section, we will describe the aims, methods, results, and conclusions, for a sequence of empirical studies. All these studies have been approved by the UCL Research Ethics Committee.
Aims
The purpose of this sequence of studies is to investigate the nature of concerns of a participant in a dialogue. In particular, we want to consider the following questions.
Question 1: Do participants tend to agree on the types of concern that should be associated with an argument?
Question 2: Can participants give meaningful preferences over the types of concern? In other words, do participants express non-random preferences over concerns?
Question 3: Are participants playing by their preferences over concerns? In other words, when a participant makes a choice of argument to present from a set of possible candidates, does she choose the argument with the highest ranking concern?
Question 4: Can we use a participant’s preferences over concerns to choose more preferred arguments so that the dialogue is more persuasive?
In order to use preferences to choose good moves in a persuasion dialogue, we would expect to have positive answers to Questions 1 to 3. So Question 4 depends on the answers to the questions before it.
Methods
Building upon the discussion of types of concern given in the previous section, we now present the methods we require for our empirical study. These are split into the following steps, which we explain in more detail in the specified subsections.
Creating the argument graph (Section 3.2.1).
Crowdsourcing the associations between arguments and types of concern (Section 3.2.2).
Gathering the pairwise preferences the participants have on the types of concern (Section 3.2.3).
Checking whether participants are playing by their explicit preferences over concerns (Section 3.2.4).
Creating a decision tree for each pair of types that can allow us to predict a participant’s preferences over types of concern (Section 3.2.5).
Building an automated persuasion system having a discussion with a human and reasoning about the preferences of this participant (Section 3.2.6). The decision trees developed in step 5 will be harnessed in the automated persuasion system to allow it to present fewer questions to the participants about their preferences.
We use these steps as the methods for this empirical study, but they can also be harnessed as a general framework for acquiring and deploying concerns in persuasion systems. They offer a pipeline of methods for the development of an automated persuasion system, from the creation of the argument graph through to using the preferences over the types in an automated dialogue. This pipeline includes the gathering of the type associations with the arguments and the learning of the preferences. So although we present our framework via its application to the “Cycling in the city” domain, it is general and can be applied to other domains.
Note that for the studies with participants, we recruited separate groups of participants for each of step 2 (i.e., the study described in Section 3.2.2), steps 3/4 (i.e., the study described in Sections 3.2.3 and 3.2.4), and step 6 (i.e., the study described in Section 3.2.6). The reason we recruited three disjoint groups of participants is that we wanted to remove any possibility of bias that might arise from participating in multiple studies.
Creation of the argument graph
This first step consists of listing the arguments associated with the issue at hand. We start by specifying a goal argument in the dialogue. This argument is what the persuader wants the persuadee to believe by the end of the dialogue. In the empirical study described in Section 3.2.6, the automated persuasion system starts with this argument in every dialogue.
In this paper, we undertook a web search on the pros and cons of city cycling. We consulted diverse sources including online forums and documentation on the topic (for instance, governmental websites promoting cycling). From this search, we manually identified a number of arguments, and attacks between them, concerning aspects of cycling in the city. Recall that we only deal with the attack relation in this work, and so we did not consider other kinds of relation such as support. Also, we did not attempt to distinguish between different kinds of attack (such as undercutting or undermining). Some arguments were edited so that the argument graph had reasonable depth (so that the dialogues were of a reasonable length) and breadth (so that alternative dialogues were possible).
We wrote a short statement specifying each argument. This resulted in 51 arguments about aspects of cycling in the city. The arguments are enthymemes (i.e., some premises/claims are implicit) as this offers more natural exchanges in the dialogues. The full list of the arguments can be found in Appendix A, and the argument graph can be found in Appendix B.
We flagged two of the 51 arguments as being potential persuasion goals: “You should cycle to work.” and “The government should invest in supporting initiatives to increase cycling.”. We used the former as the principal goal and the latter as a fall-back goal if the persuader gets into a situation where the principal goal cannot be defended, and so it can open new possible branches of dialogues.
As an alternative to manual creation of the argument graph, this step can be crowdsourced by asking for arguments and counterarguments from participants. This can be done incrementally as follows: (1) create an argument graph composed of the goal argument and several obvious counterarguments; (2) ask the participants for counterarguments to the arguments in the graph; (3) create a new graph composed of the previous arguments and the new counterarguments; (4) repeat steps 2 and 3 until no additional arguments are added to the graph. Potentially, this alternative will provide better coverage of the arguments and counterarguments for a domain, but it may involve more time and effort to run the interactive and incremental process, and it may also involve filtering out duplicate arguments and rejecting poor quality arguments.
Association of types of concern to arguments
Once all the arguments have been defined, they need to be typed. The types of concern that can be associated with the arguments are topic dependent. In this work, we manually defined a set of 8 allowed types of concern and crowdsourced their associations with the arguments. These types are: “Time”, “Fitness”, “Health”, “Environment”, “Personal Economy”, “City Economy”, “Safety” and “Comfort”. The difficulty of this step lies in presenting the participants with a set of types of concern that is broad enough to cover all possibilities, but not so specific that the votes are needlessly spread and too many types become associated with a given argument.
The creation of the set of types of concern can alternatively be left to the participants as a pre-step. However, it bears the risk of having too many types (or even unique ones,
For this step, we used 20 participants from the Prolific crowdsourcing website.
For each argument described in Appendix A, we asked the participants to choose the type of concern they think is the most appropriate from the list presented at the beginning of this section. We also used two attention checks asking them explicitly to select “Environment” and “Personal Economy”. This was to ensure that participants were concentrating on the instructions for each question.
After the set of types of concern had been created, the next step was to determine the preferences that the users of our system could have over these types. Preference elicitation and preference aggregation are research areas in their own right, and it is beyond the scope of this paper to investigate them fully in our context. For this reason, we decided to use pairwise preferences because of the simplicity of their elicitation. The drawback is that participants can have cycles in their preferences, unlike when a complete linear ordering is forced by the elicitation process. However, in our case, this was not problematic as we later used the pairs of preferences themselves instead of a preorder created from the pairwise elicitation procedure.
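To see why pairwise elicitation can yield cycles (e.g. A preferred to B, B to C, and C to A), and how such cycles can be detected, here is a small sketch. The function and the example data are our own illustration: strict preferences are treated as edges of a directed graph and a depth-first search looks for a back edge.

```python
def has_cycle(pairs):
    """pairs: iterable of (preferred, less_preferred) tuples.
    Returns True iff the strict-preference graph contains a cycle."""
    succ = {}
    for a, b in pairs:
        succ.setdefault(a, set()).add(b)
    visited, on_stack = set(), set()
    def dfs(n):
        visited.add(n); on_stack.add(n)
        for m in succ.get(n, ()):
            # A successor already on the DFS stack closes a cycle.
            if m in on_stack or (m not in visited and dfs(m)):
                return True
        on_stack.discard(n)
        return False
    return any(n not in visited and dfs(n) for n in list(succ))

print(has_cycle([("Health", "Time"), ("Time", "Comfort")]))   # → False
print(has_cycle([("Health", "Time"), ("Time", "Comfort"),
                 ("Comfort", "Health")]))                     # → True
```

Working directly with the elicited pairs, as the study does, sidesteps the need to repair such cycles into a consistent ordering.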
For each set of counterarguments for every argument, we asked for the preferences in all the possible pairs of types of concern associated with the counterarguments in the set. So for each pair, we asked the participants to state if they preferred the first type, the second type, if they were equal or if they were incomparable.
For this step, we used 50 participants from the Prolific crowdsourcing website. At the recruitment stage, the participants were informed that the study was about cycling in the city. The prescreening was as follows: first language was English; current country of residence was United Kingdom; age was between 18 and 100. We used two attention checks explicitly asking the participants to choose the left (resp. right) type in two pairs.
Are participants playing by their explicit preferences?
In order to answer Question 3 (
To investigate whether this principle holds, we augmented the previous step of the study (given in Section 3.2.3), by asking the participants to choose the counterargument they “think is the most persuasive” for each of a series of 6 arguments. Using the short names for arguments given in Appendix A, the arguments were: (1) the main goal; (2) lanes; (3) savestime; (4) clothes; (5) expensivelanes; and (6) moretime. For each of these 6 arguments, all counterarguments in the argument graph were given, and the participant had to select the counterargument that they think is most persuasive.
Creation of the decision trees
A potential problem with using preferences in an APS is the need to ask the user about their preferences. This may be an issue if the user is asked too many questions, and as a result, disengages. A potential solution is to train a classification system for predicting the preferences that a given user may have. For this, we created decision trees using information that we had obtained about participants.
In addition to asking for the preferences of the participants over the types of concern (as explained in the previous subsection), we asked them to take a personality test, and to provide some demographic information (such as age, sex, etc.) and domain-dependent information such as situational information (living in a city, in the countryside, etc.). Note, after collecting this data, we decided not to use the domain-dependent information as we were able to get good results without it. We used the Ten-Item Personality Inventory (TIPI) [27] to assess values on 5 features of personality based on the OCEAN model [58]: “Openness to experience”, “Conscientiousness”, “Extroversion”, “Agreeableness” and “Neuroticism” (emotional instability).
Using all the data, we learnt a decision tree for each pair of types asked about during the previous step using the Scikit-learn library.
As a first stage, we ran a meta-learning process in order to determine the best combination of tree depth and minimum number of samples at each leaf for each pair of types. The meta-learning process is the repeated application of the learning algorithm for different choices of these parameters.
Let

Difference
We used cross-validation in the meta-learning to determine the best combination of tree depth and minimum number of datapoints at each leaf. Once the best parameters were found for each pair of types, we then ran the actual learning part using these parameters with all the datapoints concerning the personality and demographic information. We thus obtained one decision tree for each pair of types that was used by the automated persuasion system in the final study.
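The meta-learning step described above can be sketched with Scikit-learn's grid search: for one pair of types, a cross-validated search over tree depth and minimum samples per leaf, followed by a final fit on all the data. The features and labels below are synthetic stand-ins for the personality and demographic data, and the parameter ranges are our assumptions.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 7))          # e.g. 5 TIPI traits + age + sex
y = (X[:, 0] > 0).astype(int)         # 1 iff type A preferred to type B

# Cross-validated search over the two tree parameters.
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [1, 2, 3, 4],
                "min_samples_leaf": [1, 3, 5]},
    cv=5)
grid.fit(X, y)

# Refit on all datapoints with the best parameter combination.
best = grid.best_params_
tree = DecisionTreeClassifier(random_state=0, **best).fit(X, y)
print(sorted(best))  # → ['max_depth', 'min_samples_leaf']
```

One such tree per pair of types then lets the system predict a user's preference from a short questionnaire instead of asking about every pair.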
In order to investigate Question 4 (
Note that for the belief, as described in Appendix F, we asked for a value between
The decision trees are the last part needed by the preference-based system to be able to fully function. Once up and running, the system needs to respond to the arguments from the participant in order to maximize the chances of persuading her by the end of the dialogue.
In this study, we used 99 participants recruited from the Prolific crowdsourcing website: 50 for our method and 49 for the baseline (one participation for the baseline was not saved by the system). At the recruitment stage, the participants were informed that the study was about cycling in the city. We presented each participant with a chatbot composed of a front-end we coded in Javascript and a back-end in Python using the Flask web server library.
The code is available at
The prescreening was as follows: first language was English; current country of residence was United Kingdom; age was between 18 and 100; commute/travel to work was by any means except walking and cycling; no long-term health condition/disability; employment status was full-time or part-time; no full-time remote working; working hours were regular (
In this section, we present the results concerning the experiments specified in the methods section (i.e. Section 3.2). We analyse the results of each step and give insights into the data obtained.4
The data is at
We start by considering results concerning Question 1 (Do participants tend to agree on the types of concern that should be associated with an argument?) from Section 3.1. Of the 20 participants for this step, we removed 2 participants from the pool as they failed one or both attention checks, thus obtaining 18 answers.
Table 4 in Appendix C shows the results regarding the associations. Five of the arguments have a single concern associated with them, and of the remaining arguments, most had the majority of people selecting the same concern.
The types in bold in Table 4 are the types kept for the following steps of the experiments. To select them, we decided to keep all the types having strictly more than half of the votes of the most voted type. For instance, if the most voted type gathered 8 votes over 18, all types with strictly more than 4 votes were kept in the association.
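The selection rule can be sketched as follows; the `kept_types` helper is a hypothetical illustration, not part of the study's code.

```python
# Sketch of the selection rule described above: keep every type whose
# vote count strictly exceeds half of the top type's count.
def kept_types(votes):
    """votes: dict mapping type name -> number of votes (out of 18)."""
    top = max(votes.values())
    return {t for t, v in votes.items() if v > top / 2}

# Example: the top type has 8 of 18 votes, so a type needs
# strictly more than 4 votes to be kept.
print(sorted(kept_types({"Time": 8, "Comfort": 5, "Health": 4, "Env.": 1})))
# -> ['Comfort', 'Time']
```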
Pairwise preferences on types of concern
Next, we consider results concerning Question 2 (
First, we considered whether the choices made by the participants were different from random choices. Recall that for each pair of types of concern
Second, we considered whether there was a structural pattern to the preferences expressed by each participant. Out of the 49 participants, 11 had a linear partial order (
Partition of dominating types (% is of the proportion of the 22 participants)
Since the participants are not making random choices, and they have given their pairwise preferences over all the types of concern, we have a positive answer to Question 2.
We now turn to Question 3 (
The six arguments and their counterarguments were a subset of those considered in Section 3.2.2 and so for the analysis, the types of concerns were known for each of them.
We use
Let
We then average on all the participants. A value of 1 means that 100% of the participants played by their preferences and that all the types of the chosen argument are preferred to all the types of non-chosen arguments.
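One plausible reading of this averaging can be sketched as follows. This is our own reconstruction under stated assumptions, not the paper's exact formula: the `prefers` relation and the example types are invented.

```python
# Illustrative sketch of an agreement ratio: the fraction of
# (chosen type, non-chosen type) pairs in which the participant's
# explicit preferences rank the chosen argument's types above the
# non-chosen arguments' types.
def agreement_ratio(chosen_types, other_types, prefers):
    """prefers(a, b) -> True if type a is explicitly preferred to type b."""
    pairs = [(a, b) for a in chosen_types for b in other_types]
    return sum(prefers(a, b) for a, b in pairs) / len(pairs)

# Hypothetical participant: prefers Health to everything else.
order = {("Health", "Time"): True, ("Health", "City Economy"): True,
         ("Time", "City Economy"): True}
prefers = lambda a, b: order.get((a, b), False)
print(agreement_ratio(["Health"], ["Time", "City Economy"], prefers))  # -> 1.0
```

Averaging this quantity over all participants gives the values reported in Table 2, under this reading.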
Agreement ratios between explicit and implicit preferences for specific arguments. Using the short names for arguments given in Appendix A, the arguments are: (1) the main goal; (2) lanes; (3) savestime; (4) clothes; (5) expensivelanes; and (6) moretime
The results are presented in Table 2. As some chosen counterarguments have “City Economy” as one of their types, and given that this type is never preferred, a ratio of 1 cannot be reached. Hence, a value of 0.71 is a very high agreement ratio when using our calculation method.
Note that the result for the second argument (“The city can build cycle-only lanes.”) is marginally below the others because one counterargument that the participants could choose (“They create traffic jams.”) was seen in the argument/types association step as having 3 completely different types: “Environment”, “Time” and “City Economy”. However, we assume that no participant could see all three of these types of concern. Therefore, the type(s) seen by the participant concerning this argument might be preferred for her, even though the other(s) is/are not, thus artificially decreasing the ratio. The only other argument with 3 types is “Even if it does take more time, it is a more relaxing way to travel if you avoid busy roads.” but with “Time”, “Comfort” and “Health” where the last two can be related (and sometimes exchanged).
Since the agreement ratios given in Table 2 are quite high, the participants do appear to be playing by their preferences, and hence we have a positive answer to Question 3.
We applied the methodology explained in Section 3.2.5 in order to learn the parameters and subsequently the decision trees. In our case, the depth is from 1 to 5 and the minimal number of samples at each leaf is between 1 and 13.
Given our argument graph, 22 pairs of types (and thus decision trees) out of 28 possible pairs occur. Note, we have 28 possible pairs because there are 8 types of concern, and hence, 8 × 7 / 2 = 28 unordered pairs.
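The count of possible pairs can be checked directly; the type names below follow the abbreviations used in Table 4.

```python
# C(8, 2) = 28 unordered pairs of the 8 types of concern.
from itertools import combinations

types = ["Time", "Comfort", "Health", "Fitness", "Safety",
         "Environment", "Personal Economy", "City Economy"]
pairs = list(combinations(types, 2))
print(len(pairs))  # -> 28
```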

Example of a decision tree for the Time/Comfort pair where the categories for the occupation are given in Appendix E Question 3 and “E” stands for “Extraversion” in the OCEAN model.
In order to study the usefulness of this method, we compared it with a dummy classifier (as named by the Scikit-learn library). We tested three different types of dummy classifiers: “uniform”, “most frequent” and “stratified”. The difference lies in how they predict the classes for the testing data given the class distribution of the learning data: the “uniform” dummy predicts each class with equal probability, the “most frequent” dummy always predicts the most frequent class of the learning data, and the “stratified” dummy predicts classes randomly according to the class distribution of the learning data.
In our case the “stratified” dummy was more efficient than the other two. We thus used it as our comparison baseline.
Our decision tree method yielded an average difference (as defined by Algorithm 2) of 0.18 (the lower the better) on a 10-fold cross-validation for each pair of types. The stratified dummy yielded an average result of 0.37, with a difference statistically significant with
Finally, we consider the results concerning Question 4 (

Initial and posterior belief for the baseline.

Initial and posterior belief for our method.
Results for the automated persuasion systems. We report our results using the following seven criteria: the ratio of participants with a positive (resp. negative) change in the belief; the average change over all the participants (avg. change); the average change over the participants who did change (avg. change w/o 0); the ratio of participants going from a negative (resp. positive) belief to a positive (resp. negative) belief; and the ratio of participants who took the time to provide further arguments (which can be loosely related to engagement)
Figures 5 and 6 show the belief before and after the participants engaged in the dialogue with respectively the baseline chatbot and the preference-based one. As we can see, the majority of participants do not change their belief. However, although most people also already believe our goal argument, it is easy to see that more people change from negative belief to positive belief and fewer from positive to negative when using our preference-based method instead of the baseline method.
All the results are summarized in Table 3. The participants and the systems exchanged 4.2 messages (by both persuader and persuadee) on average during the dialogues. To test the significance of the results, we put the participants into three groups: those who experienced a positive change in the belief; those who experienced a negative change; and those who experienced no change. We used a Chi-square test with the results of our preference-based method as the expected values and the results of the baseline as the observed ones. Note that, as the numbers of participants differed by one between our method and the baseline method, we repeated the test three times, incrementing each of the three groups in turn. The difference between our method and the baseline is statistically significant with a maximum
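The significance test described above can be sketched with SciPy. The group counts below are hypothetical, not the study's data; they only illustrate the shape of the test.

```python
# Sketch of the Chi-square test: compare the baseline's
# positive / negative / no-change group counts (observed) against
# those of the preference-based method (expected). Counts invented.
from scipy.stats import chisquare

observed = [10, 12, 27]   # baseline: positive, negative, no change
expected = [17, 5, 27]    # preference-based method, used as expected

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(round(stat, 2), round(p, 4))
```

Note that `chisquare` requires the observed and expected counts to have the same total, which motivates the incrementing procedure described above when the group sizes differ by one.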
We now return to the aims of the empirical study that we presented in Section 3.1 and consider how the results have answered the questions raised.
So the empirical studies have provided a positive answer to each of the questions. However, further experiments are required to make these claims with more confidence. For instance, different domains, and different experimental set-ups should be investigated in order to get a deeper understanding of the role of concerns in persuasion.
As well as answering the above questions, our empirical study demonstrates that our methods can be applied to the implementation of an automated persuasion system in the “Cycling in the city” domain, and that they appear transferable to other domains.
Literature review
Since the original proposals for formalizing dialogical argumentation [38,53], most approaches focus on protocols (
Some strategies have focussed on correctness with respect to argumentation undertaken directly with the knowledgebase, in other words, whether the argument graph constructed from the knowledgebase yields the same acceptable arguments as those from the dialogue (
In [12], a planning system is used by the persuader to optimize choice of arguments based on belief in premises, and in [13], an automated planning approach is used for persuasion that accounts for the uncertainty of the proponent’s model of the opponent by finding strategies that have a certain probability of guaranteed success no matter which arguments the opponent chooses to assert. Alternatively, heuristic techniques can be used to search the space of possible dialogues [59]. Persuasion strategies can also be based on convincing participants according to what arguments they accept given their view of the structure of an argument graph [60]. As well as trying to maximize the chances that a dialogue is won according to some dialectical criterion, a strategy can aim to minimize the number of moves made [4].
There are some proposals for strategies using probability theory to, for instance, select a move based on what an agent believes the other is aware of [70], or, to approximately predict the argument an opponent might put forward based on data about the moves made by the opponent in previous dialogues [32]. Using the constellations approach to probabilistic argumentation, a decision-theoretic lottery can be constructed for each possible move [51]. Other works represent the problem as a probabilistic finite state machine with a restricted protocol [42], and generalize it to POMDPs when there is uncertainty on the internal state of the opponent [33].
In value-based argumentation, the values of the audience are taken into account. Often a value is considered as a moral or ethical principle that is promoted by an argument, though as considered by Atkinson and Bench-Capon [3], the notion of a value promoted by an argument can be more diverse and used to capture general goals of an agent. A value-based argumentation framework (VAF) extends an abstract argumentation framework by assigning a value to each argument and, for each type of audience, a preference relation over values. This preference relation can then be used to give a preference ordering over arguments [3,5,9–11]. The preference ordering is used to ignore an attack relationship when the attackee is more preferred than the attacker, for that member of the audience. This means the extensions obtained can vary according to who the audience is. VAFs have been used in a dialogical setting to make strategic choices of move [8]. Whilst the notion of a concern seems to be similar to the notion of a value, the role of concerns is different to that of values. In our approach, we do not use concerns to change the structure of an argument graph but rather use concerns directly to select arguments to present to a user.
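The VAF mechanism of ignoring attacks can be sketched as follows; the argument names, values, and the audience's preference below are invented for illustration.

```python
# Minimal sketch of VAF attack filtering: an attack a -> b is removed
# for an audience when the audience prefers b's value to a's value.
def defeats(attacks, value, audience_prefers):
    """Keep attack (a, b) unless value[b] is preferred to value[a]."""
    return [(a, b) for a, b in attacks
            if (value[b], value[a]) not in audience_prefers]

attacks = [("A", "B"), ("B", "C")]
value = {"A": "economy", "B": "health", "C": "economy"}
audience = {("health", "economy")}   # this audience ranks health first

print(defeats(attacks, value, audience))  # -> [('B', 'C')]
```

Here the attack on "B" is ignored because this audience prefers health to economy, so different audiences yield different surviving attacks and hence different extensions.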
Whilst there may be some overlap in the notion of a value assignment and of a concern assignment, there do appear to be differences between them. In particular, values appear to capture ethical or moral dimensions that are promoted by an argument, whereas we view concerns as being issues raised or addressed by an argument. In addition, the way they are used is different. For VAFs, it is about trying to determine what is acceptable to an audience using a static presentation of an argument graph, whereas for our approach, it is about trying to determine the best moves to make in a dynamic setting during a dialogue to get the best outcome.
More recently, an alternative notion of values has been proposed for labelling arguments that have been obtained by crowdsourcing. In this alternative notion, a value is a category of motivation that is important in the life of the agent (
The notion of interests as arising in negotiation is also related to concerns. In psychological studies of negotiation, it has been shown that it is advantageous for a participant to determine which goals of the other participants are fixed and which are flexible [25]. In [69], this idea was developed into an argument-based approach to negotiation where meta-information about each agent’s underlying goals can help improve the negotiation process. Argumentation has been used in another approach to co-operative problem solving where intentions are exchanged between agents as part of dialogue involving both persuasion and negotiation [21]. Even though the notions of interests and intentions are used in a different way to the way we use the notion of concerns in this paper, it would be worthwhile investigating the relationship between these concepts in future work.
The empirical approach taken in this paper is part of a trend in the field of computational argumentation for studies with participants. This includes studies that evaluate the accuracy of dialectical semantics of abstract argumentation for predicting behaviour of participants in evaluating arguments [17,68], studies comparing a confrontational approach to argumentation with argumentation based on appeal to friends, appeal to group, or appeal to fun [74,75], studies of appropriateness of probabilistic argumentation for modelling aspects of human argumentation [62], studies to investigate physiological responses of argumentation [76], studies using reinforcement learning for persuasion [40], and studies of the use of predictive models of an opponent in argumentation to make strategic choices of move by the proponent [71]. There have also been studies in psycholinguistics to investigate the effect of argumentation style on persuasiveness [52].
Finally, there are a number of studies that indicate the potential for dialogical argumentation in behaviour change applications including dialogue games for health promotion [16,28–30], embodied conversational agents for encouraging exercise [61], and tailored assistive living systems for encouraging exercise [31].
Discussion
In this paper, we proposed a pipeline of methods for acquiring and using types of concern, and preferences over them, to select moves in a dialogue. In addition, we have presented empirical studies to validate the pipeline including showing that people do agree on the concerns that can be associated with arguments, that people do have preferences over types of concern, and that people do tend to play by their preferences. We have also demonstrated that we can train classifiers to predict the preferences over types of concern for individuals. Finally, we have shown that we can use preferences over concerns to improve the persuasiveness of a dialogue.
The aim of our dialogues in this paper is to raise belief in goal arguments. A goal argument may reflect an intention but this is not mandatory. We focus on beliefs in arguments because belief is an important aspect of the persuasiveness of an argument (see for example [47]). Furthermore, beliefs can be measured more easily than intentions in crowdsourced surveys.
Preferences over types of concern are an important aspect of persuadee modelling that we have investigated in this paper. We did not assume that the preferences are transitive, and we found that preferences given by participants are often not entirely coherent. This may reflect that participants may find it difficult to remember at each point in a survey what answers they have given previously. It may also reflect that participants change their minds about their preferences as the survey progresses. So we may regard some of the preferences given by participants as noise. Despite this, our investigations in this paper indicate that it is beneficial to harness this preference information for user modelling.
Another topic for future work is the specification of the protocol. Many protocols for dialogical argumentation involve a depth-first approach (e.g., [8]). So when one agent presents an argument, the other agent may provide a counterargument, and then the first agent may provide a counter-counterargument. In this way, a depth-first search of the argument graph is undertaken. With the aim of having more natural dialogues, we used a breadth-first approach. So when a user selects arguments from the menu, the system may then attack more than one of the arguments selected. For the argument graph we used in our study, this appeared to work well. However, for larger argument graphs, a breadth-first approach could also be unnatural. This then raises the questions of how to specify a protocol that interleaves depth-first and breadth-first approaches, and of how to undertake studies with participants to evaluate such protocols.
In future work, we plan to undertake empirical studies with the same research questions but a different domain. The aim would be to see whether we can get similar results in a new domain. In the final part of the empirical study reported in this paper, we had the majority of participants having a starting position of agreement with the stance that we were arguing for. In future studies, we will seek topics for discussion where fewer of the participants start by agreeing with the stance that we are arguing for. This may enable us to demonstrate a bigger effect. We also intend to combine the approach reported here of using concerns of the participant with the beliefs of the participant (as we reported in [35]) and undertake empirical studies to see if we can further improve the degree of persuasion.
Finally, a better understanding of concerns, and the potential impacts that arguments may have on them, may facilitate a better understanding of the nature of persuasion dialogues. Consider two agents in a dialogue. If one of them presents an argument
Consider the concern type “Personal Economy” and the domain “Cycling in the city”. The following are arguments of type “Personal Economy”. Assuming this type means that the agent has the concern that she wants to economize on personal expenditure, then the argument A3 has a positive impact on the concern “Personal Economy” and the counterargument A4 has a negative impact on the concern “Personal Economy”. (A3) Cycling saves money on public transport. (A4) Bikes are expensive.
In future work, we will investigate the patterns arising for types of concern, including whether there are advantages or disadvantages to maintaining or switching the type of concern during a dialogue.
Footnotes
Acknowledgements
This research was funded by EPSRC Project EP/N008294/1 “Framework for Computational Persuasion”. The authors are grateful to Lisa Chalaguine and Sylwia Polberg for valuable feedback on earlier versions of this paper. The authors are also grateful to the anonymous reviewers for numerous suggestions for improvements to the paper.
List of the arguments
People cycle 8 miles on average to go to work.
Pollution is everyone’s problem.
I live further than 8 miles away.
I am not fit enough because of my weight.
Many companies now have secure bike parking for the employees not to waste time on looking for one.
My home is too small to store a bike.
You can have work clothes in a bag and wear cycling clothes.
I lack the stamina for cycling far.
Even if cycling does take more time, it is a more relaxing way to travel if you avoid busy roads.
Cycling in a city is unhealthy because of the exposure to the pollution.
Cycling saves time because there is no need to find a car parking space.
The city can build cycle-only lanes to remove cyclists from the traffic.
It is unfair everybody has to pay the infrastructures with taxes for the use by only some.
Cycling is good for your heart and muscles.
It is a hassle having to change clothes.
Plenty of shops sell inexpensive cycling clothes and equipment.
I cannot physically cycle.
Cycling improves your fitness while you are commuting or doing errands.
Many companies have shower facilities.
Cycle-only lanes will decrease shopping because of hassle to drivers.
Time is wasted looking for a secure place to lock the bike.
Traffic jams encourage people to cycle.
Buying cycle clothing is too expensive.
My workplace is too far away from my home.
Cycling in the city is dangerous for the cyclists.
But many cyclists don’t know how to ride in traffic.
Cars kill more pedestrians than bikes.
Building cycle lanes is too expensive for the city.
Studies show that cyclists are subject to lower pollution levels than drivers.
Wearing bright cycling clothes makes cycling much safer.
The city can arrange free cycle courses to improve road safety.
Changing clothes does not remove the need to shower.
Cycling in the city is dangerous for the pedestrians.
More cyclists means less drivers and less pollution.
Cycling is cheaper for the citizens than driving or public transportation.
Studies show that cyclists come to the shops more frequently than drivers and so are good customers.
Cycling improves weight loss.
You are at risk of rain and your clothes can get wet.
Cars can carry more shopping than bikes and so contribute more to the local economy.
Work clothes are uncomfortable for cycling.
Cycle lanes create traffic jams.
Creating bike parking means less car parking is available and that can affect the local economy.
It takes more time to travel by bike.
Evidence shows that infrastructures for cyclists favour the local economy generating more taxes for the city to use.
Changing clothes is just a habit that you do quickly at the start and end of the working day.
Each car parking space can accommodate 6 bikes which means more people can be catered for and that can boost the local economy.
Many cyclists are also shoppers.
Cycling is an easy way to get fit.
I do not like to cycle.
Argument graph
See Fig. 7.
Association of types of concern to arguments
Table 4 shows the results regarding the associations. The first column is the short argument name as specified in Appendix A. The types in bold are the types kept for the following steps of the experiments. To select them, we kept all the types having strictly more than half of the votes of the most voted type. For instance, if the most voted type gathered 8 votes over 18, all types with strictly more than 4 votes were kept in the association. Results for the argument/types association experiment
Arguments and their types (votes out of 18; “—” marks a type whose name is missing):

8miles: — 8, — 7, Env. 3
pollution: — 14, Health 4
further: — 11, Fitness 2, P. Eco. 2, Comfort 2, Env. 1
weight: — 13, Health 5
parking: — 12, Safety 3, Comfort 2, C. Eco. 1
home: — 11, Comfort 5, Env. 2
clothes: — 15, P. Eco. 3
stamina: — 10, — 6, Comfort 2
relaxing: — 6, — 5, — 5, Env. 1, P. Eco. 1
unhealthy: — 14, Env. 4
savestime: — 17, P. Eco. 1
lanes: — 11, — 7
taxes: — 10, — 8
dislike: — 12, P. Eco. 4, Health 1, Fitness 1
lesspollution: — 11, Env. 5, C. Eco. 2
bright: — 17, Health 1
courses: — 15, C. Eco. 3
fit: — 16, Health 2
shoppers: — 12, P. Eco. 6
shower: — 12, Health 3, P. Eco. 2, Time 1
footdangerous: — 14, C. Eco. 2, Env. 2
6bikes: — 15, P. Eco. 1, Comfort 1, Env. 1
lessdrivers: — 15, Health 3
cheaper: — 15, C. Eco. 3
shopmore: — 13, P. Eco. 4, Env. 1
weightloss: — 11, Health 7
rain: — 13, Health 3, Env. 2
carrymore: — 12, P. Eco. 3, Comfort 3
uncomfortable: — 17, P. Eco. 1
trafficjams: — 7, — 4, — 4, Safety 2, Comfort 1
lessparking: — 12, P. Eco. 2, Comfort 1
habit: — 10, — 7, P. Eco. 1
heart: — 11, — 7
hassle: — 9, — 7, P. Eco. 2
unexpensive: — 15, C. Eco. 2, Comfort 1
cannot: — 7, — 6, Comfort 2, P. Eco. 2, Safety 1
fitness: — 14, Health 4
facilities: — 10, P. Eco. 4, Time 1, Health 1, C. Eco. 1, Fitness 1
lessshopping: — 11, Env. 3, P. Eco. 1, Safety 1, Health 1, Comfort 1
wastedtime: — 18
encourage: — 11, Env. 3, C. Eco. 2, Safety 1, Health 1
expensive: — 18
toofar: — 11, Fitness 3, Comfort 2, Env. 1, P. Eco. 1
bikedangerous: — 18
dontknow: — 15, Time 1, Health 1, C. Eco. 1
killmore: — 15, Health 2, C. Eco. 1
expensivelanes: — 17, Env. 1
moretime: — 18
infrastructures: — 18
Argument/types association survey
We broke the list of arguments into 5 different steps so that it would look less daunting to the participants. Note that the arguments inside each step were presented in a random order to each participant. However, the split of the list amongst the steps was the same.
Pairwise preferences survey
“Preferences on argumentation topics” “We first ask you some questions in order to create a profile. Please note that everything is anonymous and no information can be used to personally identify you. Moreover you can choose to quit this survey at any point in time.” “Please enter your Prolific ID.” “Please select your age in the list.” “18–24”, “25–34”, “35–44”, “45–54”, “55–64”, “65+”. The results from this question are given in Table 5.
Age distribution “Please select the category of your occupation.” “Non working” “Students” “Semi-skilled and unskilled manual workers” “Skilled manual workers” “Supervisory or clerical and junior managerial, administrative or professional” “Intermediate managerial, administrative or professional” “Higher managerial, administrative or professional” “Other”
Occupation repartition “Please select your sex.” “Female”, “Male”, “Other” The results from this question are given in Table 7.
Gender distribution “Please select your number of children who are below 18 years old.” “0”, “1”, “2”, “3+” The results from this question are given in Table 8.
Number of children “Please select your home location.” The results from this question are given in Table 9.
“Town with 100,000+ inhab.” “Suburban area of a town with 100,000+ inhab.” “Town with more than 10,000 inhab.” “Smaller town” “Countryside” Idem but for the workplace instead of the home. The results from this question are given in Table 9.
Home and workplace location “Please enter an estimation of the distance in miles. For information, a mile is approximately 1.6 km.” “I cycle from my home to my workplace.” “I experience a condition that is making it hard for me to move/walk/cycle.” We asked for the Ten-Item Personality Inventory (TIPI) in the exact same way it is described in the original work [27]. “For each pair of topics, please state if you think that the left/right hand side topic is a more important priority for you, if both are equal or if you cannot compare them.”
“I prefer the first” “I prefer the second” “They are equal to me” “I cannot compare them”
Age group   18–24   25–34   35–44   45–54   55–64   65+
Proportion  10.2%   47.0%   14.3%   20.4%   6.1%    2.0%
The results from this question are given in Table 6.
Category    1      2      3      4     5      6      7     8
Proportion  10.2%  16.3%  12.2%  4.1%  26.5%  22.5%  4.1%  4.1%
Category    Female  Male   Other
Proportion  69.4%   30.6%  0%
Category    0      1      2      3+
Proportion  61.2%  20.4%  10.2%  8.2%
Category   1      2      3      4      5
Home       36.7%  22.5%  24.5%  12.2%  4.1%
Workplace  47.0%  20.4%  18.3%  14.3%  0%
Dialogue with a chatbot
“Cycling in the city” “You are about to take part in a survey concerning cycling from home to work. This is completely anonymous and you can decide to stop your participation at any point in time. In addition, there is no right or wrong answer. We are only interested in what you think.” “Please enter your Prolific ID.” “Please select your sex from the list.” “Female”, “Male”, “Other” The results from this question are given in Table 10.
Gender distribution We asked for the Ten-Item Personality Inventory (TIPI) in the exact same way it is described in the original work [27]. “What is your position towards cycling from home to work?” The participant could select a value between “What is your position on the government investing in initiatives and infrastructures for cyclists?” The participant could select a value between “Please read and interact with the chat box below. You will not be able to type in your answer. However, you will be able to click on messages when asked to do so.” This is the part of the study where the dialogue is undertaken with the participant. Each dialogue is a sequence of arguments, and menus of counterarguments, that are presented to a participant, and the participant is asked to select counterarguments from each menu (as illustrated in Fig. 1). “What is your position towards cycling from home to work?” The participant could select a value between “What is your position on the government investing in initiatives and infrastructures for cyclists?” The participant could select a value between “Do you have any additional argument concerning cycling in the city?”
Category    Female  Male   Other
Proportion  64.6%   35.4%  0%
