Abstract
Embodied Conversational Agents (ECAs) can increase user engagement and involvement and can strengthen the effect on health outcomes of an intervention that is provided via an ECA. However, evidence regarding the effectiveness of ECAs on health outcomes is still limited. In this article, we report on a study that aims to identify the effect of a match between a health topic and the ECA’s appearance on ratings of personality characteristics, persuasiveness and intention to use. The study comprised an online experiment with three different ECAs and three different health topics, conducted among 732 older adults. We triangulated the quantitative results with qualitative insights from a focus group. The results reveal that older adults prefer an ECA whose appearance matches a given health topic, and that such a match results in higher ratings on persuasiveness and intention to use. Personality characteristics should be measured embedded within a health topic, but are not rated higher because of a match. We furthermore provide guidelines for designing the content of the ECA.
Introduction
eHealth is in a transition phase, whereby text- or video-based communication is gradually being replaced by more engaging means of interaction. Examples of such means are online videos and games, and also Embodied Conversational Agents (ECAs). ECAs have been defined as “more or less autonomous and intelligent software entities with an embodiment used to communicate with the user”. 1 Most often, they take the form of virtual persons or cartoon characters on websites, accompanied by a scripted chat functionality. Compared to traditional text, video or audio communication, ECAs provide a form of simulated human-to-human interaction that aims to mimic natural dialogue (including interactive scripts and body language).
Previously, ECAs have been implemented within the health context for a variety of purposes, such as health education, 2 mental disorders, 3 or as a decision aid for shared decision making in healthcare. 4 Using an ECA to convey information, to offer an intervention, or to support decision making has been found to have various effects. A review by Provoost and colleagues 5 found that ECAs can increase user engagement and involvement, and can strengthen the effect of an offline intervention that is provided alongside the ECA. At the same time, they conclude that the evidence regarding the effectiveness of ECAs is still limited, as it is an emerging field and most studies have focused on usability and user acceptance. A similar conclusion was drawn in a review by Kramer et al., 6 who found that using ECAs leads to higher usability and use of a digital intervention, but that the effect of using ECAs on health outcomes is still unclear. Often, motivation to change increased, but health behavior and health literacy did not. A prerequisite for use and effect of an ECA within any context, including health, is a well-developed and engaging design. Multiple studies have concluded that the design of an ECA should foster an emotional bond with the end-user.7–9 Most of the studies aimed at generating design guidelines for ECAs in the health context have focused on the design of speech or textual output, ECA gaze and facial expressions, and body gestures, while the design of the ECA’s appearance seems to be somewhat neglected.6,10
The influence of appearance
As supported by research with humans, an ECA’s appearance is essential in influencing users’ motivation, attitude and future behaviors. For example, people prefer for a health-care provider to wear a white coat or a professional dress, 11 for a physiotherapist to wear a tailored dress, 12 and for female therapists to wear casual attire. 13 These preferences are found to communicate expertise and authority.12,14 Hence, they have an important influence on patient trust levels, establish confidence, influence the perception of empathy, and increase the likelihood that patients will comply with care instructions.14,15 This effect goes beyond attire; it has long been known that, relative to a physically unattractive counselor, an attractive counselor is perceived more favorably with regard to her competence, professionalism, assertiveness, interest, and relaxation, and her ability to help with problems. 16
Just as humans are influenced by the appearance of other humans, they are also influenced by the appearance of ECAs. Studies on the appearance of ECAs conclude that different appearances lead to different outcomes in terms of motivation and behavior change, 17 regardless of the underlying technical system. Baylor and colleagues showed, for example, that an attractive and ‘cool’ agent leads to higher levels of motivation among youngsters, because it most closely reflected themselves. 18 Furthermore, a recent study by ter Stal and colleagues 19 found that an ECA’s appearance affects the users’ perception of authority, whereby older male agents were seen as more authoritative than young female agents. In another study, ter Stal and colleagues showed that the agent’s role (e.g. a peer or expert) also affects the perception of the agent’s characteristics, such as trust and friendliness, as well as the likelihood of following the agent’s advice. 20 Thus, adapting the appearance not only influences first impressions, but also future interactions between the user and ECA.
Persuasiveness and personality
In general, the goal for an eHealth intervention is to persuade end-users into using the intervention and complying with the desired behavior (e.g., losing weight or monitoring their health status). By taking into account persuasive strategies or functionalities in the design of an eHealth intervention, one can increase end-user adherence. 21 An overview of these design principles has been created by Oinas-Kukkonen and Harjumaa 22 and includes features such as rewards, third-party endorsement, and instilling authority. However, implementing these features (either with or without an ECA) should not be seen as the optimal solution for creating a successful eHealth intervention. Rather, the design should be tailored towards the target group and the health-related behavior goal.23,24
Tailoring the design of an ECA can be done by adapting the dialogue script, its body language, or its appearance, all of which are part of an ECA’s personality. This leads to the question of which personality traits should ideally be incorporated in an ECA. Existing studies on ECAs measure user satisfaction via characteristics such as liking, trust and friendliness.25,26 In a previous study, we set up multiple co-design sessions with community-dwelling older adults in which we discussed which personality traits they would prefer for a health ECA. We found that the most valued traits among this group were friendliness, warmth, trustworthiness, concern and competence. 27 These traits, with the exception of ‘warmth’, were also identified as most important among older adults in a card-sorting task study by ter Stal et al. 20 These findings indicate that giving an ECA an appearance that conveys the right personality traits for a specific health-related behavioral goal might increase its persuasiveness, and hence its appreciation and effect.
Research objectives
In this article, we report on an experiment with three different ECAs and three different health topics. The aim is to identify the effect of a match between a health topic and the ECA’s appearance on ratings of personality characteristics, persuasiveness and intention to use. The ultimate goal is to create guidelines for ECA design that ensure high persuasiveness and intention to use. To this end, we conducted an online experiment among older adults and triangulated the quantitative results with qualitative insights from a focus group. Our main research question is: How does the match between a health topic addressed by an ECA and the ECA’s appearance affect end-users’ evaluation of the ECA’s personality characteristics and persuasiveness?
To guide this study, we formulated three hypotheses, based on the literature discussed above:
1. When an ECA is embedded in a health topic, it is rated higher on positive personality traits (friendliness, warmth, trustworthiness, concern, and competence).
2. Older adults prefer an ECA that has an appearance matching a health topic (cooking, food, and loneliness).
3. An ECA that has an appearance matching a health topic is rated higher on positive personality traits, persuasiveness and intention to use.
Method
Method online experiment
Participants
Participants were recruited via a Dutch research panel of the National Foundation for the Elderly, consisting of approximately 1350, mainly community-dwelling, older adults. Participants received an email asking whether they were willing to participate in the online questionnaire. The only inclusion criterion was that participants should be fluent in the Dutch language. In addition, community-dwelling older adults who had participated in a previous study of the same project were invited via a newsletter to complete the questionnaire. 27 Since four participants had previously indicated a preference to receive documents by post, they received a paper version of the questionnaire.
Stimuli
Based on a previous co-creation study with community-dwelling older adults in the Netherlands, 27 we created three different ECAs, with different names and personas (see Figure 1). The first ECA represents a female peer (Ellen), the second ECA a chef (Herman) and the third ECA a fantasy figure (Bo).
Figure 1. The three different ECAs (Ellen, Herman and Bo).
In addition, and also based on the co-creation study, we created three storyboards (see Appendix 1). Each storyboard addresses a different health topic, and is based on a different behavior change technique (BCT). The first context is ‘Cooking’, and includes a recipe book with the BCT tailoring, with the aim to improve eating behavior. The second context is ‘Food,’ consisting of a food diary via which users self-monitor their eating behavior. The third and last context is ‘Loneliness,’ and consists of a bundle of audio-fragments from other older adults about social activities they performed, based on the BCT social learning, with the aim to decrease loneliness.
Procedure
The online survey tool Qualtrics was used for the questionnaire. After providing written informed consent and completing the questions on socio-demographics, participants were randomized: a quarter of the participants were randomly assigned to questionnaire A, and three quarters to questionnaire B (see Figure 2).
Figure 2. Flow of the questionnaire. Note: E = Ellen, H = Herman, B = Bo. F = Food, C = Cooking, L = Loneliness.
In questionnaire A, participants were asked to rate the appearance of the three different ECAs on five personality characteristics. They only viewed the image of the ECA (similar to Figure 1). Next, participants were asked to indicate the importance of the five characteristics. Last, participants viewed three different storyboards without an ECA (see Appendix 1), and were asked which of the three ECAs would be able to help them best.
In questionnaire B, participants were further randomized into one of three groups. Each group of participants viewed three storyboards, each with a different ECA addressing a different health topic (thus, there were nine combinations in total). Participants were then asked to rate the ECA on five personality characteristics, and to assess its persuasiveness and their intention to use it.
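The two-stage allocation described above can be sketched as follows. This is a minimal illustration, not the configuration used in Qualtrics: the function names are hypothetical, the 1:3 split is modelled as an independent draw per participant, and the Latin-square pairing of ECAs and topics within each group is an assumption, since the text only states that the three groups together cover nine combinations.

```python
import random

ECAS = ["Ellen", "Herman", "Bo"]
TOPICS = ["Food", "Cooking", "Loneliness"]


def build_groups(ecas, topics):
    """Return one list of (ECA, topic) pairs per group.

    Assumption: pairs are rotated as a Latin square, so each group sees
    every ECA and every topic once, and the groups together cover all
    nine combinations.
    """
    n = len(ecas)
    return [
        [(ecas[i], topics[(i + shift) % n]) for i in range(n)]
        for shift in range(n)
    ]


def assign_participant(rng):
    """Draw a questionnaire (A: 1/4, B: 3/4) and, for B, one of three groups."""
    if rng.random() < 0.25:
        return ("A", None)
    return ("B", rng.randrange(3))


GROUPS = build_groups(ECAS, TOPICS)

if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed, only to make the sketch repeatable
    questionnaire, group = assign_participant(rng)
    print(questionnaire, GROUPS[group] if group is not None else "ECA images only")
```

A rotation of this kind keeps the design balanced: no participant rates the same ECA or the same topic twice, yet every ECA is paired with every topic somewhere in the sample.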
Measurements
All participants were asked questions about their socio-demographics, including: age, gender, retirement (y/n), highest finished education, living situation (with partner, without partner, with someone else), whether they had home-cooked dinners (in order to control for community-dwelling, y/n) and chronic diseases. In addition, we also measured health literacy, using the Three Brief Health Literacy Screeners. 28
For questionnaire A, participants reviewed the three ECAs without a health topic, or only the health topics. The following data were collected:
• Ratings of the personality characteristics per ECA: friendliness, warmth, trustworthiness, concern and competence. These five characteristics were based on the previous co-creation study. Participants indicated their agreement on a 5-point Likert scale ranging from “1 = Strongly disagree” to “5 = Strongly agree”.
• Ratings of importance of the five personality characteristics. The following statement was provided: “In general, I think it is important for a coach to show the following characteristics”. The question was answered on the same 5-point Likert scale.
• Preferred ECA per health topic, after seeing only the health topic. The following question was asked: “Which virtual coach would be able to help you best?” The answer options consisted of an image of each of the three ECAs.
For questionnaire B, participants reviewed three different ECAs, addressing three different health topics. The following data were collected:
• Ratings of the personality characteristics per ECA. The questions were answered on the same 5-point Likert scale.
• Ratings on persuasiveness and intention to comply per ECA. This was measured by a validated perceived persuasiveness scale, adapted from Drozd et al. 29 The scale consists of four questions: i) “The system would influence me.”; ii) “The system would be convincing.”; iii) “The system would be personally relevant for me.”; iv) “The system would make me reconsider my (eating) behavior.” The questions were answered on the same 5-point Likert scale.
• Intention to use, measured via the question “I would like to use this program”, and answered on the same 5-point Likert scale.
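As an illustration of how the four persuasiveness items could be combined into a single score per ECA, the sketch below averages them. The averaging rule, the item keys and the function name are assumptions made for this example; the text only specifies the four items and the 5-point scale.

```python
from statistics import mean

# Hypothetical short keys for the four items of the perceived persuasiveness scale.
ITEMS = ("influence", "convincing", "relevant", "reconsider")


def persuasiveness_score(responses):
    """Average the four Likert items (1-5); averaging is an assumed scoring rule."""
    values = [responses[item] for item in ITEMS]
    for item, value in zip(ITEMS, values):
        if value not in (1, 2, 3, 4, 5):
            raise ValueError(f"{item}: expected a rating from 1 to 5, got {value!r}")
    return mean(values)


# Illustrative ratings, not data from the study.
example = {"influence": 4, "convincing": 5, "relevant": 3, "reconsider": 4}
print(persuasiveness_score(example))  # mean of 4, 5, 3, 4
```

Averaging keeps the composite on the same 1 to 5 range as the single intention-to-use item, which makes the two outcomes easier to read side by side.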
Data analysis
Data were analyzed using SPSS/WIN 25.0 software (IBM Corp., Armonk, NY, USA). Data are presented as mean ± standard deviation (SD) for normally distributed continuous variables, and frequencies and percentages for categorical variables. For H1, we first controlled whether all personality characteristics were deemed important (
Method focus group
Participants
We invited all 13 participants from our previous study 27 to participate in this focus group. Invitations were sent by email or post, based on the preference of the participant. In total, six participants signed up and joined the focus group.
Study design
In the focus group we discussed the findings and improved the appearance of the ECAs. The session lasted 4 hours, including a lunch and a one-hour break. The aim of the analysis was to provide deeper insights into the match between the ECA and the health topic, the appreciation of ECAs in terms of positive personality traits, and the persuasiveness of ECAs. More specifically:
• How should the appearance of the ECAs be improved?
• How should the personality of the ECAs be designed?
Procedure
As a result of the online questionnaire, two ECAs were selected for the three different health topics. Participants were picked up at home, and we (LK and StS) met them at the headquarters of the National Foundation for the Elderly. After a short introduction, participants were asked to provide written informed consent. The focus group consisted of three assignments.
During the first assignment, the designer (StS) showed an image of the first ECA, and LK provided a recap of the cooking topic. Participants were asked why they thought this particular ECA was preferred for this health topic. Next, we asked how participants thought the design of the ECA could be improved. When a participant mentioned, for example, the haircut, this was adjusted on the spot and refined until participants were satisfied. Beforehand, we had prepared a set of design characteristics (age, skin color, BMI, outfit, hair, eye color, accessory, other) and asked for participants’ opinions on any characteristic that had not yet been mentioned. For the second assignment we asked participants to write a background story for the ECA and share it afterwards. Next, we repeated the first and second assignments for the other ECA.
During the last assignment, we invited participants in groups of three to write a short dialogue between an ECA and a user. We provided them with post-it notes, and discussed a short general example. They were instructed to write an opening, explain something about the health topics, and write an ending. This was repeated for the other ECA. Afterwards, we discussed the dialogues in a plenary session and created a list of conditions for the dialogues.
Data analysis
Worksheets were scanned. Audio recordings of the sessions were transcribed verbatim by an independent agency and reviewed by the research team for accuracy by comparing the audio recordings with the written transcripts. Pseudonyms were developed for each participant to maintain confidentiality. All data were uploaded into the qualitative data analysis software ATLAS.ti 8. Analysis was guided by a thematic analysis approach, and combined a deductive and inductive approach. 31 One researcher (LK) created a first list of codes based on the script for the session. Both LK and StS then coded the transcripts independently, and added extra codes if needed. Differences in codes and coded fragments were discussed, leading to a final and agreed-upon codebook and coded transcript.
Results
Questionnaire
Characteristics of participants
The questionnaire was filled in by 732 study participants. Five participants only filled in the demographics and were therefore excluded. In total, 724 participants completed the questionnaire online, and three on paper. The mean age of the 727 study participants was 72.7 ± 8.11 years, with 83.8% being retired. Women accounted for 62.6% of participants. Half of the participants lived alone (50.5%); the others lived with their partner (47.7%) or with someone else (1.8%). Almost all participants had home-cooked dinners (91.6%). In total, 52.5% had completed high school or some associate degree, and 40.5% had completed college or university. The mean health literacy score was 6.51 ± 1.68 out of 12.
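The reported counts and percentages can be cross-checked with a few lines of arithmetic. The sketch below uses only figures from this paragraph; the variable names are ours.

```python
import math

filled_in = 732          # participants who filled in the questionnaire
demographics_only = 5    # excluded: completed only the demographics
online, paper = 724, 3   # completed questionnaires by mode

analysed = filled_in - demographics_only
living_situation = {"alone": 50.5, "with partner": 47.7, "with someone else": 1.8}

print(analysed)                                              # 727
print(analysed == online + paper)                            # True
print(math.isclose(sum(living_situation.values()), 100.0))   # True
```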
Personality characteristics of the embodied conversational agents
Rating of the importance and personality characteristics per ECA.
Note: Means that do not share a superscript differ significantly.
Average rating of personality characteristics without and with health topic.
Embodied conversational agents and health topic
Preferred ECA per health topic.
1A superscript indicates a match.
Effect of a match
Difference between match and other on personality, persuasiveness and intention to use.
Focus group
Link to previous study and characteristics of participants
Based on the online experiment, Ellen was chosen as the ECA for the health contexts Food and Loneliness. Herman was chosen as the ECA for the health context Cooking. The mean age of the six study participants was 83.5 ± 7.71 years. Figure 3 shows the setting of the focus group. According to participants in the focus group, Ellen was chosen for the Food and Loneliness topics because of her competence. Participants also assumed that she knew something about food, and that she was easy to approach. Herman was chosen for cooking because he ‘had studied for it’. Participants also assumed he was familiar with food for older adults.
Figure 3. Setting of the focus group.
Improving the design
In general, participants were quite positive about Ellen. They agreed she had a ‘nice, open and friendly face’. A first suggestion was to broaden her neck and make her eyebrows bigger. Also, her shadow should be less visible. There was an extensive discussion about her glasses: some participants thought they were too ‘fussy’ or ‘too serious’, while others thought they looked nice. In the end we agreed to change the color of the frame, so that it was less conspicuous. Figure 4 displays Ellen before and after the focus group.
Figure 4. Ellen before and after the focus group.
Participants thought Herman had the right age. One participant explained this at length as follows: “I choose a young cook who just left the cooking school, or where does he come from. Know the last things, be aware of the possibilities that are available (…). An old cook does not know all that exactly, and a young man does.”
Participants further agreed that the moustache was a nice addition but should be smaller. His shirt should indeed be white, without any stains, and he should wear a traditional chef’s hat. Figure 5 displays Herman before and after the focus group.
Figure 5. Herman before and after the focus group.
Personality
Although the background stories written for Ellen were different from each other, the general line was rather similar. Ellen is a loving person with some life experience, has a family and multiple children. She is easy to feel at home with. An example of a story is the following: “Ellen, a familiar appearance, loves people, a people person, has great colleagues, does this work from her heart, enjoys the work but not too many hours. Because otherwise she gets upset and you get another Ellen. Has a good home base, time for her family, a sweet husband. She is a good listener with empathy. Does she have a nice hobby, nice family, dear husband, I already said that-”
Herman is a nice and fun person, who loves to eat nice food, and is a bit overweight. He is always busy with work. Multiple participants also wrote that he played soccer in his free time and that he, again, is competent. “My cook should be a pleasant person. He likes good food, but together. Because he often stands alone in the kitchen and then has to taste everything. It makes him fat. So that actually belongs to a cook. The advantage is also that he is married so that his projects can be criticized. How would the cook be at home? Does he cook there?”
Requirements and example quote.
Discussion
This study investigated whether a match between the health topic and an ECA was associated with a more positive evaluation on five personality traits, persuasiveness and intention to use. The results reveal that older adults prefer an ECA that has an appearance matching a certain health topic, resulting in higher ratings on persuasiveness and intention to use. Positive personality traits are not rated higher as a result of a match. However, we found that it is important to measure these traits embedded in a health topic in order to obtain a realistic rating for when the ECA is used in practice. In a focus group we explored how both the design and the personality of the ECAs could be further improved. Multiple specific design changes were made, and we developed two background stories and a list of requirements for the tone of voice in the dialogue script.
Match with health topics
Our results show that a match between the appearance of an ECA and its health topic is preferred by end-users over a non-match. These findings are in line with a large body of literature that uncovered the effects of appropriate appearances by healthcare professionals. For example, Hatfield and colleagues 32 concluded that a standardized uniform increases perceptions of professionalism and recognition among patients. Their conclusion may also explain our findings. However, we also found that a match between an ECA and a health topic did not positively affect the appreciation of the ECA’s personality. It was expected that, by matching appearance with a health topic, the positive aspects of stereotypes could be used, thereby boosting the end-user’s judgment of personality traits that are typically favored for a specific stereotype. For the case of an informative website about cancer screening, a match between health information design and stereotypes was shown to lead to increased message credibility and informativeness and to positively affect the attitude towards cancer screening. 33 The authors explained these effects via the social judgment theory. 34 The theory posits that when people are confronted with new (sources of) information, they relate this to their current knowledge or attitudes (in this case, stereotypes). If their previous conceptions are confirmed, their appreciation of this new source is higher, and vice versa. So, when the appearance of an ECA conforms to the end-users’ initial, stereotypical conceptions, given the health topic, the persuasiveness of the ECA will increase, but not necessarily the appreciation of its personality. Tests such as the implicit association test might provide further understanding of these mechanisms.35,36
Embodied conversational agents personality
In line with previous studies,25–27 our results confirm that the traits friendliness, warmth, trustworthiness, concern, and competence are all deemed important for an ECA. However, when only provided with an image of an ECA, people rate the ECAs differently on these traits. Without considering a health topic, we found that people rated the personality traits of a peer higher than those of a cook, and the personality traits of a cook higher than those of a fantasy figure. An earlier review of ECAs in the health context already showed that fantasy figures are not often used, 6 and that human agents are generally preferred over cartoon-like agents.6,10 When a health context is provided, and thus dialogue is added in the form of a mock-up, the personality traits of all three ECAs are rated higher, while the order of preference remains the same. This finding that people rate personality traits higher when provided with dialogue can be explained by the fact that personality is reflected in dialogue. Kampman et al. 37 proposed a neural-network-based fusion method, and showed that personality traits of ECAs are best predicted when audio, language and appearance are combined. This indicates that it is indeed important to consider the tone of voice when writing dialogue for a specific ECA. In our results we provided various requirements, and argued for including disclosure. Revealing information makes people likable to others, we disclose more to those we like, and we like others we have disclosed to. 38 This general rule for humans has also been shown to hold for ECAs.6,20 Following the gold standard for designing ECAs for health, a background story and tone of voice are best created together with potential end-users in order to increase the chances of higher use and greater effect. 6
Embodied conversational agents persuasiveness
The ECAs were perceived as more persuasive with than without a health topic. Thus, they were perceived as more influential, convincing and relevant, and as making participants reconsider their behavior more. Furthermore, the intention to actually use the ECA also increased when the health topic was shown. However, one should aim to increase the persuasiveness of an ECA, rather than simply measure it. We asked end-users for their preferences regarding the appearance, and thereby aimed to tailor the design of the ECA further to the needs and wishes of community-dwelling older adults. The overview of design principles created by Oinas-Kukkonen and Harjumaa 22 was not intended to be used as a checklist, but rather as a basis for selecting the principles most important for the system at hand. For ECAs in the health context, these seem to include tailoring and similarity, in the form of offering relevant suggestions and a background story similar to that of a potential user. With regard to the personality of the ECA, both trustworthiness and expertise are important design features. Last, we showed that the social role should be matched to the health context. Incorporating these design features increases the chances of adherence and maximizes the chances of actual health behavior change.
Limitations of the study
Our results show the preferences of Dutch community-dwelling older adults regarding the appearance and personality of various ECAs in the health context. Among other things, this includes the preference for a humanoid ECA over a fantasy figure. However, earlier studies show that certain preferences are clearly context dependent, and differ from person to person. Hence, our findings are not simply generalizable to curative interventions, where the focus might be on other personality traits or design features. Furthermore, it is known that older adults have other preferences regarding ECAs compared to youngsters. 20 Thus, researchers working outside the context of health ECAs for older adults should always consider tailoring their ECA to their specific target group. We hope to have offered a method to do so. It should be noted, however, that our aim was to uncover design guidelines. While these guidelines are needed to develop interventions that best match a certain target group, the possibility and added value of personalizing an intervention could be considered in later stages of development.
Another limitation which should be taken into account is the setting of the study. During the questionnaire, we asked participants to rate the ECAs after an initial and single interaction. These results might be different if participants were exposed to the ECA for a longer period of time. Hence, one should ideally retake these questionnaires in the evaluation process of the ECA to make sure the suggested design guidelines still remain valid. In addition, researchers should consider the influence of possible sensory issues during the development and evaluation.
Concluding remarks
In this article, we have uncovered design guidelines for developing ECAs within the health context, with a particular focus on a match between the ECA and the health topic, ECA personality, and ECA persuasiveness. Since more engaging means of communication are rapidly taking over the text-based information which was favored in healthcare for so long, it is an important task for the human-computer interaction community to develop guidelines that can aid the visual and dialogue design of this modality. Our efforts have completed a part of the puzzle. It is now up to future studies to develop further guidelines and complete it.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by ZonMw (40-44300-98-110).
