Abstract
Chatbots are increasingly applied in various contexts including helping professions, such as organizational and life coaching. Coaching facilitates individual wellness, behavioral change, and goal attainment in a reflective, non-directive manner, and is considered one of the fastest-growing professions. The use of knowledge-imparting service chatbots has been studied; however, the application of chatbots in coaching has received scant research attention, raising the question of which factors and moderating effects play a role in the adoption of reflective, non-directive coaching chatbots. In this study, we applied a modified Unified Theory of Acceptance and Use of Technology (UTAUT) model to determine factors and moderating effects of age and gender that influence the adoption of a goal-attainment coaching chatbot. Partial least squares structural equation modeling (PLS-SEM) was used for the analysis of a cross-sectional UTAUT survey (n = 226). Performance expectancy, social influence, and facilitating conditions had significant roles as direct determinants of intent to use the coaching chatbot. Gender moderated performance expectancy, and age showed a moderation tendency on effort expectancy. This study on non-directive, reflective chatbots in the organizational and life coaching domains contributes to our understanding of how to design chatbots aimed at helping people find their own answers.
Introduction
Chatbots (conversational agents) are a category of computer software programs that engage in human conversation through auditory, textual, or mixed methods (Hatwar et al., 2016). The use of chatbots has increased substantially in recent years, with one estimate suggesting over 300,000 active chatbots in 2018 on the Facebook Messenger platform alone (Chaves & Gerosa, 2019). Chatbots are making an impact on users in a variety of settings, including education, entertainment, retail, and the helping professions, such as physical health, mental health, and behavioral change (Følstad & Brandtzæg, 2017; Gnewuch et al., 2017; Kamphorst, 2017; Siddique & Chow, 2021; Xu et al., 2021). A relatively new application of chatbots is in the domain of organizational and life coaching, a fast-growing multi-billion dollar per year industry (Athanasopoulou & Dopson, 2018). Coaching is defined as “a human development process that involves structured, focused interaction and the use of appropriate strategies, tools and techniques to promote desirable and sustainable change for the benefit of the client and potentially for other stakeholders” (Bachkirova et al., 2014, p. 1). Although coaching has numerous benefits (see, e.g., Blackman et al., 2016), one of the hallmarks of coaching that sets it apart from other helping professions is its ability to assist with goal attainment (Grant, 2012). Coaching is also typically non-directive and reflective, which implies that limited knowledge or information is conveyed by the coach (Passmore & Fillery-Travis, 2011).
Chatbots are widely used, yet empirical research on their efficacy, design considerations, and adoption is limited or scattered across various domains (Feine et al., 2019; ter Stal et al., 2020), despite recent attempts to create chatbot design frameworks (see, e.g., Terblanche, 2020; ter Stal et al., 2020). Moreover, users interacting with chatbots often report high failure rates and levels of scepticism, suggesting that more research is needed on how to design and introduce chatbots to users (Araujo, 2018). This observation is in line with the notion that chatbot research is currently inclined toward technical aspects with only a few studies focussing on the social dimensions, such as technology adoption (Kuberkar & Singhal, 2020). Furthermore, there is very little research on chatbots’ acceptability and motivation for use in the helping professions in general (Nadarzynski et al., 2019). Given that the application of chatbots in coaching is more recent than in other domains, and that coaching is typically a reflective, non-directive practice, this problem is exacerbated. These gaps in knowledge raise the question this research asks: What are the factors and the moderating effects of age and gender that influence the intention to use a non-directive, reflective goal-attainment coaching chatbot?
To address this question, a goal-attainment coaching chatbot named Vici was created based on goal-attainment theory (Grant, 2012) and the GROW coaching model (incorporating the four steps: Goal; Reality; Options; Will; Grant, 2011). Deployed on the Facebook Messenger platform, Vici has text-based, reflective, non-directive conversations with users, helping them to identify and set realistic goals, and specify action plans to incrementally reach their goals. Vici also helps users to track progress through a series of follow-up conversations. Vici differs from typical chatbots in that it does not convey information or content like service-oriented chatbots, but creates a reflective space for users to define their own goals and action plans. In this sense, Vici is similar to one of the first ever chatbots, Eliza, created in the 1960s. Eliza imitated a human Rogerian psychotherapist by responding to human input with non-directional questions (Natale, 2019). In this study, Vici was used to investigate factors and the moderating effects of age and gender that contribute to the intention to use a goal-attainment coaching chatbot through the lens of the Unified Theory of Acceptance and Use of Technology (UTAUT) model using the PLS-SEM methodology. Insights into these adoption factors and moderators could guide designers of coaching chatbots to incorporate features that are unique to coaching and similar domains where non-directive chatbots are employed.
Theoretical Frameworks
Factors that determine the adoption of technology have been studied extensively. A number of models are used for investigating technology uptake, the most widely used of which is the Technology Acceptance Model (TAM), which considers the roles of perceived usefulness, perceived ease of use, and attitude on the intention to use specific technology (Davis et al., 1989). TAM has been researched extensively and has proven to be reliable and robust in predicting user acceptance of technology and systems in the context of organizational-related studies (Chismar & Wiley-Patton, 2003). Numerous extensions and competing models have emerged through the years that led to the Unified Theory of Acceptance and Use of Technology (UTAUT) model, which attempted to integrate the fragmented theory and research on individual acceptance of information technology (Venkatesh et al., 2003). The model performed significantly better than the eight models from which it was derived, accounting for 70% of the variance in usage intention (Venkatesh et al., 2003). The UTAUT model consists of four constructs: performance expectancy, effort expectancy, social influence, and facilitating conditions. Despite its wide acceptance, more than 75% of studies that employed the UTAUT model included other constructs not in the original UTAUT model, to take into consideration the unique aspects of the specific context under investigation (Dwivedi et al., 2017).
Researchers have applied, extended, and integrated UTAUT to study technology acceptance and usage intention by individuals across diverse settings. Venkatesh et al. (2003) encouraged researchers to explore and test the theory in different contexts. Examples of application include social robot acceptance (de Graaf et al., 2019), user adoption of mobile technologies (Park et al., 2007), the use of a virtual classroom in support of teaching and learning (Aditya & Permadi, 2018), course management software (Liu et al., 2007), artificial intelligence (AI) and the internet of things (Kessler & Martin, 2017), mobile commerce (Min et al., 2008), and mobile banking (M. Y. Wu et al., 2012).
A number of technology adoption studies involving chatbots have been conducted recently. Rodríguez Cardona et al. (2019) investigated the socio-technical factors that influence the adoption and diffusion of chatbots in the insurance sector in Germany using a mixed-method approach. They identified “relative advantages” and “IS infrastructure” as the most significant adoption factors, with practitioners and potential customers indicating that they prefer human support in addition to a chatbot when complex insurance decisions are made. Kuberkar and Singhal (2020) used a modified UTAUT to assess the factors that influence the intention to use an AI-powered chatbot for public transport services within a Smart City (n = 463). Using structural equation modeling (SEM), they found that performance expectancy, effort expectancy, social influence, facilitating conditions, anthropomorphism, and trust directly affect the intention of using the chatbot. In a study in the retail industry, Kasilingam (2020) used partial least squares (PLS) SEM (n = 350) to analyze users’ attitude toward and intention to use a smartphone chatbot for shopping. The results revealed that attitude toward chatbots was considerably influenced by perceived usefulness, perceived ease of use, perceived enjoyment, price consciousness, perceived risk, and personal innovativeness. Intention to use was directly influenced only by trust, personal innovativeness, and attitude. In addition, considerable differences were found for moderation of age, gender, and prior experience using shopping chatbots.
In yet another application, Melián-González et al. (2021) investigated the factors predicting the intentions to use chatbots for travel and tourism. Using PLS techniques (n = 476), they found that expected performance, habit, hedonic component, predisposition to using self-service technologies, social influences, and the fact that the chatbot behaves like a human all positively predicted usage intention. Inconvenience and chatbot communication had a negative influence on usage intention. As a final example, Kim et al. (2019) studied the factors that influence the behavioral intention of using a chatbot banking service in the financial sector using the UTAUT model (n = 225). They found that performance expectancy, effort expectancy, social influence, trust of information, and trust of security were significantly related to the intention to use the chatbot service.
It is evident that UTAUT has been used with various technologies and in various contexts including chatbots, and can thus be considered a valid and reliable theory to measure and understand the influence of technology on user adoption behavior. The extended UTAUT2, which includes three additional constructs (hedonic motivation, habit, and price/value; Arenas Gaitán et al., 2015), was also considered but not selected due to the relative novelty of the coaching chatbot. The concern was that users would not have sufficient experience and knowledge about this type of chatbot to answer questions on habit and price at this stage. UTAUT was therefore selected for the present study to investigate the adoption factors of a non-directive, goal-attainment chatbot in the helping profession domain of coaching. It must be noted that most chatbot adoption studies we found relate to service-oriented chatbots where the bot imparts information or knowledge of some sort. To the best of our knowledge, there is no adoption research on non-directive, reflective chatbots, the focus of the present study.
Hypotheses
In the present study we used the standard UTAUT model and added a fifth construct (perceived risk—discussed later). The five UTAUT constructs and subsequent hypotheses are illustrated in Figure 1 and discussed next.

Conceptual model and hypotheses.
Performance Expectancy
Performance expectancy is the degree to which an individual believes that using the system will help him or her to attain gains in performance (Venkatesh et al., 2003, p. 447). Performance expectancy has been shown to positively affect technology adoption in contexts such as social media (Borrero et al., 2014), digital voice assistants (Wagner et al., 2019), mobile services (Shan & Lu, 2009), mobile banking (Yu, 2012), and also with service-oriented chatbots (Kasilingam, 2020; Kim et al., 2019; Kuberkar & Singhal, 2020; Melián-González et al., 2021).
Goal attainment is a focal area of coaching that leads to increased performance, progress, and achieving results (Grant, 2012). The purpose of the chatbot in the present study was specifically to facilitate goal attainment. The performance expectancy construct measures perceived performance gain and since this is the primary objective of the chatbot we therefore hypothesize that:
H1: Performance expectancy significantly affects individual intention to use a goal-attainment coaching chatbot.
Effort Expectancy
Effort expectancy refers to the degree of ease associated with the use of the system (Venkatesh et al., 2003, p. 450). Davis et al. (1989) found that people are more likely to use an application if they perceive it as easy to use. In previous studies, a positive relationship between effort expectancy and behavioral intent was found in web-based learning (Chiu & Wang, 2008), mobile banking (Shan & Lu, 2009), health information technology (Kijsanayotin et al., 2009), as well as with service-oriented chatbots (Kuberkar & Singhal, 2020).
When using a service-oriented chatbot, users typically need a specific result such as placing an order or performing a transaction. Users may be willing to put in some degree of effort in return for an immediate, tangible result. With the chatbot in the present study, however, the outcomes are less tangible since setting a goal does not necessarily lead to immediate results (Grant, 2012). The amount of effort required to use this chatbot may therefore significantly affect the usage intention, given the lack of immediate satisfaction. Based on these arguments, we state the hypothesis:
H2: Effort expectancy significantly affects individual intention to use a goal-attainment coaching chatbot.
Social Influence
The third construct, social influence, is the degree to which an individual perceives that important others believe he or she should use the new system (Venkatesh et al., 2003, p. 451). This is linked to the concept of how a person perceives their image or status within a group (Moore & Benbasat, 2001). Other studies have found social influence to be a significant predictor of intention to use technology in mobile banking (Yu, 2012), smartphone applications (Tak & Panwar, 2017), as well as for service chatbots (Kim et al., 2019; Kuberkar & Singhal, 2020; Melián-González et al., 2021).
Goal attainment is an important, inherent human pursuit (Grant, 2012). People generally have an innate need to strive for and achieve goals. Goal theory states that people achieve goals at a higher rate when they are held accountable by other people and especially people close to them (Locke & Latham, 1990). This points to a considerable social dimension of goal attainment. The third hypothesis we therefore state is:
H3: Social influence significantly affects individual intention to use a goal-attainment coaching chatbot.
Facilitating Conditions
Facilitating conditions is the degree to which an individual believes that sufficient infrastructure is available to use the technology (Venkatesh et al., 2003, p. 456). This construct has been shown to be an important consideration in technology uptake in contexts, such as mobile banking where access to the Internet is a prerequisite for usage (Joshua & Koshy, 2011), educational systems (Nistor et al., 2013), wearable technologies (Guest et al., 2018), and service chatbots (Kuberkar & Singhal, 2020).
When working on goal attainment, many people struggle to progress when they face obstacles that are out of their control (Grant, 2012; Locke & Latham, 1990). Support such as the chatbot in the present study that aims to help people to overcome these obstacles should therefore not add additional barriers in terms of cumbersome or prohibitive infrastructure requirements. We therefore state the fourth hypothesis as:
H4: Facilitating conditions significantly affect individual intention to use a goal-attainment coaching chatbot.
Perceived Risk
We include perceived risk as a fifth and additional construct. Perceived risk is defined in this study as the degree to which the user of the goal-attainment coaching chatbot believes that he or she may be exposed to certain types of risks including breach of confidentiality and privacy and lack of trust (Featherman & Pavlou, 2003). Privacy and confidentiality have been shown to have a significant influence on perceived risk and consequently, behavioral intention to use technology (Featherman & Pavlou, 2003; Metzger, 2004; Shin, 2010; Zhang et al., 2012). Specifically, Følstad and Brandtzæg (2017) found that perceived security and privacy in a chatbot, as well as general risk perceptions influence users’ trust in a chatbot.
In human-to-human coaching, confidentiality and trust form the foundation of a strong coaching relationship, a known predictor of coaching success (Brennan & Wildflower, 2014; Terblanche & Heyns, 2020). The role of trust in facilitating disclosure may be particularly crucial in a type of coaching where chatbot-mediated communication replaces physical human contact. In an environment of reduced social cues, trust may be more difficult, yet more critical to establish than in interpersonal contexts (Boyd, 2003).
Given these considerations, we include the construct “perceived risk” in our UTAUT model and state the fifth hypothesis as:
H5: Perceived risk significantly affects individual intention to use a goal-attainment coaching chatbot.
Gender and Age as Potential Moderators
Few studies have tested the moderation effects of differences between individuals as specified in the original UTAUT model (Venkatesh et al., 2016). A key question within a coaching context is how diversity within groups can be developed as a productive asset, rather than becoming a source of conflict and prejudice (Christian et al., 2006, p. 459). Therefore, this study sought to understand the moderating effect of gender and age on the key determinants.
Gender differences have been found in various technology adoption contexts. Natarajan et al. (2017) found that in mobile commerce for shopping (n = 675), the effect of perceived enjoyment on perceived usefulness, and personal innovativeness on satisfaction was stronger for the female respondents. On the other hand, the effect of perceived ease of use on satisfaction was stronger for the male respondents. Wang et al. (2009) used UTAUT (n = 330) to investigate factors that influence the acceptance of mobile learning and found that gender differences moderate the effects of social influence and self-management in usage intention. Joshua and Koshy (2011) found that men are more likely to use mobile banking services than women, while it has also been shown that women generally favor ease of using a technology when making adoption decisions (Venkatesh & Morris, 2000).
In terms of age, Natarajan et al. (2017) reported that older people (>35 years) found mobile shopping applications useful and showed a higher intention to use them than younger people. Younger people (<35 years), however, found the technology easier to use. In the mobile banking context, Joshua and Koshy (2011) found that younger people are more likely to use mobile banking services, while Plaza et al. (2011) found age differences in mobile phone usage where older people preferred designs that suited their lifestyles. In the mobile learning context, Wang et al. (2009) found that age differences moderate the effects of effort expectancy and social influence on use intention.
Chatbot technology is new and few studies have investigated the moderating effects of age and gender on adoption factors. One example we found is a study by Kasilingam (2020) that showed that chatbots for shopping were considered less risky by male respondents than by female respondents, and that younger respondents believed shopping chatbots to be useful and enjoyable to use.
Given the evidence of age and gender moderation in UTAUT studies in various technological settings, including chatbots, and the person-centered nature of coaching, it is anticipated that various age groups and genders will be differently inclined to use this form of technology, especially because it is applied in a traditionally human-centered domain of coaching.
Given the anticipated moderating effect of gender and age, we therefore hypothesize that:
H6a to H6e: Gender moderates the relationships in H1 to H5
H7a to H7e: Age moderates the relationships in H1 to H5
Methods
This empirical study followed an exploratory quantitative research design that analyzed primary data (n = 226) obtained from a cross-sectional UTAUT survey. Participants completed the survey after having one goal-setting coaching conversation lasting, on average, 7 minutes with the chatbot Vici. The chatbot did not impart any content or knowledge, but instead helped participants to identify a realistic goal and commit to an action plan through reflective, non-directive interaction.
Chatbot Details
Vici is a custom-developed, text-based chatbot deployed on the Facebook Messenger platform. The chatbot was developed using the Designing AI Coach (DAIC) framework, which recommends merging aspects of a strong human coaching relationship with chatbot design best practices and using proven, evidence-based coaching theories as foundation (Terblanche, 2020). In line with these recommendations, Vici was designed to facilitate goal attainment according to goal theory (Locke & Latham, 2002). Vici had two types of text-based conversations with users. In the first conversation, Vici helped users to specify realistic goals by questioning them on the importance, feasibility, and impact of their stated goals. Vici then helped users to commit to achievable actions that would help them reach their goals. In the second type of conversation, users would check in with Vici to report on their goal and action progress, reflect on obstacles that prevented them from progressing, and change their action plans if necessary. These conversations assisted users to monitor the progress of their goals and actions. Vici also helped users to distinguish between proximal (<6 months) and distal (>6 months) goals (Locke & Latham, 1990). Vici was available 24/7 to the experimental group. Figure 2 provides examples of Vici’s conversations while Figure 3 shows the conversation flow.

Coach Vici chatbot conversation examples.

Coach Vici conversation flow.
Questionnaire Design
The UTAUT survey consisted of two parts: demographics (gender and age) and the UTAUT model constructs of performance expectancy, effort expectancy, social influence, facilitating conditions, perceived risk, and behavioral intent. The questionnaire items (see Table 1) were derived from previously validated research (Kijsanayotin et al., 2009; Venkatesh et al., 2003, 2016) and reworded for the context of this study (e.g., referring to “coaching chatbot”), similar to other studies that had modified the UTAUT instrument (Botero et al., 2018; Kijsanayotin et al., 2009; Yu, 2012). A 7-point Likert scale was used to align with other UTAUT studies (Venkatesh et al., 2016).
UTAUT Constructs and Survey Questions.
The questionnaire was kept simple and designed to be answered in a time-frame of 10 to 15 minutes. The initial survey was reviewed by two experts familiar with the UTAUT and coaching, after which it was pre-tested with a group of 16 participants. Based on feedback from both groups of reviewers, adjustments were made to the wording and sequencing to enhance clarity.
Survey and Sampling
Participants were sourced from the researchers’ international LinkedIn network as well as Amazon’s Mechanical Turk (MTurk) platform. MTurk is an online crowdsourcing platform designed to aid in recruiting people to complete various tasks that require human intelligence (Buhrmester et al., 2011). People who provide their services on the MTurk platform are from around the world, with the majority from the USA (50%) and India (30%; Burnham et al., 2018). The reliability of data collected from MTurk is not significantly different from the data collected by other means (McDuffie, 2019). Only after participants had completed one goal-setting conversation with Vici were they given a unique code and a link to the online UTAUT survey, to ensure that the chatbot conversation did in fact take place. A total of 226 valid survey responses were received. Informed consent was obtained from all participants before they participated in the research and the researchers’ institution ethically approved the study (ref number US 11144157).
When empirical data are collected using questionnaires, data collection issues have to be addressed after collection to identify missing data, suspicious response patterns (straight-lining or inconsistent answers), and outliers that could result in biases and inefficient use of data (Kock, 2018). Missing data occur when a respondent either purposely or inadvertently fails to answer one or more questions as instructed (Hair, 2017). The current research ensured validity and reliability by adopting the guidelines from Ringle et al. (2015). These authors suggest that if fewer than 5% of values are missing per indicator, mean value replacement rather than case-wise deletion should be used to treat the missing values. A further guideline required removing observations from the dataset where missing values exceeded 15% per indicator.
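The two missing-data guidelines described above can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `treat_missing`, the toy data, and the reading of the 15% rule as a per-respondent threshold are assumptions made for illustration only.

```python
from statistics import mean

def treat_missing(rows, max_row_missing=0.15, max_col_missing=0.05):
    """rows: one list of item scores per respondent; None marks a missing answer."""
    # Step 1 (assumed reading of the 15% rule): drop respondents who left
    # more than 15% of their answers blank.
    kept = [r for r in rows
            if sum(v is None for v in r) / len(r) <= max_row_missing]
    if not kept:
        return []
    result = [list(r) for r in kept]  # copy so the input is not modified
    # Step 2: mean replacement for indicators with fewer than 5% missing values.
    for j in range(len(kept[0])):
        column = [r[j] for r in kept]
        share_missing = sum(v is None for v in column) / len(column)
        if 0 < share_missing < max_col_missing:
            col_mean = mean(v for v in column if v is not None)
            for r in result:
                if r[j] is None:
                    r[j] = col_mean
    return result

# Hypothetical 7-item responses on a 7-point scale (not study data).
rows = [[4] * 7 for _ in range(28)]
rows.append([6, None, 6, 6, 6, 6, 6])        # 1 of 7 missing: kept, then imputed
rows.append([None, None, 5, 5, 5, 5, 5])     # 2 of 7 missing: dropped
cleaned = treat_missing(rows)
```

Indicators with 5% or more missing values are left untouched in this sketch, since the text does not state how such cases were handled.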
Data Analysis
Partial least squares structural equation modeling (PLS-SEM) was used to investigate the model as presented in Figure 1. Possible moderating effects of age were tested within the SEM in a two-fold manner: firstly, by including interaction effects and testing for significant interaction path coefficients, and secondly, by doing a multi-group analysis (MGA) where the sample was divided into two age groups (below and above the median age), and the path coefficient was compared between the two separately fitted SEM models. Gender moderation was tested using MGA. The SEM analyses were conducted using SmartPLS v3.3.2.
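The two preparatory steps for the age moderation tests (a median split for the MGA and a product term for the interaction effect) can be sketched as follows. This is a hedged illustration, not the SmartPLS procedure: the function names, the use of standardized scores for the product term, the exclusion of respondents exactly at the median, and the data are all assumptions.

```python
from statistics import mean, median, pstdev

def standardize(values):
    """Convert scores to z-scores (mean 0, population SD 1)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def median_split(ages):
    """Return the median and the respondent indices below and above it."""
    cut = median(ages)
    younger = [i for i, a in enumerate(ages) if a < cut]
    older = [i for i, a in enumerate(ages) if a > cut]
    return cut, younger, older

def interaction_term(predictor, moderator):
    """Product of standardized predictor and moderator scores."""
    zp, zm = standardize(predictor), standardize(moderator)
    return [p * m for p, m in zip(zp, zm)]

# Hypothetical ages and effort expectancy scores (not study data).
ages = [22, 25, 29, 31, 34, 38, 45, 52]
effort = [5.0, 6.0, 4.5, 5.5, 4.0, 3.5, 5.0, 3.0]

cut, younger, older = median_split(ages)
product = interaction_term(effort, ages)
```

In an MGA, the structural model would then be fitted separately to the `younger` and `older` subsamples and the resulting path coefficients compared.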
The moderating effects were further investigated (using a different approach) by doing “univariate” tests that considered only the three applicable variables for each moderation hypothesis. For gender, homogeneity-of-slopes analysis of covariance (ANCOVA) was used. For age, regression analysis was used by testing whether the inclusion of an interaction effect significantly improved the R2 of the regression. This was done by first fitting a regression with behavior intent as the dependent variable and age and the independent variable as predictors. This was followed by adding the age × independent variable interaction term. The F-to-remove test was done to determine whether the addition of the interaction effect significantly improved the R2 of the regression model. Findings of this univariate approach were compared with findings from the SEM model. These analyses were conducted using Statistica 13.5. A 5% significance level (p < .05) was used as guideline for rejecting hypotheses.
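The test of whether an added interaction term improves R2 follows the standard F statistic for nested regression models. The sketch below shows that textbook formula; the function name `f_change` and the R2 values are hypothetical and are not results from this study, and the exact computation performed by Statistica is not described in the text.

```python
def f_change(r2_reduced, r2_full, n, k_full, q=1):
    """F statistic for the R2 gain from adding q predictors.

    r2_reduced: R2 without the interaction term
    r2_full:    R2 with the interaction term
    n:          sample size
    k_full:     number of predictors in the full model
    """
    df_resid = n - k_full - 1
    f = ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / df_resid)
    return f, q, df_resid

# Hypothetical values: age + predictor (R2 = .30), then age x predictor added (R2 = .32).
f_value, df1, df2 = f_change(r2_reduced=0.30, r2_full=0.32, n=226, k_full=3)
```

The resulting F value would be compared against the F distribution with `df1` and `df2` degrees of freedom to obtain the p-value.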
Results
Descriptive Statistics
Participants comprised 133 males, 91 females, and 2 others, representing 58.8%, 40.3%, and 0.9% respectively. Participants’ age groups were ≤29 (98), 30 to 39 (68), 40 to 49 (42), 50 to 59 (14), and ≥60 (4). The age category “20 to 29” had the highest frequency, at 42.5%. The mean age was 35 with a standard deviation of 10 years.
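The reported shares can be checked directly from the counts given above. The short sketch below uses only those counts; the dictionary keys and the helper name `percentages` are illustrative.

```python
def percentages(counts):
    """Share of each category in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {k: round(100 * v / total, 1) for k, v in counts.items()}

# Counts as reported in the text.
gender_counts = {"male": 133, "female": 91, "other": 2}
age_counts = {"<=29": 98, "30-39": 68, "40-49": 42, "50-59": 14, ">=60": 4}

gender_pct = percentages(gender_counts)
age_pct = percentages(age_counts)
n_total = sum(age_counts.values())
```

Both breakdowns sum to the reported sample size of 226.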
Hypothesis Testing
To test the hypotheses and model as indicated in Figure 1, partial least squares structural equation modeling (PLS-SEM) was used. Table 2 shows the path coefficients and p-values while Table 3 presents the reliability indicators for five of the constructs.
Outer Loadings of the PLS-SEM Measurement Model.
Construct Reliability and Average Variance Extracted.
Numbers in brackets indicate results after items were removed.
Note that perceived risk was treated as a formative construct in the PLS-SEM model. In this case, the argument was that perceived risk was defined as a combination of unrelated risk elements which added up to a total risk score. The formative model was also restricted to equal weights for each risk element according to arguments put forward by Lee and Cadogan (2013), which meant that weights were not estimated in the SEM model. Consequently, no reliability results are reported for perceived risk. The construct reliability of all the constructs was acceptable (>.7). For average variance extracted (AVE), two of the constructs had less than 50% of item variance explained. Inspection of item loadings showed that some items in these two constructs had low loadings. Removal of these items solved the AVE problem, and all subsequent item loadings were acceptable (see Table 3). A comparison of the original model with the items-removed model revealed little to no difference in findings. It was therefore decided not to tamper with the original model because the constructs are previously validated and published constructs.
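The situation described above, where construct reliability is acceptable while AVE falls below .5 because of a low-loading item, can be illustrated with the conventional formulas for composite reliability and AVE from standardized loadings. The loadings below are hypothetical and are not taken from Table 2; the use of composite reliability as the reliability measure is also an assumption.

```python
def composite_reliability(loadings):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + error)

def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical standardized outer loadings for one reflective construct.
loadings = [0.85, 0.75, 0.60, 0.35]
cr_before = composite_reliability(loadings)
ave_before = average_variance_extracted(loadings)

# Dropping the weakest item raises AVE above the .5 threshold.
trimmed = [l for l in loadings if l >= 0.5]
ave_after = average_variance_extracted(trimmed)
```

With these illustrative values, reliability exceeds .7 even though AVE is below .5 until the weakest item is removed, which mirrors the pattern reported for two of the constructs.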
Discriminant validity was assessed using the heterotrait-monotrait (HTMT) ratio (Henseler et al., 2015). The closer this ratio is to 1, the less the two latent variables being compared discriminate from each other. For the ratios, 95% confidence intervals were calculated, and an upper limit >1 was taken as an indication that discrimination did not hold. Two constructs did not pass this test, namely Facilitating Conditions and Effort Expectancy. From a theoretical basis, these two constructs are viewed as conceptually different, and for that reason no changes were made to the latent structure of the model. Further investigation of possible multicollinearity did not indicate that these two constructs posed a multicollinearity problem. The moderating effects of age and gender were investigated in separate analyses because gender is a categorical variable, which implied a different approach.
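A minimal sketch of the HTMT point estimate for a pair of constructs is given below: the mean correlation between items of different constructs divided by the geometric mean of the average within-construct item correlations. The item scores are hypothetical, the helper names are illustrative, and the bootstrapped confidence intervals used in the study are not reproduced here.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def htmt(items_a, items_b):
    """items_a, items_b: one list of respondent scores per item (>= 2 items each)."""
    hetero = mean(pearson(a, b) for a in items_a for b in items_b)
    mono_a = mean(pearson(p, q) for p, q in combinations(items_a, 2))
    mono_b = mean(pearson(p, q) for p, q in combinations(items_b, 2))
    return hetero / sqrt(mono_a * mono_b)

# Hypothetical item scores for six respondents, two items per construct.
construct_a = [[1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 5]]
construct_b = [[1, 3, 2, 5, 4, 6], [2, 3, 1, 4, 6, 5]]
ratio = htmt(construct_a, construct_b)
```

For these toy scores the ratio is roughly 0.86; resampling respondents with replacement and recomputing the ratio would give the kind of confidence interval described in the text.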
Table 4 (and Figure 4) show the path coefficients of the model fitted to investigate the age moderation effect. The paths from facilitating conditions, performance expectancy, and social influence to behavior intent all showed significantly positive coefficients (p < .01), therefore H1, H3, and H4 are supported. The p-value of perceived risk to behavior intent was close to significance (p = .07), but the path coefficient was small (−.07), and was not regarded as a trend. Age and effort expectancy were not significant. We therefore reject H2 and H5.
Path Coefficients of the PLS-SEM Model Investigating Age Moderating Effects.
Note. The interaction terms indicate the moderation coefficients.

PLS-SEM model with path coefficients and p-values (in brackets).
The only moderating effect that showed a trend was age between effort expectancy and behavior intent (p = .09), partially supporting H7b and thus rejecting H7a, H7c, H7d, and H7e. The path coefficient for this moderating effect (coefficient = −.12) indicates that as people get older, the relationship between effort expectancy and behavior intent tends toward becoming negative, whereas for younger people it tends toward a positive relationship. Further investigation using a multi-group analysis (MGA) approach showed that for respondents younger than the median age (32.5 years), the relationship was positive (path coefficient = .14), whereas for respondents older than 32.5 years, the relationship was negative (path coefficient = −.18). This change in the sign of the path coefficient is consistent with the non-significant relationship found in the full model (all respondents, path coefficient = −.03, p = .68).
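The direction of this moderation can be made concrete with a simple-slopes calculation, sketched below. It assumes that age is standardized and that the −.03 coefficient is the effort expectancy path in the same model as the −.12 interaction term; neither assumption is confirmed by the text, so the resulting slopes are illustrative only.

```python
def simple_slope(b_predictor, b_interaction, moderator_z):
    """Slope of the predictor at a given standardized moderator value."""
    return b_predictor + b_interaction * moderator_z

B_EFFORT = -0.03       # effort expectancy -> behavior intent (reported in the text)
B_INTERACTION = -0.12  # age x effort expectancy (reported in the text)

slope_younger = simple_slope(B_EFFORT, B_INTERACTION, -1.0)  # one SD below mean age
slope_older = simple_slope(B_EFFORT, B_INTERACTION, 1.0)     # one SD above mean age
```

Under these assumptions the slope is positive one standard deviation below the mean age and negative one standard deviation above it, the same sign pattern as the MGA coefficients of .14 and −.18.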
The moderating effects were further investigated by extracting the latent variable scores from the PLS-SEM and running univariate moderation tests on all the postulated moderating effects, that is, including only the three applicable variables in each test. These individual tests did not support the moderating effect described above. Interestingly, when only effort expectancy, age, and behavior intent (the latter as the dependent variable) were tested on their own, a significant positive relationship was found between effort expectancy and behavior intent, with no age interaction (moderating) effect. This contrasts with the findings of the SEM model, in which, with all the variables considered, the relationship between effort expectancy and behavior intent disappeared and the moderating age effect appeared. This points to possible mediating effects, which were not part of the model tested in Figure 1 and could be material for further investigation.
Gender moderation was investigated using MGA, and Table 5 summarizes these results.
Separate Path Coefficients for Males and Females With Tests for Gender Differences.
The only variable that showed a trend was performance expectancy (p = .06), with a stronger positive relationship for females (path coefficient = .68) than for males (path coefficient = .20), thus partially supporting H6a and rejecting H6b, H6c, H6d, and H6e. Further investigation was also done using univariate homogeneity-of-slopes ANCOVAs (see Table 6). These results confirmed the performance expectancy trend (ANCOVA interaction p = .10) and indicated a further significant interaction for perceived risk (p = .01). When the other variables in the model were not controlled for, there was a stronger positive relationship between perceived risk and behavior intent in males than in females.
Results of Homogeneity of Slopes ANCOVA Analyses.
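The logic of a homogeneity-of-slopes test for two groups can be sketched as follows. This is a minimal illustration, not the procedure or software used in the study: it fits a separate simple regression line per group and tests whether the two slopes differ, using the pooled residual variance. With two groups and separate intercepts, the square of this t statistic equals the F statistic of the group-by-covariate interaction. All function names and data are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Simple least-squares fit; returns slope, Sxx, and residual sum of squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return slope, sxx, sse


def slope_difference_t(x1, y1, x2, y2):
    """t statistic and degrees of freedom for H0: equal slopes in two groups."""
    b1, sxx1, sse1 = fit_line(x1, y1)
    b2, sxx2, sse2 = fit_line(x2, y2)
    df = len(x1) + len(x2) - 4  # two intercepts and two slopes are estimated
    mse = (sse1 + sse2) / df    # pooled residual variance
    se_diff = sqrt(mse * (1 / sxx1 + 1 / sxx2))
    return (b1 - b2) / se_diff, df


# Hypothetical latent variable scores for two groups (illustration only).
predictor = [1, 2, 3, 4, 5]
intent_group_a = [1.1, 2.9, 5.2, 6.8, 9.0]  # steep slope
intent_group_b = [2.0, 2.4, 3.1, 3.4, 4.1]  # shallow slope

t_stat, df = slope_difference_t(predictor, intent_group_a,
                                predictor, intent_group_b)
print(round(t_stat, 2), df)
```

The resulting t statistic would be compared against a t distribution with the returned degrees of freedom to obtain an interaction p-value of the kind reported in Table 6.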
Discussion
This study set out to investigate the use of chatbots in the helping profession of life and organizational coaching and specifically the factors that influence the intention to use a goal-attainment, reflective, non-directive coaching chatbot. It is important to understand adoption factors to allow chatbot designers in the field of coaching and other similar fields to create chatbots that meet the unique requirements of reflective, non-directive interventions, which may be different to the majority of current chatbots that are service oriented and knowledge imparting.
Of the five factors tested, performance expectancy, facilitating conditions, and social influence had a statistically significant effect on behavioral intent, whereas effort expectancy and perceived risk did not. In other words, participants’ intention to use the chatbot is a function of their perception that the chatbot could assist them in their goal attainment (performance expectancy), that others who are important to them believed they should use the chatbot (social influence) and that they felt they had sufficient infrastructure (facilitating conditions) to use the chatbot. Of the three predictors, performance expectancy was the most significant.
These findings are in line with those of M. Y. Wu et al. (2012) and Loo et al. (2011), who also found these three constructs to have a positive effect on intention to use technology, although not specifically in relation to chatbot adoption. When compared to the limited chatbot adoption research available, our findings are aligned with those of Kuberkar and Singhal (2020; public transport chatbot assistant) as well as Kim et al. (2019; financial services chatbot), who also found performance expectancy, social influence, and facilitating conditions to significantly influence behavioral intention. They also found performance expectancy to be the most significant contributor. Almahri et al. (2020; university students’ adoption of chatbots) and Melián-González et al. (2021; travel and tourism chatbot) also found that performance expectancy significantly predicts behavioral intent.
The significance of performance expectancy in these studies may point to the fact that, in general and regardless of context, users of service chatbots expect to gain answers to their queries and to find benefit from the effort it takes to interact with such a system. The goal-attainment chatbot used in the present study is different though, in that it does not provide answers per se, but helps users find their own answers in a reflective and non-directive coaching style (Passmore & Fillery-Travis, 2011). It is therefore significant that users still rate the value they expected to gain from the interaction as the most significant contributor to their intention to use the chatbot regardless of the level of directiveness of the chatbot.
The second significant contributor to behavioral intent in the present study (social influence) is also in line with other chatbot UTAUT assessments (Kim et al., 2019; Kuberkar & Singhal, 2020; Melián-González et al., 2021) that found social influence to be a significant factor. Social influence indicates that users cared about what their family, friends, and colleagues thought about using a coaching chatbot. This may point to the fact that chatbots are becoming popular and have positive impacts in the helping professions and that people are accepting their use.
The positive effect of facilitating conditions on behavioral intent in the present study was not reported in any other chatbot UTAUT study to our knowledge. This discrepancy may exist because participants felt that the system was readily available, that they had the requisite knowledge to use the chatbot, and that these aspects are important to their intention to use the chatbot. A possible explanation for this finding could be that the chatbot was easy to access, since participants were pre-screened to have access to the Facebook Messenger platform.
The findings therefore suggest that users were more willing to use Vici when they anticipated getting tangible value from the coaching intervention, when their friends, family, and co-workers were also using the chatbot for coaching, and if the necessary conditions exist for them to use the chatbot. These findings confirm the value-seeking and engaging relationship character requirement for chatbots within a coaching context (Grant, 2011, 2012).
Effort expectancy was not found to significantly affect intention to use the chatbot. This finding contradicts some other chatbot UTAUT studies (Kim et al., 2019; Kuberkar & Singhal, 2020) and suggests that participants in the present study did not consider the effort required to use the chatbot a significant contributor to their intention to use it. A possible explanation for this result is that Vici does not share any knowledge, content, or information, unlike the service chatbots in the other studies. Vici provides a reflective space where users define their own goals and action plans. This could mean that there is less effort needed and less frustration experienced, since the chatbot can never be “wrong.” This finding points to a potentially underutilized feature of chatbots as non-directive, reflective tools.
Perceived risk also did not significantly impact behavioral intent in the present study. This finding comes as somewhat of a surprise since coaching is considered a confidential, one-on-one intervention where trust has been shown to be a significant contributor to coaching success, hence its inclusion in the UTAUT model in the present study. A possible explanation is that participants may have felt that their goals are not as personal and confidential as, for example, banking and financial details, and they therefore did not mind sharing them. Ironically it could also imply that some people are more willing to share personal feelings and ideals with a chatbot than a human.
In terms of moderation, gender had a moderating tendency (p = .06) only on performance expectancy such that performance expectancy had a stronger influence on behavioral intent for females than for males. It is therefore more important for females than for males that the chatbot can deliver on their expectations. In terms of age there is a tendency (p = .09) for older people to consider the effort involved (effort expectancy) in using the chatbot as an inhibitor to their intention to use the chatbot. The age moderator finding could be linked to Kasilingam’s (2020) finding that younger people valued the usefulness and level of enjoyment of chatbots more than older people.
Implications and Future Research
This study contributes at the theoretical and practice levels in two domains: technology adoption and coaching research.
From a theoretical perspective in the technology adoption domain, we added a new construct (perceived risk) to the standard UTAUT model. Given the rising awareness of data privacy, the inclusion of this construct is seen as important in future studies where user data is involved. In terms of the theoretical contribution to the coaching domain, the finding that perceived risk, which includes trust, is not a significant contributor to adoption goes against our current understanding of the necessary ingredients of coaching success in a human-to-human context. Currently, the coaching relationship is considered one of the most important determinants of coaching success (de Haan et al., 2011), with trust being a key ingredient of a strong relationship (Grant, 2014). These findings, however, suggest that trust and risk are not considered important when being coached by a chatbot. We interpret this to suggest that blindly applying what works in human-to-human coaching to chatbot-to-human interaction may lead to sub-optimal chatbot design. Instead, chatbot designs should be based on empirical research that is sensitive to the usage context.
From a practice perspective, given that performance expectations play the strongest role in an individual’s intention to use a coaching chatbot, creators of coaching chatbots must base their designs on proven evidence-based theoretical models of coaching (Spence & Oades, 2011). To capitalize on the importance of social influence, early adopters or chatbot champions should be used to promote the value of using a coaching chatbot when coaching chatbots are marketed or deployed in organizations to assist employees. Given the importance of facilitating conditions, creators of coaching chatbots must also ensure that the chatbot is deployed on a platform that is readily available.
Coaching is typically expensive and reserved for senior managers especially in organizations. Using a chatbot coach could democratize coaching and provide some of the benefits typically associated with coaching to a much larger audience. This could be especially beneficial in developing countries with fewer financial resources.
Future research should take a longitudinal approach to measure adoption factors since goal attainment is not a once-off event. Users’ attitudes may change over time, depending on how satisfied they are with their level of goal attainment. By administering the UTAUT at multiple time points, a more accurate view of adoption factors could be obtained. As the use of coaching chatbots becomes more widespread, the extended UTAUT2, which includes measures for hedonic motivation, price/value, and habit, should also be considered in future adoption studies using Coach Vici. Future research should also consider coaching context. Since coaching can be applied in a personal (life coaching) or organizational setting, future research should source participants specific to these contexts to investigate potential contextual adoption differences. Finally, given that the univariate analysis yielded different results from the PLS-SEM analysis, more research is needed into potential moderating effects of the constructs used in the present study.
Conclusion
Chatbots are growing in popularity and are applied in increasingly diverse contexts. Most chatbots have a knowledge imparting, service oriented nature. In the helping professions such as life and organizational coaching, a different type of chatbot is used, one that is non-directive and reflective. Few studies have investigated adoption factors of chatbots in general, and to the best of our knowledge no such studies have been conducted on non-directive chatbots. To fill this knowledge gap, a chatbot that helps people set goals and monitor progress was created and evaluated using a modified UTAUT model. In addition to the standard UTAUT constructs, we added perceived risk. We also tested the moderation effects of age and gender. Analysis was performed using PLS-SEM.
The main findings are that performance expectancy, social influence, and facilitating conditions directly influence the intention to use the chatbot. In other words, users of a non-directive reflective chatbot value the results the chatbot can help them achieve, they value that others close to them also use the chatbot, and they rate as important the convenience of using the chatbot. They do not rate as important the effort it takes or the risks associated with using the chatbot. Gender and age had slight moderation tendencies. Females placed a higher premium on the performance of the chatbot, and for older people the effort involved in using the chatbot acted as an adoption inhibitor.
Performance expectancy and social influence are important adoption factors for both non-directive reflective chatbots and knowledge imparting service chatbots. The two types differ, however, in terms of effort and risk perception factors. Users do not expect an answer from a non-directive reflective chatbot, which likely reduces the perceived effort of interacting with it. They may also have an inherent trust in a chatbot with which they choose to share their personal goals, lowering risk perception for this type of chatbot. The findings presented here could be used as guidelines for designers of coaching chatbots to capture the unique requirements of non-directive, reflective chatbots.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
