Abstract

Objective

Deliberate practice (DP) for psychological therapists involves using objective, corrective feedback to identify and improve individualised skill deficits, alongside iterative practice opportunities. Automated, prognostic feedback on session contents could enhance personalisation of DP across therapy professions and modalities. This study assessed the feasibility, acceptability and initial clinical utility of a 10-week therapist intervention integrating automated feedback on predicted prognosis with DP of individualised therapeutic skill deficits.

Methods

Participants were 97 therapy clients seen by nine therapists in the 10 weeks preceding intervention and 79 clients seen by the same therapists during the 10-week intervention period. Participating therapists, representing diverse professional backgrounds, invited their clients to consent to have sessions recorded and automatically assessed for predicted prognosis. Prognostic feedback was integrated into DP training and practice, comprising a total of 32 h intervention. Assessments of intervention feasibility, acceptability, credibility, outcome expectancy and therapists’ therapeutic skills were taken alongside qualitative interviews at baseline, 5- and 10-week follow-up. Reliable improvement in depression and anxiety was compared between clients receiving therapy in the pre-intervention and intervention periods.

Results

Findings indicated significant pre-post improvements in intervention acceptability (d_RM = 2.25, p = .008), credibility (d_RM = 1.11, p = .039) and therapist skills (total score: d_RM = 2.57, p = .008), with non-significant improvement in feasibility (d_RM = 0.67, p = .268) and clinical outcomes, including a 3-percentage-point increase in the proportion of clients reporting reliable improvement for depression (from 58% to 61%) and a 10-percentage-point increase for anxiety (from 65% to 75%). However, client uptake of automated feedback was low due to concerns about artificial intelligence and related trust in the system.

Conclusion

Automated feedback and DP become more acceptable to therapists through engagement, with potential to improve therapeutic skills and effectiveness. However, addressing client concerns about how technology is used for automated feedback is essential to enhance participation.

Keywords

Deliberate practice, psychological therapy, therapist training, therapist supervision, machine learning

Psychological therapies for common mental health problems

Anxiety and depressive disorders are common and disabling conditions in the general population that are often recurrent with a chronic course.^1–3 Psychological therapies are effective, first-line treatments for a number of anxiety and depressive disorders.⁴ However, despite strong evidence of effectiveness, psychological therapies are unlikely to have met their full potential to improve health.⁵ Compared to treatments for many physical health conditions, the overall effects of psychological therapies have not grown across time and, in some cases, have declined.^6–8 Although the majority of people completing a course of psychological therapy report recovery or reliable improvement in mental wellbeing, 5–10% report deterioration in spite of therapy.⁹ Furthermore, approximately 20% of people attending psychological therapy drop out prematurely,¹⁰ making improvement much less likely.¹¹

A key cause for the lack of improvement in effectiveness over time is the absence of a systematic, objective, routine means of measuring the quality of psychological therapy contents.^12–14 As a result, it is harder to tell at any given time whether therapy is having the desired effect. Without such timely, content-oriented feedback, it is also harder for therapists to adjust treatment to remediate difficulties. Therapist training suffers too, because trainers may struggle to identify specific areas where trainees need to improve their practice to gain greater effectiveness.

Therapist-level differences in effectiveness

Potentially related to the lack of routine quality assessment in psychological therapy content, there are significant differences in clinical effectiveness between individual therapists and between different psychological therapy services.^15–17 Importantly, the differences between therapists become more pronounced when therapy is shorter and/or with patients who have more severe or complex difficulties.¹⁸ Furthermore, individual therapists themselves do not become more effective with time or experience, suggesting that current established support mechanisms are insufficient to promote improved effectiveness.¹⁹

Current research has focused on in-session interaction types as a key influence on therapist effectiveness and client prognosis in psychological therapy. The most effective therapists have proficient facilitative interpersonal skills, particularly in challenging therapeutic situations, compared to those achieving poorer outcomes.^20,21 The importance of interaction style and interaction type was reinforced using natural language processing and machine learning. In an analysis of text-messaging-based therapy for anxiety and depression, a machine learning model identified that specific therapist interaction types (e.g. therapeutic praise) predicted greater outcome improvement, whilst others predicted poorer prognosis (e.g. interactions unrelated to therapeutic activity).²² A series of recent systematic reviews and meta-analyses have also identified that specific in-session interactions predict poorer health outcomes in addictions and anxiety disorders.^23–25

Deliberate practice for therapists

A recently developed training and practice improvement method called deliberate practice (DP) shows promise as an approach that may help therapists enhance their therapeutic skills.^26–29 In psychological therapies, DP involves establishing baseline effectiveness and then combining objective feedback with iterative practice to address individualised skill deficits.³⁰ The two systematic reviews on DP in psychological therapies suggest that it can improve therapeutic skills more effectively than traditional training and supervision,³¹ but that most studies have not included all components of DP (including individualised learning objectives, external expert support, feedback and iterative repetition).³² Despite evidence that DP can help improve clinical effectiveness,³³ unanswered questions remain.

Current best practice recommends identifying therapist skill deficits through analysis of previous therapy episodes, using Routine Outcome Monitoring (ROM) to detect non-random therapist error patterns among clients with poorer outcomes.³⁴ Engaging in DP without external feedback relies on therapists’ intuitions to choose which skills to practise. However, therapists’ intuitions are often inaccurate and could lead to less skill development.^35,36 While methods like outcome analysis reduce this risk, they also involve rigorous processes that may delay access to DP for interested therapists.

Machine learning-enhanced DP

A potential route to balancing accurate focuses for practice with accessibility in DP is using machine learning on therapy contents to provide automated, prognostic feedback from what therapists say each session. Such systems have been deployed to evaluate sessional language and offer competency feedback, partly because much current DP literature focusses on specific therapy modalities.³⁷ However, scaling DP for broader use requires accommodating therapists from diverse professional backgrounds and supporting the delivery of multiple or integrative therapy models. Additionally, DP demands significant time and effort, but it is unclear what duration of practice is needed to achieve the benefits. Such evidence is crucial for estimating DP's clinical and cost-effectiveness.

The ongoing development of ambient scribe technology (automatic production of clinical records from recordings of health consultations) has, so far, primarily focused on the benefits of reduced administrative burden.³⁸ However, automated prognostic feedback from recordings of session contents could be integrated with ambient scribe technology and substantially enhance the clinical benefits realised.

Study objectives

The current study assessed the feasibility, acceptability and initial clinical utility of a transtheoretical DP intervention incorporating automated, sessional feedback on prognosis based on machine learning applied to in-session linguistic content (automated feedback and deliberate practice: FDP). FDP aims to enable rapid access to DP while retaining its data-driven focus on non-random errors.

Method

Design

This study applies Bowen et al.'s³⁹ theoretical framework for investigating the feasibility of evidence-based interventions. In particular, this approach recommends comprehensive, multilevel evaluation of feasibility and acceptability, especially with interventions that may have multiple effects. This is suitable for early-stage development of a staff training intervention, to enable adjustment and adaptation prior to more formal evaluation of efficacy.

The study employed a quasi-experimental,⁴⁰ mixed-methods⁴¹ feasibility design with repeated measures, examining changes over time in perceived feasibility, acceptability, therapeutic skills and clinical effectiveness (CONsolidated Standards Of Reporting Trials (CONSORT) checklist: Supplemental Table 1). Participants were psychological therapists from three National Health Service (NHS) services in England and their clients: one service for mild-to-moderate depression and anxiety, and two services offering psychological therapies for people with specific long-term conditions and comorbid moderate-to-severe anxiety or depression. Services were selected for their commitment to ROM and therapist development, alongside a geographic focus on the Midlands region, enabling participants to attend in-person activities.

Participants

Eligible therapists were qualified and registered with the UK Health and Care Professions Council or the British Association for Behavioural and Cognitive Psychotherapies and had been in post and treating clients in their respective services for at least 3 months. Eligible therapists also had to have managerial agreement to commit time to intervention activities, including consent to routinely record therapy sessions (audio or video) and be competent to give written informed consent. Therapists were excluded if they were currently engaged in a disciplinary or fitness to practice review, were working less than 50% whole time equivalent hours or were not routinely monitoring outcomes with all clients. These criteria were used by clinical leads to identify and invite eligible therapists. Participating therapists were offered the FDP intervention. Clients of participating therapists were eligible if their sessions were conducted in English, they were aged 18 or older, and they were competent to give informed consent. The study was conducted in the East Midlands region of England between March and July 2024.

Measures

A set of validated, standardised assessments was conducted at baseline and at 5- and 10-week follow-up to evaluate feasibility (Feasibility of Intervention Measure, FIM⁴²), acceptability (Theoretical Framework of Acceptability, TFA; Acceptability of Intervention Measure, AIM; Intervention Appropriateness Measure, IAM^42,43), credibility and outcome expectancy (Credibility/Expectancy Questionnaire, CEQ⁴⁴), transtheoretical therapeutic skills associated with clinical effectiveness (adapted Facilitative Interpersonal Skills task²⁰) and clinical effectiveness (Patient Health Questionnaire 9-items, PHQ-9; Generalised Anxiety Disorder 7-items, GAD-7^45,46) over time (Table 1, Figure 1). Therapist demographics were also collected at baseline. Standard assessment procedures were conducted for all assessments apart from the Facilitative Interpersonal Skills task, where standard rating processes were used on recently developed stimulus videos.

Figure 1.
Flowchart of study processes.

Table 1.
Study assessments.

| Assessment and citation | Acronym | Description | Timepoints |
|---|---|---|---|
| Acceptability of Intervention Measure (Weiner et al., 2017)⁴² | AIM | Assesses perceived palatability, approval and satisfaction with a given intervention | Baseline, 5 weeks, 10 weeks |
| Credibility/Expectancy Questionnaire (Devilly & Borkovec, 2000)⁴⁴ | CEQ | Assesses perceived intervention credibility and outcome expectancy | Baseline, 5 weeks, 10 weeks |
| Intervention Appropriateness Measure (Weiner et al., 2017)⁴² | IAM | Assesses perceived intervention appropriateness and suitability in a given setting | Baseline, 5 weeks, 10 weeks |
| Theoretical Framework of Acceptability questionnaire (Sekhon et al., 2022)⁴³ | TFA | Assesses seven components of intervention acceptability: affective attitude, burden, perceived effectiveness, ethicality, intervention coherence, opportunity costs and self-efficacy | Baseline, 5 weeks, 10 weeks |
| Feasibility of Intervention Measure (Weiner et al., 2017)⁴² | FIM | Assesses key indicators of successful intervention implementation | Baseline, 5 weeks, 10 weeks |
| Facilitative Interpersonal Skills task adapted (Anderson et al., 2009)²⁰ | FIS | Assesses interpersonal therapeutic skills, including verbal fluency, hope and positive expectations, persuasiveness, emotional expression, warmth, empathy, alliance bond capacity and rupture-repair responsiveness | Baseline, 10 weeks |
| Generalised Anxiety Disorder 7-items (Spitzer et al., 2006)⁴⁶ | GAD-7 | Assesses symptoms of generalised anxiety disorder, focused on key diagnostic symptoms | Sessional |
| The Patient Health Questionnaire 9-items (Kroenke et al., 2001)⁴⁵ | PHQ-9 | Assesses symptoms of major depressive episode, focused on key diagnostic symptoms | Sessional |

Feasibility was also assessed through: (1) the proportion of interested therapists who consented to participate; (2) the ability to recruit to target (n = 9–12 therapists); (3) the intervention completion rate (therapists attending ≥50% FDP sessions) and (4) the number of therapists’ clients consenting to participate.

Anonymised clinical outcomes for all clients seen by participating therapists were collected, comparing outcomes from the 10 weeks before FDP intervention (n = 97) with the 10 weeks of FDP intervention (n = 79). Semi-structured qualitative interviews with therapists at baseline and 10 weeks explored intervention expectations and experiences.

All standardised assessments used in the study are presented in the Supplemental Materials. The questionnaires used were either in the public domain or were made freely available by the respective copyright holders for non-commercial research. With the Facilitative Interpersonal Skills task (FIS), the published methodology was adapted without using the proprietary stimuli, which did not require permission, and the FIS team was contacted about the current study.

Interventions

All interventions were developed with ongoing consultation with and involvement of patients, therapists, therapist trainers and therapy service managers.

Automated feedback

Therapists recorded therapy sessions with consenting clients. These recordings were transcribed and anonymised using Microsoft Azure's AI Speech Services Batch Transcription. Python code was used to prepare uploaded session recordings as mono .mp3 files and then request diarized transcription of the prepared files, which distinguishes between speakers. Transcripts were then rated for prognosis, based on in-session language patterns using secure automated feedback software. The natural language processing model that was used for automated feedback was a Bidirectional Encoder Representations from Transformers (BERT) model, pre-trained on clinical consultation transcripts (ClinicalBERT⁴⁷). This model was selected after evaluating the performance of a range of models on their ability to predict clinical outcomes of psychological therapy from transcripts of early sessions. The model was then trained on transcripts of therapy session recordings from two clinical trials of psychological therapies,^48,49 with linked pre- and post-therapy PHQ-9 and GAD-7 scores for participants to classify them into those who reliably improved and those who did not. This process trained the ClinicalBERT model to identify natural language patterns associated with clinical improvements or otherwise. Software provided client prognosis predictions (‘on-track’ for reliable improvement vs. ‘not-on-track’) based on the likelihood of achieving the national standard for reliable improvement on PHQ-9 or GAD-7 (reduction in PHQ-9 score ≥6, or ≥4 on GAD-7⁵⁰).
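The reliable improvement thresholds that underpin the ‘on-track’ label can be illustrated with a minimal sketch. This is not the study's software: the function names and example scores are hypothetical, and treating improvement on either measure as sufficient is a simplified reading of the description above.

```python
# Thresholds as stated in the text: a reduction of >= 6 points on PHQ-9
# or >= 4 points on GAD-7 counts as reliable improvement.
PHQ9_RELIABLE_CHANGE = 6
GAD7_RELIABLE_CHANGE = 4


def reliably_improved(pre_score, post_score, threshold):
    """Return True if the score fell by at least the threshold."""
    return (pre_score - post_score) >= threshold


def improvement_label(phq9_pre, phq9_post, gad7_pre, gad7_post):
    """Label a client from pre- and post-therapy scores.

    Assumption: improvement on either measure is sufficient.
    """
    improved = (
        reliably_improved(phq9_pre, phq9_post, PHQ9_RELIABLE_CHANGE)
        or reliably_improved(gad7_pre, gad7_post, GAD7_RELIABLE_CHANGE)
    )
    return "reliably improved" if improved else "not reliably improved"


# Hypothetical clients: the first drops 7 points on PHQ-9, the second
# drops 3 points on PHQ-9 and 2 points on GAD-7.
print(improvement_label(18, 11, 12, 10))
print(improvement_label(18, 15, 12, 10))
```

Labels of this kind, derived from linked pre- and post-therapy scores, are what a classifier such as the one described would be trained to predict from session transcripts.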

Deliberate practice

Study interventions began with a 2-day training, introducing participants to DP, its evidence base, and practical exercises following transtheoretical methods (see³⁰). The week after training, participants engaged in a 4-week schedule of DP activities, rotated twice:

Week 1: Ninety-minute individual supervision, including review of feedback received and associated session recording segments; review and revision of personalised learning objectives; individualised DP applying exercises focused on personalised learning objectives, and agreement on follow-up activities.

Week 2: Two-hour peer-supported DP in groups of four, facilitated by the intervention trainer. Sessions offered 30 min focused on each therapist, wherein they described what they were learning about their practice and the skills they were working on. Following feedback from peers, each therapist chose how to use their peers and DP to support their learning, typically including a DP role-play with peer feedback and repeated iterations.

Week 3: Two-hour individual reflection and review of targeted session recordings. Therapists reviewed automated feedback, identified skill development areas and practised identified skills through self-recording or peer review.

Week 4: Three-hour in-person progress review with the participant cohort. Sessions focused on embedding learning into routine practice and reflecting on the alignment of DP with therapists’ broader development and professional roles. Reflections were then translated into action plans for the following month.
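As a quick arithmetic check on the schedule above, the sketch below tallies the weekly activity hours across both rotations and compares them with the 32 h total reported in the abstract. The hours left over for the 2-day training are an inference from that total, not a figure stated in the text, and the variable names are illustrative.

```python
# Total intervention time reported in the abstract.
TOTAL_REPORTED_HOURS = 32.0

# Hours per activity in one 4-week cycle, as described in the schedule.
cycle_hours = {
    "week 1: individual supervision": 1.5,
    "week 2: peer-supported DP group": 2.0,
    "week 3: individual reflection and review": 2.0,
    "week 4: in-person progress review": 3.0,
}
N_CYCLES = 2  # the 4-week schedule was rotated twice

hours_per_cycle = sum(cycle_hours.values())
scheduled_hours = N_CYCLES * hours_per_cycle

# Inferred, not stated: time remaining for the initial 2-day training.
implied_training_hours = TOTAL_REPORTED_HOURS - scheduled_hours

print(hours_per_cycle, scheduled_hours, implied_training_hours)
```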

Procedure

Interested therapists were screened by service leads, and consent-to-contact from the research team was obtained. Study researchers provided participant information and contacted potential participants via telephone or video call. The aims, methods, anticipated benefits and potential hazards of the study were explained, and any questions were answered. Then, written informed consent was obtained electronically and confirmed by email. Baseline assessments were then completed. Follow-up assessments were completed by participants with the same researcher. Participating therapists approached clients on their caseload, providing written participant information to those interested and obtaining written informed consent.

Statistical analysis

As a feasibility study, analyses were primarily descriptive, focusing on feasibility, acceptability and initial indications of clinical utility rather than formal hypothesis testing. Quantitative analyses followed the pre-registered Statistical Analysis Plan (OSF registration: osf.io/e85wx). All analyses were conducted in IBM SPSS Statistics (version 29). Descriptive statistics were calculated to summarise sample characteristics, feasibility indicators and acceptability measures. Feasibility was operationalised as recruitment, retention and adherence rates; acceptability was indexed through scores on the AIM, IAM, FIM and TFA. Given the small sample size, inferential comparisons across timepoints (baseline, mid- and post-intervention) were conducted using non-parametric Wilcoxon signed-rank tests. Changes in acceptability, feasibility, credibility, outcome expectancy and therapeutic skill (FIS) scores were tested using the completer sample (n = 8). For each comparison, repeated-measures effect sizes were calculated as Cohen's d_RM, adjusted for within-subject dependency, with corresponding 95% confidence intervals (CIs). For clinical outcomes, anonymised PHQ-9 and GAD-7 data were analysed for all clients seen by participating therapists during the 10 weeks pre-intervention and during the 10-week FDP period. Reliable change indices were computed, applying NHS Talking Therapies thresholds for reliable improvement and deterioration.⁵⁰ The proportions of clients meeting reliable improvement, no change and reliable deterioration criteria were compared between pre-intervention and intervention periods using Fisher's exact tests (Fisher–Freeman–Halton extension for multi-category tables).
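To illustrate the comparison of proportions described above, the following is a minimal standard-library sketch of a two-sided Fisher's exact test for a 2 × 2 table (reliably improved vs. not, by period). The study itself used SPSS and the Fisher–Freeman–Halton extension for tables with more than two categories, which this sketch does not cover. The cell counts are hypothetical, chosen only to approximate the percentages in the abstract; the actual counts are not given here.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: (improved, not improved) in each period.
pre_period = (56, 41)
intervention_period = (48, 31)
p_value = fisher_exact_2x2(*pre_period, *intervention_period)
print(round(p_value, 3))
```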

Qualitative analysis

Qualitative data from semi-structured interviews were analysed using framework analysis, structured around the domains of the TFA.^43,51 This approach aims to identify, define and interpret key patterns in emergent themes by creating and then applying an analytical framework. Quantitative and qualitative findings were integrated narratively to aid the interpretation of acceptability and feasibility outcomes.

Ethical approval and pre-registration

Ethical approval was obtained from the Health Research Authority East Midlands - Leicester Central Research Ethics Committee (reference 24/EM/0073). The study was pre-registered prior to recruitment.⁵²

Results

Participants

All therapists expressing interest were recruited, and the study achieved recruitment within the target range (n = 9). Participants were from a range of professional backgrounds, including practitioner psychologists and psychotherapists applying Cognitive Behavioural Therapy, Acceptance and Commitment Therapy, Compassion-Focused Therapy, Person-Centred Experiential Therapy and integrative approaches (incorporating multiple therapeutic modalities). Participants were predominantly female (n = 7) and White British (n = 6), had a median age of 41 (range = 29–55) and had been qualified for a median of 1.5 years (range = 0–12 years).

Feasibility

Therapist recruitment, retention and adherence

Eight participants (89%) completed the 10-week intervention. However, use of the automated feedback component varied due to time constraints, technical difficulties and challenges in obtaining recordings, partly influenced by limited client uptake.

Client uptake

Fewer than two clients per therapist consented to the use of automated feedback (n = 14), whereas outcomes were reported by a median of 15 clients per therapist in the same timeframe. In interviews, therapists noted that declining clients expressed concerns about the potential role of Artificial Intelligence (AI) in therapy: ‘A couple of patients said, “Are you going to be replaced by AI? I only ever want to see a real person”’ [P7].

Therapist assessments

Baseline therapist assessments on the AIM, IAM and FIM were below typical mean scores from previous research (baseline range = 14.9–16.1, typical M > 16.5; e.g.⁵³). Post-intervention, mean scores for all three measures rose to exceed this typical benchmark (Post M range = 16.6–18.9) (Supplemental Figure 1).

Statistically, this shift represented a significant improvement in perceived acceptability (Z = 2.57, p = .008, d_RM = 2.25, 95% CI [1.04, 3.47]). Improvements in appropriateness and feasibility were non-significant (ps = .060 and .268, respectively). Regarding the CEQ, therapist-perceived intervention credibility demonstrated significant improvement (Z = 2.11, p = .039, d_RM = 1.11, 95% CI [0.13, 2.09]), while outcome expectancy showed non-significant improvement (p = .148) (Supplemental Figure 2; Table 2).
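For readers unfamiliar with the Wilcoxon signed-rank test at very small sample sizes, the sketch below computes an exact two-sided p-value by enumerating every possible sign pattern. With eight pairs that all change in the same direction, the smallest attainable two-sided p-value is 2/256 ≈ .008. This is consistent with several values reported here, although the text does not state whether exact or asymptotic p-values were used. The scores are hypothetical and the function names are illustrative; the study's own analyses were run in SPSS.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def wilcoxon_exact(pre, post):
    """Return (W+, exact two-sided p) for paired scores.

    Zero differences are dropped; the null distribution is built by
    enumerating all 2**n assignments of signs to the ranks.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    as_extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            as_extreme += 1
    return w_plus, as_extreme / 2 ** len(ranks)


# Hypothetical pre/post scores for eight completers, all improving.
pre_scores = [15, 14, 16, 17, 15, 16, 14, 18]
post_scores = [19, 18, 20, 19, 17, 20, 19, 20]
w_plus, p_exact = wilcoxon_exact(pre_scores, post_scores)
print(w_plus, p_exact)
```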

Table 2.
Changes in intervention acceptability, credibility and therapist facilitative interpersonal skills

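Sample sizes: Baseline n = 9; Post-intervention n = 8; Change (Wilcoxon Z, p value, effect size and 95% CI) n = 8.

| Measure | Baseline M | Baseline SD | Post-intervention M | Post-intervention SD | Wilcoxon Z | p value | Effect size (d_RM) | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|---|---|---|
| AIM, IAM, FIM (/20) | | | | | | | | | |
| Acceptability | 15.6 | 1.4 | 18.9 | 1.8 | 2.57 | .008** | 2.25 | 1.04 | 3.47 |
| Appropriateness | 16.1 | 1.8 | 18.3 | 2.2 | 1.88 | .060 | 1.16 | 0.23 | 2.09 |
| Feasibility | 14.9 | 2.2 | 16.6 | 2.9 | 1.11 | .268 | 0.67 | −0.27 | 1.61 |
| CEQ (/27) | | | | | | | | | |
| Credibility | 18.6 | 4.4 | 23.9 | 2.6 | 2.11 | .039* | 1.11 | 0.13 | 2.09 |
| Outcome Expectancy | 15.4 | 4.3 | 17.9 | 5.9 | 1.54 | .148 | 0.49 | −0.04 | 1.03 |
| Adapted Facilitative Interpersonal Skills Task (/5) | | | | | | | | | |
| Verbal Fluency | 3.5 | 0.6 | 4.6 | 0.4 | 2.53 | .008** | 1.95 | 0.85 | 3.05 |
| Hope | 3.6 | 0.4 | 4.0 | 0.5 | 2.20 | .028* | 1.56 | 0.40 | 2.72 |
| Persuasiveness | 3.7 | 0.4 | 4.4 | 0.4 | 2.54 | .008** | 1.53 | 0.63 | 2.44 |
| Emotional Expression | 3.8 | 0.5 | 4.7 | 0.4 | 2.38 | .016* | 1.73 | 0.49 | 2.97 |
| Warmth, Acceptance and Understanding | 3.8 | 0.3 | 4.5 | 0.4 | 2.20 | .027* | 2.13 | 0.66 | 3.59 |
| Empathy | 3.7 | 0.5 | 4.4 | 0.5 | 2.38 | .018* | 1.49 | 0.56 | 2.42 |
| Alliance Bond Capacity | 4.0 | 0.4 | 4.5 | 0.3 | 2.21 | .028* | 1.18 | 0.30 | 2.07 |
| Alliance Rupture-Repair Responsiveness | 3.7 | 0.5 | 4.5 | 0.2 | 2.52 | .008** | 1.40 | 0.39 | 2.41 |
| Total | 3.7 | 0.3 | 4.4 | 0.3 | 2.52 | .008** | 2.57 | 1.21 | 3.94 |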

Note: Subscales of AIM, IAM, and FIM are scored out of 20, CEQ subscales out of 27, and Facilitative Interpersonal Skills subscales out of 5. Wilcoxon signed-rank tests were conducted on the completer sample (n = 8) as per the preregistered analysis plan. Means, standard deviations, and Cohen's d effect sizes are provided for precision and transferability. Effect sizes are reported as Cohen's d_RM (repeated measures Cohen's d), where the denominator is the pre-intervention standard deviation, and the effect size is adjusted for dependency by incorporating the pre–post correlation. The confidence intervals for d_RM reflect this adjustment to account for the dependency between measurements. Interpretation of Cohen's d: 0.20 = ‘small’, 0.50 = ‘moderate’, 0.80 = ‘large’. AIM = Acceptability of Intervention Measure, IAM = Intervention Appropriateness Measure, FIM = Feasibility of Intervention Measure, CEQ = Credibility/Expectancy Questionnaire.

*p < .05, **p < .01.
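The table note describes d_RM only in words, so the exact formula used is not recoverable from this section. As a rough sketch, and not necessarily the authors' computation, the block below shows one commonly used repeated-measures formulation, in which the mean change is divided by the SD of the difference scores and rescaled by sqrt(2(1 − r)), alongside the simpler mean change divided by the pre-intervention SD. All scores and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cohens_d_rm(pre, post):
    """Mean change / SD of differences, rescaled by sqrt(2 * (1 - r)).

    Assumption: a common d_RM formulation, which may differ from the
    computation behind the table.
    """
    diffs = [b - a for a, b in zip(pre, post)]
    r = pearson_r(pre, post)
    return mean(diffs) / stdev(diffs) * sqrt(2 * (1 - r))


def d_pre_sd(pre, post):
    """Mean change divided by the pre-intervention SD."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / stdev(pre)


# Hypothetical pre/post scores for eight therapists.
pre_scores = [15, 14, 16, 17, 15, 16, 14, 18]
post_scores = [19, 18, 20, 19, 17, 20, 19, 20]
print(round(cohens_d_rm(pre_scores, post_scores), 2))
print(round(d_pre_sd(pre_scores, post_scores), 2))
```

When the pre- and post-intervention SDs are equal, the two quantities coincide; they diverge as the SDs differ, which is one reason reported effect sizes depend on the formulation chosen.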

Acceptability

The TFA combined quantitative data from modal appraisals of each domain of feasibility and acceptability with qualitative feedback from therapists to aid understanding of how the intervention was experienced over time. Appraisals of acceptability on the TFA showed improvement over the course of the intervention, particularly in perceptions of opportunity costs, affective attitude and general acceptability (Table 3). Early neutral responses shifted to positive endorsements as participants experienced the intervention's benefits.

Table 3.
Therapist appraisals of intervention acceptability.

| Domain | Question | Pre-intervention | Mid (5-week) | Post (10-week) |
|---|---|---|---|---|
| Affective attitude | Do you like the intervention? | No opinion | Strongly like | Strongly like |
| Burden | How much effort does it take to engage? | A lot of effort | A lot of effort | A lot of effort |
| Perceived effectiveness | The intervention will improve my clinical skills | Agree | Strongly agree | Strongly agree |
| Intervention coherence | It's clear how the intervention will help improve my clinical skills | Agree | Agree | Agree |
| Self-efficacy | How confident do you feel about engaging? | Confident | Confident | Very confident |
| Opportunity costs | Engaging with the intervention interferes with other priorities | Agree | Agree | Disagree |
| General acceptability | How acceptable is the intervention to you? | No opinion | Acceptable | Acceptable |

Note: Modal responses shown in cells. Key assessment area highlighted in bold. Responses indicating high acceptability in increasingly dark green, low acceptability responses in increasingly dark red; neutral responses in orange.

Framework analysis of therapist interview data (Table 4) revealed high acceptability of the intervention, aligning with the TFA. Emotional responses (affective attitude) were overwhelmingly positive by the end of the intervention, reflecting high satisfaction with group learning and professional development, though therapists found automated feedback less useful and expressed a preference for it to offer more detail. Despite the intervention's benefits, participation was deemed demanding throughout, requiring significant time, cognitive and emotional effort, with logistical and recruitment challenges. Ethicality was noted in the alignment with therapists’ values, particularly the autonomy to adapt the intervention to individual therapeutic modalities and approaches.

Table 4.
Acceptability of automated feedback and deliberate practice in terms of the theoretical framework of acceptability.

Domain and definition Subthemes Key findings and illustrative quotes

Affective attitude
How an individual feels taking part in an intervention, with quotes coded by valence: positive (+), mixed (+/-), neutral (N), negative (-).
Subthemes: overall positive feelings about participating in the intervention; value of shared learning despite barriers; mixed reactions to automated feedback, ranging from reassurance to disappointment.

Overall positive feelings about participating in the intervention
Participants expressed overwhelmingly positive feelings about the intervention:
▪ P9: ‘I can’t speak highly enough of it really’ (+)
▪ P6: ‘I’ve really enjoyed it, the whole thing’ (+)
▪ P5: ‘I’m grateful to have been part of it’ (+)

Value of shared learning despite barriers
Group learning was valued but occasionally hindered by differences in approaches:
▪ P7: ‘Having access to other qualified practitioners and hearing how they do things and getting feedback from them. Wow, that was amazing for me’
▪ P8: ‘The groups – deliberate practice groups and small groups – I think that was the most valuable aspect’ (+)
▪ P3: ‘I definitely had a mixed experience of the group work. On the one hand it's really helpful because it was shared learning … I guess it's reflecting on the different therapeutic approaches that people had. I think sometimes for me that was probably a bit of a barrier’ (+/-)

Mixed reactions to automated feedback
Reactions to automated feedback ranged from confidence-building to unmet expectations:
▪ P5: ‘You load something to [the automated feedback system] and it comes back as ‘on track’, and you’re like ‘OK, maybe I do know what I’m doing’, but that's felt good to have that reassurance’ (+)
▪ P8: ‘But yeah, still haven't had the chance to upload my recordings’ (N)
▪ P7: ‘My expectation of what feedback I would get was up there [indicates high]. And then what I got was like that [indicates low]’ (-)

Burden
The perceived effort required to participate, including managing training, supervision, and recruitment tasks alongside regular practice demands.
Subthemes: time demands and scheduling challenges; cognitive and emotional demands, including productive struggle; logistical challenges, including travel and materials management; recruitment burdens on therapists.

Time demands and scheduling challenges
Participants found time demands a significant challenge to participation:
▪ P9: ‘Time was the biggest limiting factor’
▪ P5: ‘It's been quite an intense period over the 10 weeks’
▪ P8: ‘Very time consuming, I must say… Maybe the timings, if that could be planned well in advance, that would have been more useful’

Cognitive and emotional demands
Cognitive strain was high, but often described as productive:
▪ P3: ‘I've definitely noticed a bit more of like intense cognitive load in having to think more about my sessions, but that's not necessarily a bad thing’
▪ P9: ‘I’m a little bit fried by the end of that to be honest … It still feels like a struggle. But I think that's not a bad thing, you know, that's good’
▪ P5: ‘I mean, it definitely felt like intense training, like I was really tired afterwards, but I don't think there's anything I’d do differently or to prepare’

Logistical challenges
Logistical hurdles included travel and managing materials, adding to the perceived effort:
▪ P5: ‘Doing a 3-h train journey to do 3 h there and then 3 h back’
▪ P6: ‘There's a lot of worksheets … having sheets all over the place’

Recruitment burdens on therapists
▪ P7: ‘I would have liked to have a few more clients agree to take part. I saw it almost as a personal failure’
▪ P8: ‘I found it really difficult to have enough patients to consent’

Ethicality – Values
The extent to which the intervention aligns with therapist and patient values, including ethical considerations about therapeutic autonomy and the use of AI
Subthemes: Alignment with professional therapeutic values; Concerns about AI and the importance of human connection
Alignment with professional therapeutic values
The intervention accommodated therapists’ individual models while supporting their values of professional self-development and continuous improvement:
P3: ‘What I really valued about the deliberate practice model is it doesn’t matter what therapeutic background you come from; it's very much the individual therapist has the control over what they do with their practice… I think sometimes I might struggle when I’m being taught a new method, and it doesn’t sort of fit with my values of therapy’
P7: ‘I think you're pretty adept anyway at what you do, but it's now getting into the minutia of what you do … because you know generally what you want to do but you want to get better and keep getting better’
Concerns about AI and the importance of human connection
Some patients expressed concern about the role of AI in therapy, emphasising their preference for human interaction:
P7: ‘A couple of clients said, “Are you going to be replaced by AI?… I only ever want to see a real person”’

Intervention coherence
The extent to which participants understand the intervention principles and how it works
Subthemes: Strong understanding of deliberate practice principles and components; Limited understanding and perceived reliability of automated feedback
Strong understanding of deliberate practice principles and components
Deliberate practice components were well-understood and appreciated by participants:
P8: ‘Working with other people, learning from other people, but also making space to practice these skills, as well as the individual supervision, was really helpful’
P3: ‘So, I guess having the time and space to do the role plays within the groups and in the individual one-to-ones and then being able to translate that into therapy sessions with clients, it just seemed quite seamless then’
P7: ‘Despite the fact that I totally disliked roleplaying, I can definitely see the use for them’
Limited understanding and perceived reliability of automated feedback
Automated feedback was perceived as uninformative and sometimes unreliable:
P3: ‘It just says, ‘on track’…I’ve got no clue as to what that means’
P5: ‘Sometimes the transcription would make little mistakes, which would be hilarious. And I would be like, is the AI going to judge that?’
P4: ‘Would I go to a client and say, ‘we’ve got this automated system which I don't know what the quality is, and it's saying this and this’?’

Opportunity costs
The extent to which benefits, profits, or values must be given up to engage in the intervention
Subthemes: Challenges balancing participation with existing demands; Trade-offs in prioritising development areas
Challenges balancing participation with existing demands
Participants reported challenges balancing intervention demands with existing responsibilities:
P2: ‘The idea of making a regular slot for working individually on deliberate practice, or with colleagues, on paper is very easy but not in practice’
P4: ‘There's a lot of extra work … but the service wasn’t really accounting for that’
P9: ‘Not really, because the service has allowed me to have a slightly reduced caseload for this period of time, which is fine’
Trade-offs in prioritising development areas
Time constraints led participants to focus on specific development areas, leaving others aside:
P3: ‘There's so many different areas that I could focus on, and I would want to focus on, but actually given the time restraints of the study, I have to keep it really focussed – so then, narrowing it down’

Perceived effectiveness
The extent to which participants perceive the intervention as effective in enhancing therapeutic practice and patient outcomes
Subthemes: Increased self-awareness and intentionality in therapeutic approach; Building confidence in skills and approach; Renewed work engagement through intervention; Uncertainty about the intervention's impact on patient outcomes
Participants experienced increased self-awareness, confidence, and engagement in their therapeutic work:
Increased self-awareness and intentionality in therapeutic approach
P8: ‘I’m being a lot calmer in my response rather than panicky in sessions’
P7: ‘I was a lot more aware of my practice, I became a lot more intentional’
P5: ‘I feel like in sessions with patients now that there's like this extra layer of consciousness of what I'm saying and how I'm saying it’
Building confidence in skills and approach
P2: ‘It's a good learning from that and I think I've taken confidence as well in my practice’
P6: ‘I’ve got the confidence to nip issues in the bud or get the buy-in from the start more effectively’
P5: ‘I think I was a little surprised on the effect it's had on my confidence. When you get the feedback from people after the role plays and they say you’ve done something well’
Renewed work engagement through intervention
P6: ‘I think it's got me enjoying my job again’
P7: ‘I think doing this a lot more allowed me to just decompress and then go into it knowing that I was equipped to go into it … You're almost excited to go and try it’
However, they found it difficult to discern whether patient outcomes improved directly due to the intervention:
Difficulty assessing impact on patients
P3: ‘I've got no sense of whether as a result of doing the deliberate practice, and the intentional micro skills, has that made a difference to the session outcome, or has it not made any difference?’
Interviewer: ‘Have you had any feedback from others to suggest that there is some change in your practice or how you’re doing things over time?’
P5: ‘Not from patients, but from the other people in the study, definitely’

Self-efficacy
The participants’ confidence that they can perform the behaviour(s) required to participate in the intervention
Subthemes: Proficiency gained through support and familiarity
Participants gained confidence as they became more familiar with the intervention in a supportive environment:
P4: ‘It's a bit out of your comfort zone at the beginning, but everybody was so nice and there was no judgment’
P2: ‘Especially at the beginning, just having more time to get to grips with everything and perhaps even having a session together on just like what we do and when we do it’

Intentionality towards future use
Participants’ intentions and plans for integrating the intervention into their future practice
Subthemes: Plans to integrate the intervention into future professional practice
Participants expressed strong intentions to integrate the intervention into their ongoing practice:
P3: ‘So yeah, I definitely do want to continue doing it. I think it's just finding a way in pre-existing systems to slot it in basically’
P6: ‘I mean, I’ve been using it in my private practice as well because that's something I’ve just started’
P8: ‘I’m definitely going to go back to it continuously and regularly’

Note: Px = Participant identification code.

However, concerns were raised about client scepticism toward AI. Intervention coherence was strong for DP principles but weaker for automated feedback, as more detailed feedback was desired to inform DP. Therapists acknowledged opportunity costs (the need to sacrifice other activities for intervention engagement) but valued the intervention's impact, reporting increased self-awareness, confidence and work engagement. Self-efficacy grew as participants gained proficiency with DP procedures, supported by a collaborative learning environment. Finally, therapists expressed clear intentions to integrate the approach into future practice.

Clinical utility

Therapist skills

All domains of the adapted facilitative interpersonal skills task significantly improved from baseline to post-intervention (Z range = 2.20–2.54, all ps < .05), demonstrating large effect sizes across subscales (d_RM range = 1.18–2.13). The total FIS score reflected this overall enhancement in therapeutic skills (Z = 2.52, p = .008, d_RM = 2.57, 95% CI [1.21, 3.94]; Supplemental Figure 3).

Client outcomes

Outcomes for participating therapists’ clients were compared between the 97 clients reporting outcomes in the 10 weeks directly before the intervention (baseline) and the 79 clients reporting outcomes in the 10-week intervention period (intervention). There was a median of 18 clients per therapist in the baseline period and 15 clients per therapist in the intervention period. The proportion of clients reporting reliable improvement in depression (PHQ-9) increased from 58% to 61%, and the proportion reporting reliable improvement in anxiety (GAD-7) rose from 65% to 75%.⁵⁴ However, these changes were non-significant for both measures (Fisher–Freeman–Halton exact test: PHQ-9, p = .568; GAD-7, p = .448; Figure 2).

Figure 2.
Average change in therapist outcomes from baseline to intervention period.
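
To illustrate how a difference between two proportions of this kind can be tested exactly, the following is a minimal sketch of a two-sided Fisher exact test for a 2 × 2 table, written with the Python standard library only. It is not the study's analysis: the study used the Fisher–Freeman–Halton extension, which handles tables with more than two outcome categories, whereas this sketch covers only the 2 × 2 special case. The function name and the example counts are hypothetical; the counts are placeholders chosen so that the proportions equal 65% and 75%, and are not the study's data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    Rows are groups (e.g. baseline, intervention); columns are outcomes
    (e.g. reliably improved, not reliably improved).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x "improved" clients in row 1,
        # with all table margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed
    # one (probability no greater than the observed, with a small tolerance).
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts, NOT taken from the study: 65 of 100 baseline clients
# and 75 of 100 intervention-period clients reliably improved.
p = fisher_exact_two_sided(65, 35, 75, 25)
print(f"Two-sided Fisher exact p-value (illustrative counts): {p:.3f}")
```

Because the test conditions on the table margins, the p-value depends on the group sizes as well as on the proportions, so the same percentage-point difference can be significant in a large sample and non-significant in a small one.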

Discussion

This study aimed to assess the feasibility and acceptability of offering automated, prognostic feedback to psychological therapists and providing a structured DP training programme. Overall, findings suggest that the intervention was acceptable to participating therapists and feasible to fit alongside their clinical practice commitments, given some protected time for the intervention. However, the intervention was less acceptable to therapists’ clients, with low uptake of the automated feedback component. By the end of the intervention, therapists from a range of professional backgrounds found the FDP intervention highly acceptable, with improvements in perceived opportunity costs, affective attitude and general acceptability over time. Qualitative interviews supported these changes, with participants identifying increased work engagement and enjoyment as a result of their participation. Yet qualitative reports also highlighted mixed experiences with automated feedback that may have limited any benefits. Initial clinical utility was indicated by significant improvements in therapists’ facilitative interpersonal skills, abilities that are associated with greater clinical effectiveness among therapists.²¹ In addition, proportionally more clients reliably improved in depression and anxiety symptoms. The difference in impact on depression (a 3 percentage-point increase in clients improving) versus anxiety (a 10 percentage-point increase) could be due to differences in FDP effects on depression versus anxiety symptoms, psychometric differences between the PHQ-9 and GAD-7, or simply chance, given that the changes were non-significant. While feasibility was demonstrated by high therapist retention and adherence, low client uptake and mixed experiences with the automated feedback component highlight the need for further development of that component.

This study is consistent with prior research showing that DP improves therapists’ skills³¹ and adds initial evidence that DP may also enhance clinical effectiveness over a relatively short, intensive training period. This suggests that this kind of intervention could help to achieve the improvements in the effectiveness of psychological therapies that have long eluded the field.²¹ Extending beyond DP studies focussed on modality-specific competencies,²⁷ these findings suggest that gains in skill and effectiveness may be achievable across multiple and integrated therapeutic modalities within a data-driven, personalised approach. This study also highlights the less commonly discussed issue of the practical and emotional impact of DP on practitioners. Although the challenges of using DP in psychotherapy have been identified previously,⁵⁵ this topic may require greater discussion in future research to optimise the beneficial effects observed on therapists’ wellbeing and work engagement, while minimising the burden that comes with a consistent focus on skill limitations.

Limitations

As this was a feasibility study, the results are insufficient to support reliable conclusions about effectiveness. Furthermore, the absence of a comparator group makes it unclear whether the observed changes would have occurred without the intervention. The absence of an a priori sample size calculation also reduces the confidence that can be placed in the results. Compared with other therapist-level interventions, this study had a small sample, and a larger sample would be required to obtain sufficient variability for generalisability. The short follow-up period means that the durability of any effects also remains unclear. Future research that is powered for specific effects, includes longer-term follow-up and uses a randomised comparator group would produce more dependable results. The low acceptability and mixed evaluation of the automated feedback indicate that improvements in its design, such as more detailed prognostic feedback and clearer explanations, are needed. Alternatively, established automated feedback methods for specific therapeutic models could serve as a foundation for developing transtheoretical feedback, offering sufficiently generalisable insights to support personalisation across modalities.³⁷ These adjustments would be required before integration with any ambient scribe technology to facilitate and support adoption and implementation. Although improvements in therapist skills were observed in this study, the use of different stimulus videos as part of a standardised assessment may threaten the validity of these findings.

Future research

Future research should evaluate the FDP intervention in a randomised controlled trial that is powered to detect differences in therapist skills and effectiveness. This would give a clearer understanding of clinically important effects, the size of any effects and the dosage of intervention required to achieve them. If FDP were found to be effective, such a trial would enable services to estimate more readily the return on the time that therapists invest in applying FDP. Adaptive trial designs could be applied in future research to enable within-study adaptation and improvement of the automated feedback system using pre-specified rules.⁵⁶ Future research could also assess and identify active ingredients of FDP by comparing different automated feedback systems, with and without the transtheoretical DP approach. All future studies should include longer follow-up periods to better understand the durability of effects over time.
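
As a rough indication of what powering such a trial could involve, the sketch below applies the standard normal-approximation sample-size formula for comparing two independent proportions. It is an illustration under stated assumptions, not a trial design: the significance level (two-sided 0.05) and power (0.80) are conventional defaults chosen here, the function name is hypothetical, and the formula treats clients as independent, so it ignores the clustering of clients within therapists that a real trial of a therapist-level intervention would need to account for. The proportions passed in are the reliable-improvement rates reported in this study, used only as example planning values.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(
    p1: float, p2: float, alpha: float = 0.05, power: float = 0.80
) -> int:
    """Approximate clients needed per arm to detect p1 versus p2.

    Uses the normal-approximation formula for two independent proportions
    with a two-sided test. Assumes independent observations (no clustering).
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to compute a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Example planning values taken from the reliable-improvement rates above.
n_anxiety = n_per_group_two_proportions(0.65, 0.75)     # GAD-7: 65% vs 75%
n_depression = n_per_group_two_proportions(0.58, 0.61)  # PHQ-9: 58% vs 61%
print(f"Clients per arm, anxiety example: {n_anxiety}")
print(f"Clients per arm, depression example: {n_depression}")
```

Under these assumptions, the smaller depression difference requires a far larger sample than the anxiety difference, which is one reason the choice of primary outcome and target effect size matters when planning a definitive trial.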

Implications

This study suggests that it may be possible for therapists to improve their skills and effectiveness over time with an intervention that targets individualised skill deficits. It also indicates that, in practice, there may be impacts on therapists beyond clinical effectiveness and skills, including potential impacts on their satisfaction with and enjoyment of work. Conversely, the intervention can also be emotionally and cognitively demanding, with time demands that can have an impact even when protected time is allocated. Therefore, future implementation of related interventions should account for the breadth of impacts, positive and negative, that therapists may experience, in order to enhance acceptability. It is notable that measures of intervention acceptability improved over the 10 weeks of intervention. This suggests that evaluating perceptions of FDP over time may be more meaningful than relying on initial perceptions alone, given that perceptions may change with use of the intervention. Furthermore, co-design of automated feedback systems and integration of processes that make AI and machine learning more understandable and explainable could help address the concerns raised by therapists and clients about their use in therapy.⁵⁷ This study highlights the value of the ‘human in the loop’ approach to responsible health developments involving AI; accelerating and scaling up technology of this kind is greatly enhanced by human feedback to support acceptable and responsible improvement. This type of development could support larger-scale application of FDP in training and practice.

Conclusion

This study suggests that FDP can be feasibly and acceptably applied with therapists from a range of theoretical orientations, providing therapy to varied patient populations. This intervention has the potential to improve therapeutic skills and clinical effectiveness, though further study is required to formally evaluate this and, if found to be effective, to estimate the size and durability of the effects. Patients’ concerns about the application of AI in this context must be addressed to increase uptake, and close attention should be paid to the processes through which clients are socialised to the approach.

Supplemental Material

sj-docx-1-dhj-10.1177_20552076251413316 - Supplemental material for The feasibility and acceptability of automated feedback and deliberate practice in psychological therapies for anxiety and depression

Supplemental material, sj-docx-1-dhj-10.1177_20552076251413316 for The feasibility and acceptability of automated feedback and deliberate practice in psychological therapies for anxiety and depression by Sam Malins, Grazziela Figueredo, David Saxon, Kate Horton, Jeremie Clos, Thomas Trimble, Kavan Fatehi, David Waldram, Fred Higton, Gillian E Hardy, Michael Barkham, Jonathan Couldridge and Nima Moghaddam in DIGITAL HEALTH

Supplemental Material

sj-docx-2-dhj-10.1177_20552076251413316 - Supplemental material for The feasibility and acceptability of automated feedback and deliberate practice in psychological therapies for anxiety and depression

Supplemental material, sj-docx-2-dhj-10.1177_20552076251413316 for The feasibility and acceptability of automated feedback and deliberate practice in psychological therapies for anxiety and depression by Sam Malins, Grazziela Figueredo, David Saxon, Kate Horton, Jeremie Clos, Thomas Trimble, Kavan Fatehi, David Waldram, Fred Higton, Gillian E Hardy, Michael Barkham, Jonathan Couldridge and Nima Moghaddam in DIGITAL HEALTH

Assessment and citation	Acronym	Description	Timepoints
Acceptability of Intervention Measure (Weiner et al., 2017)⁴²	AIM	Assesses perceived palatability, approval and satisfaction with a given intervention	Baseline, 5 weeks, 10 weeks
Credibility/Expectancy Questionnaire (Devilly & Borkovec, 2000)⁴⁴	CEQ	Assesses perceived intervention credibility and outcome expectancy	Baseline, 5 weeks, 10 weeks
Intervention Appropriateness Measure (Weiner et al., 2017)⁴²	IAM	Assesses perceived intervention appropriateness and suitability in a given setting	Baseline, 5 weeks, 10 weeks
Theoretical Framework of Acceptability questionnaire (Sekhon et al., 2022)⁴³	TFA	Assesses seven components of intervention acceptability: affective attitude, burden, perceived effectiveness, ethicality, intervention coherence, opportunity costs and self-efficacy	Baseline, 5 weeks, 10 weeks
Feasibility of Intervention Measure (Weiner et al., 2017)⁴²	FIM	Assesses key indicators of successful intervention implementation	Baseline, 5 weeks, 10 weeks
Facilitative Interpersonal Skills task adapted (Anderson et al., 2009)²⁰	FIS	Assesses interpersonal therapeutic skills, including verbal fluency, hope and positive expectations, persuasiveness, emotional expression, warmth, empathy, alliance bond capacity and rupture-repair responsiveness	Baseline, 10 weeks
Generalised Anxiety Disorder 7-items (Spitzer et al., 2006)⁴⁶	GAD-7	Assesses symptoms of generalised anxiety disorder, focused on key diagnostic symptoms	Sessional
The Patient Health Questionnaire 9-items (Kroenke et al., 2001)⁴⁵	PHQ-9	Assesses symptoms of major depressive episode, focused on key diagnostic symptoms	Sessional

	Baseline (n = 9)		Post-intervention (n = 8)		Change (n = 8)
Measure					Z	p	d_RM	95% CI lower	95% CI upper
AIM, IAM, FIM (/20)
Acceptability	15.6	1.4	18.9	1.8	2.57	.008**	2.25	1.04	3.47
Appropriateness	16.1	1.8	18.3	2.2	1.88	.060	1.16	0.23	2.09
Feasibility	14.9	2.2	16.6	2.9	1.11	.268	0.67	−0.27	1.61
CEQ (/27)
Credibility	18.6	4.4	23.9	2.6	2.11	.039*	1.11	0.13	2.09
Outcome Expectancy	15.4	4.3	17.9	5.9	1.54	.148	0.49	−0.04	1.03
Adapted Facilitative Interpersonal Skills Task (/5)
Verbal Fluency	3.5	0.6	4.6	0.4	2.53	.008**	1.95	0.85	3.05
Hope	3.6	0.4	4.0	0.5	2.20	.028*	1.56	0.40	2.72
Persuasiveness	3.7	0.4	4.4	0.4	2.54	.008**	1.53	0.63	2.44
Emotional Expression	3.8	0.5	4.7	0.4	2.38	.016*	1.73	0.49	2.97
Warmth, Acceptance and Understanding	3.8	0.3	4.5	0.4	2.20	.027*	2.13	0.66	3.59
Empathy	3.7	0.5	4.4	0.5	2.38	.018*	1.49	0.56	2.42
Alliance Bond Capacity	4.0	0.4	4.5	0.3	2.21	.028*	1.18	0.30	2.07
Alliance Rupture-Repair Responsiveness	3.7	0.5	4.5	0.2	2.52	.008**	1.40	0.39	2.41
Total	3.7	0.3	4.4	0.3	2.52	.008**	2.57	1.21	3.94

Domain	Question	Pre-intervention	Mid (5-week)	Post (10-week)
Affective attitude	Do you like the intervention?	No opinion	Strongly like	Strongly like
Burden	How much effort does it take to engage?	A lot of effort	A lot of effort	A lot of effort
Perceived effectiveness	The intervention will improve my clinical skills	Agree	Strongly agree	Strongly agree
Intervention coherence	It's clear how the intervention will help improve my clinical skills	Agree	Agree	Agree
Self-efficacy	How confident do you feel about engaging?	Confident	Confident	Very confident
Opportunity costs	Engaging with the intervention interferes with other priorities	Agree	Agree	Disagree
General acceptability	How acceptable is the intervention to you?	No opinion	Acceptable	Acceptable

Footnotes

ORCID iDs

Sam Malins

David Saxon

Jeremie Clos

Nima Moghaddam

Ethical considerations

Ethical approval was obtained from The Health Research Authority East Midlands – Leicester Central Research Ethics Committee, 26 March 2024 (Reference 24/EM/0073).

Author contributions

Sam Malins: conceptualisation, methodology, data curation, writing–original draft, and writing–review and editing. Grazziela Figueredo: conceptualisation, writing–original draft, and writing–review and editing. David Saxon: data curation, supervision, writing–original draft, and writing–review and editing. Kate Horton: conceptualisation, methodology, writing–original draft, and writing–review and editing. Jeremie Clos: conceptualisation, writing–original draft, and writing–review and editing. Thomas Trimble: software, validation, formal analysis, and writing–review and editing. Kavan Fatehi: software, validation, formal analysis, and writing–review and editing. David Waldram: conceptualisation, methodology, writing–original draft, and writing–review and editing. Fred Higton: conceptualisation, methodology, writing–original draft, and writing–review and editing. Gillian E Hardy: conceptualisation, data curation, supervision, writing–original draft, and writing–review and editing. Michael Barkham: conceptualisation, data curation, supervision, writing–original draft, and writing–review and editing. Jonathan Couldridge: software, validation, formal analysis, and writing–review and editing. Nima Moghaddam: conceptualisation, methodology, formal analysis, investigation, visualisation, data curation, writing–original draft, and writing–review and editing.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and publication of this article: This study was funded through a Health Education England (HEE)/National Institute for Health and Care Research (NIHR) Clinical Lectureship (Sam Malins, NIHR301292). The views expressed are those of the authors and not necessarily those of the NIHR, the National Health Service or the Department of Health and Social Care.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

Supplemental material

Supplemental material for this article is available online.

	Baseline (n = 9)		Post-intervention (n = 8)		Change (n = 8)
Measure	M	SD	M	SD	Wilcoxon Z	p value	Effect size (d_RM)	95% CI lower	95% CI upper
AIM, IAM, FIM (/20)
Acceptability	15.6	1.4	18.9	1.8	2.57	.008**	2.25	1.04	3.47
Appropriateness	16.1	1.8	18.3	2.2	1.88	.060	1.16	0.23	2.09
Feasibility	14.9	2.2	16.6	2.9	1.11	.268	0.67	−0.27	1.61
CEQ (/27)
Credibility	18.6	4.4	23.9	2.6	2.11	.039*	1.11	0.13	2.09
Outcome Expectancy	15.4	4.3	17.9	5.9	1.54	.148	0.49	−0.04	1.03
Adapted Facilitative Interpersonal Skills Task (/5)
Verbal Fluency	3.5	0.6	4.6	0.4	2.53	.008**	1.95	0.85	3.05
Hope	3.6	0.4	4.0	0.5	2.20	.028*	1.56	0.40	2.72
Persuasiveness	3.7	0.4	4.4	0.4	2.54	.008**	1.53	0.63	2.44
Emotional Expression	3.8	0.5	4.7	0.4	2.38	.016*	1.73	0.49	2.97
Warmth, Acceptance and Understanding	3.8	0.3	4.5	0.4	2.20	.027*	2.13	0.66	3.59
Empathy	3.7	0.5	4.4	0.5	2.38	.018*	1.49	0.56	2.42
Alliance Bond Capacity	4.0	0.4	4.5	0.3	2.21	.028*	1.18	0.30	2.07
Alliance Rupture-Repair Responsiveness	3.7	0.5	4.5	0.2	2.52	.008**	1.40	0.39	2.41
Total	3.7	0.3	4.4	0.3	2.52	.008**	2.57	1.21	3.94
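
The change columns in the table above report a Wilcoxon signed-rank Z, a p value and a repeated-measures effect size (d_RM) for eight paired therapist scores. As a minimal sketch of how such statistics can be computed, the Python below implements an exact two-sided signed-rank test and one common definition of d_RM. The d_RM formula (mean difference divided by the SD of the differences, scaled by sqrt(2(1 − r))), the function names and the example scores are all assumptions made for illustration; the source does not state which formula or software was used, and the scores are not study data.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def d_rm(pre, post):
    """Assumed d_RM: (mean diff / SD of diffs) * sqrt(2 * (1 - r))."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / stdev(diffs) * sqrt(2 * (1 - pearson_r(pre, post)))


def signed_ranks(pre, post):
    """Rank |diff| (zero diffs dropped, ties share the average rank)."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        i = j + 1
    return ranks, [d > 0 for d in diffs]


def wilcoxon_exact(pre, post):
    """Return (W+, Z from the normal approximation, exact two-sided p)."""
    ranks, positive = signed_ranks(pre, post)
    if not ranks:
        raise ValueError("all paired differences are zero")
    w_plus = sum(r for r, pos in zip(ranks, positive) if pos)
    centre = sum(ranks) / 2                      # expected W+ under the null
    z = (w_plus - centre) / sqrt(sum(r * r for r in ranks) / 4)
    observed = abs(w_plus - centre)
    # Enumerate every assignment of signs to the ranks (2**n cases).
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - centre) >= observed:
            extreme += 1
    return w_plus, z, extreme / 2 ** len(ranks)


# Hypothetical scores for eight therapists (NOT data from the study).
pre = [15, 16, 14, 17, 15, 16, 14, 18]
post = [19, 20, 17, 19, 18, 21, 19, 20]

w_plus, z, p = wilcoxon_exact(pre, post)
print(w_plus, round(z, 2), p)  # p is 2 / 2**8 = 0.0078125: all eight differences are positive
print(round(d_rm(pre, post), 2))
```

With eight pairs, the smallest exact two-sided p value is 2/2^8 ≈ .008, reached when every therapist changes in the same direction. This is consistent with the .008 entries in the table, although the source does not say whether exact or asymptotic p values were reported.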