Abstract
Background:
Annual national diabetes audit data consistently show that most people with diabetes do not achieve blood glucose targets for optimal health, despite the large range of treatment options available.
Aim:
To explore the efficacy of a novel clinical intervention to address physical and mental health needs within routine diabetes consultations across health care settings.
Methods:
A multicenter, parallel group, individually randomized trial comparing consultation duration in adults diagnosed with T1D or T2D for ≥6 months using the Spotlight-AQ platform versus usual care. Secondary outcomes were HbA1c, depression, diabetes distress, anxiety, functional health status, and healthcare professional burnout. Machine learning models were used to analyze the data collected from the Spotlight-AQ platform to validate the reliability of the question-concern association, to identify key features that distinguish people with type 1 and type 2 diabetes, and to identify important features that distinguish different levels of HbA1c.
Results:
n = 98 adults with T1D or T2D, with any HbA1c and receiving any diabetes treatment, participated (n = 49 intervention). Consultation duration was reduced in the intervention group by 0.5 to 4.1 minutes (3%-14%) versus no change in the control group (−0.9 to +1.28 minutes). HbA1c improved in the intervention group by 6 mmol/mol (range 0-30) versus 3 mmol/mol (range 0-8) in the control group. Moderate improvements in psychosocial outcomes were seen in the intervention group: improved functional health status and well-being, and reduced anxiety, depression, and diabetes distress. None were statistically significant. HCPs reported improved communication and greater focus on patient priorities in consultations. Artificial intelligence examination highlighted that therapy and psychological burden were most important in predicting HbA1c levels. The natural language processing semantic analysis confirmed the mapping relationship between questions and their corresponding concerns. The machine learning model revealed that people with type 1 and type 2 diabetes have different concerns regarding psychological burden and knowledge. Moreover, the machine learning model emphasized that individuals with varying levels of HbA1c exhibit diverse levels of psychological burden and therapy-related concerns.
Conclusion:
Spotlight-AQ was associated with shorter, more useful consultations; with improved HbA1c and moderate benefits on psychosocial outcomes. Results reflect the importance of a biopsychosocial approach to routine care visits. Spotlight-AQ is viable across health care settings for improved outcomes.
Introduction
Person-centered health care is crucial to effective diabetes management and support; however, what that means in practice is poorly understood and poorly applied in both research settings and clinical care. 1 The result is a failure to adequately deliver truly person-centered care, despite numerous regulatory bodies globally including it as a key outcome in health care delivery. Although the World Health Organization Charter in 1948 defined health as a “state of complete physical, mental and social well-being,” 2 health care systems globally have struggled to structure chronic condition health care along those constructs.
Physical health measures have long been used as metrics for glycemic control and risk of diabetes-related complications. 3 HbA1c is the most commonly recognized, with time in target glycemic range increasingly used. However, the psychological and social impact of diabetes and its treatments on people living with type 1 or type 2 diabetes (T1D or T2D) is rarely considered equally alongside physical health; thus, parity of esteem remains far from a reality. 1
It is important, therefore, to recognize and incorporate the unique needs of each individual and support self-management behaviors that reflect individual empowerment to achieve physical and mental health well-being. International Diabetes Federation (IDF) data show that 10% of global health expenditure is spent on diabetes ($760 billion), predicted to rise to $825 billion by 2030. 4 Indirect diabetes-related and societal costs from premature death, disability, and other health complications add approximately 35% to global health expenditure each year. 4 There are also additional, often intangible, personal costs that are less visible; these include worry, anxiety, discomfort, pain, loss of independence, concerns about managing diabetes, fears of future complications, and their potential impact on quality of life.
Depression is commonly reported to be 2 to 3 times more prevalent in people with diabetes than the general population. 5 This figure perhaps overshadows the significant number of individuals who do not report symptoms of depression, but experience diabetes distress. In T2D, distress (but not depression) is related to suboptimal glycemic control and change in distress (but not change in depressive symptoms) is associated with both short- and long-term change in glycemic control. 6 Similar relationships are found in T1D with diabetes-specific emotional distress related to glycemic control and linked to worsening diabetes management over time. 7
Burnout amongst health care professionals is a key challenge affecting health care practice, safety, and quality of care with 20% to 30% of frontline health care workers leaving medicine. 8 Approximately half of US doctors experience substantial symptoms of burnout, with burnout almost twice as prevalent among US doctors as among workers in other fields. 9 Nurses experience a similarly high prevalence of burnout and depression, with 43% reporting high degrees of emotional exhaustion. There are significant correlations between a doctor’s sense of depersonalization and patient satisfaction with their hospital care, and between a doctor’s job satisfaction and both patient satisfaction with their health care and patient-reported adherence to medical advice. 9
The aim of this study was to evaluate the efficacy of a clinical tool consisting of a preclinic assessment that identifies patient priority concerns and maps them to evidence-based, theory-driven resources to address those concerns. A further aim was to use artificial intelligence, including machine learning and natural language processing, to explore the reliability of the relationship between questionnaire items and concerns and to investigate the contributing factors underlying the observed differences between T1D and T2D patients and between diabetes patients with different HbA1c levels. This addresses an urgent unmet need in routine care to deliver patient-centered health care and improve health care professional work-related quality of life.
Methods
Study Design
We conducted an exploratory multicenter, parallel group, randomized controlled trial in primary (GPs) and secondary care (endocrinologists and diabetologists) NHS sites in adults with T1D or T2D attending routine outpatient appointments for their diabetes care. Due to challenges with COVID and exceptional pressures in the NHS, recruitment had to be shortened, so it was not possible to achieve the 172 participants needed for the power calculation. Following provision of informed consent, participants were randomized on a 1:1 basis using computerized randomization software. Those randomized to the intervention group were asked to complete study questionnaires every three months and the Spotlight-AQ preclinic assessment (approximately 3.5 minutes to complete) within a week prior to their scheduled in-person or remote routine outpatient diabetes appointment. The workflow for participants is shown in Figure 1.

Figure 1. Participant Spotlight-AQ workflow.
The results were discussed during the outpatient visit, along with mapped care pathways, in a partnered best-fit action plan. Health care professionals completed the Maslach Burnout Inventory at baseline and six months.
Study Questionnaires
All participants were asked to complete the following validated questionnaires, which represent psychosocial domains and important factors relevant to quality of life and self-management behaviors for people with diabetes:
Diabetes Distress Scale and all subscales, eg, emotional distress, physician-related distress, and treatment-related distress.
PHQ-2 and GAD-7 to assess depression and anxiety, respectively.
EQ5D for health economic analyses and functional health status.
WHO-5 well-being index.
Participants also answered additional questions pertaining to acceptability, treatment satisfaction, and relevance of Spotlight-AQ. Health care professional participants completed the Maslach Burnout Inventory.
Question-Concern Mapping
To validate the reliability of the question-concern association, a subset of 64 question-concern pairs was collected from the Spotlight-AQ platform. Each question represented an area of concern related to psychological well-being, therapy issues, self-management knowledge, or social support. In this study, an AI-based language model, BERT 10 (bert-base-cased), was employed to analyze the data. There are several important reasons for using a large language model (LLM). First, as has been shown by recent advances in LLMs, these models are capable of creating a nuanced understanding of natural language and enable a representation that is inherently machine-understandable. Moreover, the large amounts of data these models have been trained on allow them to pick up on language that humans understand (such as the specific wording of questions) and use that to develop a better understanding of the answers for tasks such as classification. Finally, using an AI model allows the classification/regression we perform on these data to be data-driven. We can then query this model to identify which aspects of the data the AI model considered important and correlate that to known clinical models.
The model is pretrained on a large data set of English-language text before it is applied to the Spotlight-AQ questions. This pretraining allows the model to learn how to embed natural English text (such as the answers to Spotlight-AQ questions) into a representation that is machine interpretable. As is standard practice in AI-based classification and regression, the Spotlight-AQ data must be split into a training set and a test set. Stratified splitting, which keeps the class distribution of the training and test sets consistent, was performed on the entire data set. The questions were split into a training set comprising 48 questions and a test set consisting of 16 questions. The training set allows the AI language model to be “fine-tuned” to the Spotlight-AQ data set. More explicitly, the model has a general understanding of English text, but must be refined into a model that can understand the distribution of Spotlight-AQ questions. The test set allows the performance of the model to be evaluated. The BERT pretrained auto-tokenizer was employed to tokenize the questions. Tokenization allows English text to be split into a series of “tokens,” or language units, to be processed by the AI model. Subsequently, a model for sequence classification from the Hugging Face Transformers library was fitted to the training set in this multiclass classification task. The accuracy on the test set was used to assess the performance of the fine-tuned model.
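The stratified 48/16 split described above can be illustrated with a minimal sketch. The concern labels, their even distribution across the 64 questions, and the function name `stratified_split` are assumptions made for illustration; they are not taken from the Spotlight-AQ data or code.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) so each class keeps its share in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # random choice within each class
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical labels: 64 questions spread evenly over four concern areas.
CONCERNS = ["psychological", "therapy", "knowledge", "social_support"]
labels = [CONCERNS[i % 4] for i in range(64)]

train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
print(Counter(labels[i] for i in test_idx))
```

Under these assumptions the split yields 48 training and 16 test questions, with every concern area represented in the test set in the same proportion as in the full set.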
Machine Learning Model Development
To investigate whether there are differences between people with T1D or T2D and between individuals with different HbA1c levels, and to explore the contributing factors underlying the observed differences, two machine learning models were developed based on the intervention group responses. First, a data set containing questions and answers from 42 participants was gathered. As Spotlight-AQ is a “smart” adaptive patient questionnaire and the number of questions participants need to answer varies based on their responses to previous questions, not all questions were answered by every patient. Questions that were answered by more than 90% of participants were retained. The answers the patients gave to those questions can then be considered as “features” or, more explicitly, markers that can be used to determine certain characteristics of that person. As an oversimplified example, a person indicating stress or concern about glucose levels might indicate that their HbA1c is elevated. The features are then defined by the answers to the questionnaire, with each answer carrying a weight depending on the strength of the answer given. To accommodate missing values, a standard k-nearest neighbors (KNN) algorithm was used. For each participant, the five participants with the most similar answers were found, and the missing answers were imputed with the median across those five nearest participants.
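A minimal sketch of the imputation step follows. It assumes answers are encoded as numbers with `None` for a missing answer, that similarity is a Euclidean distance over the questions both participants answered, and that donors are the nearest participants who actually answered the question being filled; none of these details are specified in the text, and the data are invented for illustration.

```python
import math
from statistics import median

def distance(a, b):
    """Euclidean distance over questions both participants answered."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    # Average over shared answers so sparse overlaps are not favoured.
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_median_impute(rows, k=5):
    """Fill each missing answer with the median of the k most similar donors."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            continue
        neighbours = sorted(
            (n for n in range(len(rows)) if n != i),
            key=lambda n: distance(row, rows[n]),
        )
        for j in missing:
            donors = [rows[n][j] for n in neighbours if rows[n][j] is not None][:k]
            if donors:
                filled[i][j] = median(donors)
    return filled

# Hypothetical answers from seven participants to three retained questions.
rows = [
    [1, 2, None],
    [1, 2, 3],
    [1, 2, 3],
    [1, 3, 3],
    [2, 2, 4],
    [1, 2, 4],
    [5, 5, 1],
]
filled = knn_median_impute(rows)
print(filled[0])
```

Using the median rather than the mean keeps the imputed value on the original answer scale when the number of donors is odd, which suits ordinal questionnaire responses.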
Sample set Partitioning based on joint X-Y distances (SPXY) was performed to split the data set into a training set (80%) and a test set (20%). SPXY allows us to select representative training and test sets by ensuring that the distances of each selected sample satisfy a maximum-minimum distance constraint. A binary classification algorithm using a random forest classifier was employed to predict whether a person has T1D or T2D. Random forests are ensemble classifiers which use a set of random decision trees to reach a consensus decision. They are well suited to smaller amounts of data, requiring less training data than an equivalent neural network due to the stochastic, uncorrelated nature of each tree. For HbA1c level prediction, participants’ HbA1c values were categorized into three ranges: less than 7% (<53 mmol/mol), 7% to 8.5% (53 to 69 mmol/mol), and greater than 8.5% (≥70 mmol/mol). A similar random forest multiclass classifier was developed to predict patients’ HbA1c levels according to their answers to the questionnaire. Both models were evaluated by their accuracy on the test set and by leave-one-out cross-validation scores. Feature importance values were collected from both models to explore the contributing factors underlying the observed differences.
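The SPXY selection rule can be sketched as follows: distances in the feature space (X) and the label space (y) are each normalized by their maximum and summed, the two most distant samples seed the training set, and further samples are added one at a time by choosing the candidate whose minimum distance to the already selected samples is largest. This is a plain-Python illustration of the published SPXY idea, not the authors' implementation; the function names and the toy data are assumptions.

```python
import math

def euclid(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def pairwise(points):
    n = len(points)
    return [[euclid(points[i], points[j]) for j in range(n)] for i in range(n)]

def spxy_split(X, y, train_fraction=0.8):
    """Return (train_idx, test_idx) chosen by the max-min joint X-Y distance rule."""
    n = len(X)
    dx = pairwise(X)
    dy = pairwise([[v] for v in y])
    max_dx = max(max(r) for r in dx) or 1.0  # avoid division by zero
    max_dy = max(max(r) for r in dy) or 1.0
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)] for i in range(n)]

    n_train = round(n * train_fraction)
    # Seed with the two samples that are farthest apart in joint distance.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: d[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - set(selected)
    while len(selected) < n_train:
        # Pick the candidate farthest from its closest already-selected sample.
        nxt = max(sorted(remaining), key=lambda r: min(d[r][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return sorted(selected), sorted(remaining)

# Hypothetical data: ten participants, two features, binary diabetes type.
X = [[float(i), float(i % 3)] for i in range(10)]
y = [0] * 5 + [1] * 5
train_idx, test_idx = spxy_split(X, y)
print(train_idx, test_idx)
```

Because extreme samples are selected first, the training set spans the observed range of both answers and labels, and the held-out samples tend to lie inside that range.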
Results
n = 98 adults with T1D or T2D, with any HbA1c and receiving any diabetes treatment, participated (n = 49 intervention). See Table 1 for demographic details.
Table 1. Participant Demographics.
Intervention: n = 32 T1D, n = 16 T2D; Control: n = 29 T1D, n = 21 T2D.
Abbreviations: BG, blood glucose; T1D, type 1 diabetes; T2D, type 2 diabetes.
The primary outcome, consultation duration, was significantly reduced in the intervention group by 0.5 to 4.1 minutes (3%-14%, P ≤ .001) compared with no change in the control group (−0.9 to +1.28 minutes). Baseline HbA1c was similar between groups: mean 63.3 mmol/mol (41-94; 7.9%) in the intervention group versus mean 66.7 mmol/mol (34-116; 8.2%) in the control group. HbA1c improved in the intervention group by 6 mmol/mol (range 0-30; 0.55%) versus 3 mmol/mol (range 0-8; 0.25%) in the control group; this result was statistically significant (P ≤ .0001).
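HbA1c is reported here in both mmol/mol (IFCC units) and percent (NGSP units). A minimal sketch of the conversion follows, assuming the widely used IFCC-NGSP master equation (NGSP % = 0.09148 × IFCC mmol/mol + 2.152); the text does not state which conversion the authors applied, so the function names and the choice of equation are assumptions.

```python
SLOPE = 0.09148      # NGSP percentage points per IFCC mmol/mol
INTERCEPT = 2.152    # NGSP percent at 0 mmol/mol

def mmol_to_percent(ifcc_mmol_mol):
    """Convert an absolute HbA1c value from mmol/mol to percent."""
    return SLOPE * ifcc_mmol_mol + INTERCEPT

def percent_to_mmol(ngsp_percent):
    """Convert an absolute HbA1c value from percent to mmol/mol."""
    return (ngsp_percent - INTERCEPT) / SLOPE

def change_mmol_to_percent(delta_mmol_mol):
    """Convert a change in HbA1c; the intercept cancels, so only the slope applies."""
    return SLOPE * delta_mmol_mol

print(round(mmol_to_percent(63.3), 1))       # baseline mean, intervention group
print(round(change_mmol_to_percent(6), 2))   # improvement, intervention group
print(round(percent_to_mmol(7.0)))           # lower category boundary
```

The distinction between absolute values and changes matters: a 6 mmol/mol improvement corresponds to roughly half a percentage point, not to the percentage that 6 mmol/mol would represent as an absolute value.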
Moderate improvements were observed in psychosocial outcomes for the intervention group: improved functional health status (EQ5D) and well-being (WHO-5), and reduced anxiety (GAD-7), depression (PHQ-2), and diabetes distress (DDS). None were statistically significant. Self-reported treatment satisfaction improved for adults with T1D but did not change for adults with T2D in the intervention group. The opposite was seen for control participants, among whom adults with T2D reported greater satisfaction than those with T1D. There were no intervention-associated adverse events.
HCPs reported improved communication and greater focus in consultations. Free-text responses showed that, prior to use of the Spotlight-AQ intervention, HCPs felt frustration at not being able to deliver the high-quality care they are capable of due to high absence rates, having to cover, constant pressure of failure to meet targets, high did-not-attend (DNA) rates by patients, and perceptions that patients simply do not listen to them or take their advice. Following use of the intervention, HCPs reported high levels of satisfaction, particularly around the greater focus on solution-based care pathways and improved understanding of the challenges participants face beyond glycemic control.
Question-Concern Association
The analysis indicates that the question-concern pairs collected from the Spotlight-AQ platform were highly reliable. The multiclass classification model achieved an accuracy of 87.5% on the test set, which corresponds to 14 of the 16 test questions being assigned to the correct concern. This indicates that the questions were closely related to the concerns they were intended to address.
Machine Learning Model
The results of the two machine learning models (the type 1/2 classifier and the HbA1c classifier) are reported below. In both cases, the machine learning models have access to all question/answer pairs and must use the entire data set to perform the classification task. Once the models are trained, we evaluate their performance on the test set and subsequently use interpretability metrics, namely, SHAP values (SHapley Additive exPlanations), to pinpoint the most important features used by the AI to perform the task.
The type 1/2 classifier showed that the questions related to psychological burden and therapy issues accounted for the largest share of feature importance (see Figure 2), which emphasizes that adults with T1D and T2D exhibit diverse levels of psychological well-being and therapy-related concerns.

Figure 2. (a) Feature importance of the random forest classification model for predicting the type of diabetes. (b) Feature importance of the random forest classification model for predicting the participants’ HbA1c levels.
The classifier for predicting HbA1c levels achieved an accuracy of 75% on the test set, with a leave-one-out cross-validation score of 61%. Similarly, questions related to psychological burden and therapy issues accounted for the largest share of feature importance, which emphasizes that patients with varying levels of HbA1c exhibit diverse levels of psychological burden and therapy-related concerns.
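For readers unfamiliar with the metric, a leave-one-out cross-validation score is obtained by holding out each participant in turn, training on all the others, and reporting the fraction of held-out participants predicted correctly. The sketch below uses a simple nearest-neighbor rule as a stand-in for the random forest so that it runs without external libraries; the classifier, the single feature, and the data are all illustrative assumptions.

```python
def nearest_neighbour_predict(train_X, train_y, x):
    """Stand-in classifier: return the label of the closest training sample."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]

def leave_one_out_accuracy(X, y, predict):
    """Hold out each sample once, train on the rest, and average the hits."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)

# Hypothetical one-feature data with the three HbA1c categories as labels.
X = [[0], [1], [2], [10], [11], [12], [20], [21], [22], [7]]
y = ["<7%"] * 3 + ["7-8.5%"] * 3 + [">8.5%"] * 3 + ["<7%"]

score = leave_one_out_accuracy(X, y, nearest_neighbour_predict)
print(f"{score:.0%}")
```

With a small sample, every participant serves as a test case exactly once, which is why leave-one-out scores are often reported alongside a single held-out test accuracy.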
It should be noted that identifying these features as the most important is an entirely data-driven decision, which was not biased by the training or test regime in any way. It is a clear indication that there are important psychological and therapy-related concerns which differ between participants with T1D and T2D, as well as between people with varying HbA1c levels.
Discussion
Ninety-eight adults with T1D or T2D participated in a multicenter randomized controlled trial to determine the efficacy of the Spotlight-AQ platform in routine care. Results showed the primary outcome was achieved with a significant reduction in consultation length for intervention participants of 0.5-4.1 minutes (3-14%) versus no change in the control group (−0.9 to +1.28 minutes) (P ≤ .0001). Furthermore, a significant improvement in glycemic control, as measured by HbA1c, was observed in the intervention group of 6 mmol/mol (range 0-30) versus control group 3 mmol/mol (range 0-8) (P ≤ .001). Although improvements were observed on psychosocial outcomes, these were not statistically significant between intervention and control group. No participants in either group demonstrated a deterioration in psychosocial outcomes.
High-quality and patient-centered health care is dependent upon the well-being and safety of health care professionals. 11 Overrunning clinics take a toll on health care professionals and deny them time to refocus between patient visits. The time-saving nature of the intervention both enables clinics to run on time and offers health care professionals time to pause, refocus, and give the next patient their full attention with active listening and empathy.
Artificial intelligence analyses were used to validate the Spotlight-AQ platform. Therapy and psychological burden were identified as the most important factors in predicting HbA1c levels amongst participants, irrespective of whether they had T1D or T2D. Machine learning and natural language processing semantic analyses confirmed the mapping relationship between questions and their corresponding concerns. Furthermore, machine learning modeling confirmed that adults with T1D or T2D have different concerns regarding social support and unmet knowledge needs. Because of the different treatment regimens required for T1D compared with T2D, these differences were anticipated, and the Spotlight-AQ platform is designed to accurately detect the uniqueness of each individual across a biopsychosocial perspective. The fact that self-management skills rank behind therapy issues and psychological well-being could demonstrate that key messages are retained from education received and that the daily burden of living with diabetes becomes more prominent in terms of support required. This is counter to the medical model of health care, which focuses on physical health and education to achieve it.
Health care professional burnout is at an all-time high 8,9 with recent warnings from the BMJ about the associated harms. 12 Burnout is detrimental not only to HCP well-being but also to patient care and health care costs. 9 It is well reported that burnout is associated with medical errors, poor quality of care, and low patient satisfaction. 9 The complexity is somewhat lost in these data, however, because job performance can still be maintained even when staff are so burnt out that they lack mental or physical energy, prioritizing high-priority tasks but neglecting so-called low-priority tasks such as reassuring patients. 13 Facilitating breathing space between patient visits enables HCPs to refocus and begin the next consultation with the same energy and quality of care they started the day with. These burnout-associated costs also add considerable burden to the overall costs of diabetes, with IDF data already showing a steep increase in estimated global expenditure by 2030.
It has long been acknowledged that the medical model of health care is a poor fit for the support and treatment of long-term conditions including diabetes. A considerable challenge has been, however, that it has not been possible to deliver a biopsychosocial model within the constraints of existing routine health care structures. So, while there has been a willingness to be more holistic in delivery of health care, there has also been a reticence about its achievability in the real world. Consultation length was selected as the primary outcome to determine whether this biopsychosocial model of health care was deliverable within the constraints of existing health care structures. This clinical trial assessing the efficacy of the Spotlight-AQ platform, along with previous research including the underpinning theoretical model of care, 1 a pilot study, 14 and a feasibility study, 15 clearly demonstrates that it is indeed possible. Furthermore, both patient and health care professional participants reported that the intervention resulted in more focused and meaningful consultations, delivered within that shorter time.
It is recognized that other external variables may have contributed to the change in HbA1c for participants. The purpose of Spotlight-AQ is to provide clarity of priority concerns of each individual person living with diabetes, along biopsychosocial domains so that routine visits could meet unmet needs that are not usually addressed. As such, this broader focus facilitates discussion beyond HbA1c and glycemic parameters and enables wider person-centered support to be offered. This is in line with recommendations from the World Health Organization, NICE and the American Diabetes Association.
The external validation using advanced AI analyses and methodologies further demonstrates the robustness of the intervention and its accuracy in understanding and meeting unmet priority concerns of people with T1D or T2D. This scrutiny further shows that the therapies and devices people are offered as well as reducing the psychological burden of diabetes management are the biggest factors in reducing HbA1c. These data reinforce the biopsychosocial approach as not only deliverable, but desirable and necessary in routine care for people with diabetes.
A strength of this study is that it is the first to demonstrate that the biopsychosocial model is deliverable within routine care. Furthermore, it shows that such an approach results in shorter, more focused consultations, improved glycemic control, improved psychosocial outcomes for people with diabetes, and reduced burnout for health care professionals. The study is limited, however, by the need to close recruitment early due to COVID restrictions and staff demands at the NHS trust. Future research is planned to adapt the existing intervention for use with those with low literacy and poor access to digital technologies.
Conclusion
Spotlight-AQ was associated with shorter, more useful consultations; with significantly improved HbA1c and moderate benefits on psychosocial outcomes. Results reflect the importance of a biopsychosocial approach to routine care visits. Spotlight-AQ is viable across health care settings for improved outcomes.
Footnotes
Acknowledgements
The authors thank all of our participants for their willingness to take part, and thank Southern Health NHS Foundation Trust for their sponsorship.
Abbreviations
T1D, type 1 diabetes; T2D, type 2 diabetes; AQ, Algorithmic Questionnaire.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Spotlight-AQ owns the preclinic assessment platform. RCK and KBK are founders and shareholders in Spotlight-AQ. All other authors have no financial or other competing interests.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
