Abstract
Objectives:
Burnout is an impactful and highly prevalent concern within healthcare systems. Identifying a valid, accessible measure of this occupational phenomenon is crucial to its identification, intervention, and evaluation. The objective of this study was to compare a newly developed, positively phrased burnout question to the Mini-Z single-item emotional exhaustion measure and a single-item depersonalization measure.
Methods:
Using cross-sectional survey data from four rural hospitals in the United States Mountain West (n = 457), we utilized Cohen’s kappa statistics, intraclass correlation coefficients, and Bland–Altman plots to assess agreement for dichotomous and continuous versions of the new, single-item burnout measure with the gold standard emotional exhaustion and depersonalization single items.
Results:
Based on the new Utah single-item measure, the prevalence of burnout was 30.4%; however, it was 39.4% and 19.3% as measured by emotional exhaustion and depersonalization, respectively. Our analysis demonstrated a substantial level of agreement with the Mini-Z single-item emotional exhaustion (Cohen’s kappa: 0.61; intraclass correlation coefficient: 0.64) and a fair level of agreement with the single-item depersonalization (Cohen’s kappa: 0.36; intraclass correlation coefficient: 0.49).
Conclusions:
Burnout can be easily, quickly, and routinely screened among healthcare professionals. Further, when medical professionals are regularly assessed for increased burnout, proactive team and system-level steps can be implemented to prevent burnout from occurring or worsening. Our findings suggest initial validation of a new, positively phrased single-item burnout measure, showing substantial agreement with the widely used Mini-Z emotional exhaustion item.
Introduction
Burnout is an occupational phenomenon characterized by symptoms such as emotional fatigue, cynicism, depersonalization (DP), and a diminished sense of efficacy and accomplishment in the workplace.1–3 The prevalence of burnout among physicians and physicians-in-training has been reported to approach or exceed 50%.4–8 Nearly 32% of nurses reported leaving their current employment due to burnout in 2018, and the prevalence of emotional exhaustion (EE) among nurses has been increasing since the COVID-19 pandemic began, with one study reporting a prevalence of 40.6% in 2019 and 49.2% in 2021.9,10 The estimated cost of physician-related turnover and reduced clinical hours attributable to burnout each year in the United States is $4.6 billion. 11 Burnout prevalence in rural medical settings, ranging from 25% to 39%, is lower than national rates or rates in urban medical settings but is still notable.12–14 Overall, these trends highlight the pervasiveness of burnout within the medical field. Burnout significantly impacts the mental health of medical providers, in some cases leading to depression, substance use, family issues, suicide, and other significant psychosocial concerns.15–17 In addition, it has been demonstrated to impact relationships between healthcare staff and the cohesion of medical teams.16,17
One study that evaluated measures of resilience within a physician population and the general population found that resilience was higher among physicians than among the general working population in the United States. 16 However, even among the most resilient physician subgroups, these researchers observed significant rates of burnout. 18 Burnout has reached epidemic levels within medical systems, 19 and has only been exacerbated by the COVID-19 pandemic.10,15 Thus, burnout of medical providers is a concern within healthcare and even in broader contexts.
Consequences of occupational burnout extend beyond medical professionals, producing negative effects on patient care and treatment.7,20 Burnout has been found to be associated with increased rates of medical errors and patient dissatisfaction.8,16,17 Therefore, burnout can produce unnecessary added risk in healthcare settings. The impact of burnout within the healthcare system is thus a significant issue, as its consequences are far reaching. Improving understanding of contributing causes, possible solutions to this critical health issue, and ways to evaluate or reliably, readily, and easily measure this phenomenon are important tasks within medical research.
Previous cross-sectional studies have evaluated physician burnout measurements via assessments of the gold standard Maslach Burnout Inventory (MBI) within differing samples of medical providers.20–24 The MBI consists of 22 questions, assessing the following three domains: EE, DP, and personal accomplishment. 25 In studies validating different measures of burnout against the gold standard MBI, each of these research teams reached a similar two-part conclusion: (1) medical professionals are experiencing high rates of burnout and (2) a single-item EE measure (as in the Mini-Z survey) is significantly correlated with the MBI EE subscale,26,27 indicating that this single item may serve as a successful alternative to the longer survey form when EE is of primary interest.21–24 These findings also hold for DP, as single-item DP measures have likewise been validated against the DP subscale of the MBI,20,24 though less frequently than the EE subscale.
The accessibility of multiple options for burnout assessment has helped further evaluation of well-being across healthcare systems. However, measuring burnout through survey administration remains challenging due to time and financial constraints, potential survey fatigue, limitations with survey structure and items, low response rates, or the perceived need to focus on other matters. 28 These barriers may be even more pronounced in rural healthcare settings. 29
In this study, we aimed to validate a new, single-item burnout measure against gold standard measures of burnout, as defined by EE and DP, 28 via the Mini-Z’s single-item burnout question within a population of rural healthcare workers in Utah and eastern Idaho. 27 This was done by first validating the new measure against EE and DP individually and then against a combined measure of EE and DP. We hypothesized that we would see significant agreement with EE and DP. We hoped that the new burnout measure could capture the essence of burnout, without the time and cost of using the MBI and requiring staff to complete multiple engagement questionnaires. This study specifically addresses the need for a valid single-item burnout measure that aligns with positively worded Likert scales required by many institutional survey platforms, where the Mini-Z item cannot be integrated directly.
New contributions
This study introduces a novel, single-item burnout measure for rural healthcare professionals, marking a significant advancement in occupational health assessment. Our innovative, positively phrased measure diverges from traditional multi-item scales, offering a succinct and user-friendly tool validated against established Mini-Z EE and DP items. This methodological rigor, demonstrated through Cohen’s kappa statistics, intraclass correlation coefficients (ICCs), and Bland–Altman plots, is particularly pertinent for underrepresented rural healthcare settings where resources for extensive surveys are limited. Our findings not only provide a particular understanding of burnout prevalence in these settings but also suggest the measure’s potential applicability in diverse healthcare environments. The simplicity and effectiveness of this new tool, aligning well with existing employee engagement surveys, enhance its practicality and pave the way for future research in burnout measurements across various populations and settings. This contribution is significant in advancing how burnout is assessed, understood, and addressed in healthcare environments, particularly in rural areas.
Methods
Settings
This was a cross-sectional, quality improvement study using voluntary survey data collected for program evaluation purposes. The Resiliency Center at University of Utah Health was awarded a 3-year grant from the U.S. Health Resources and Services Administration (HRSA) in part for promoting resilience and mental health among the rural health care workforce (#U3MHP45387). To establish a baseline, between May and September 2022, well-being surveys were administered at three rural hospitals in Utah and one in Idaho. These surveys were administered as part of the program evaluation of the grant activities. The surveys were voluntary, and no incentives were provided. The measures included in the survey were selected to allow comparison with local and national data on burnout, providing context for the survey results within the local hospital systems. The total number of employees across the four hospitals was ~1500, and 457 individuals completed the survey, yielding a response rate of 30.5%. This evaluation was acknowledged as nonhuman subjects research by the University of Utah Institutional Review Board (IRB #00151642).
Invited participants included all employees and contracted providers of the hospital at the time of the survey; no exclusionary criteria were applied. Participants were invited to complete the survey via Research Electronic Data Capture with an initial email invitation and three follow-up reminders. Per IRB #00151642, documentation of informed consent was waived for this nonhuman subjects quality-improvement project; the invitation served as an information sheet, and survey completion constituted implied consent. The email sent to participants read, “The University of Utah Health Resiliency Center is partnering with [your] Hospital to better understand and improve professional well-being at your organization. As part of this collaboration, we invite you to complete the following survey to help us gather information on topics such as work-related stress and burnout. This is an annual survey that will take ~10 minutes to complete. The information collected in this survey will be used to identify areas of need and help improve professional well-being at your organization. This work is part of the HRSA grant ‘Promoting Resilience and Mental Health Among Health Professional Workforce.’ If you have any questions or concerns, please reach out to [contact].” The surveys remained open for 2 weeks.
Participants worked in clinical (e.g. physicians, pharmacists, registered nurses, social workers) or nonclinical (administrative, education, finance, human resources, information technology, medical records, other) roles. Participants were asked a range of questions relating to workplace well-being, as well as demographic information. Participants were included in the analytic cohort if they provided responses to all questions.
Measures
New burnout measure
The following new, single-item burnout question was evaluated in this research study: “Burnout is not a problem for me.” This item was originally developed in order to measure burnout in an employee engagement survey administered to faculty and staff working in Health Sciences at University of Utah Health. The brief survey, at the time administered by Dialogue™, was designed to have all items answered using the same five-point, positive direction Likert-type agreement scale.30,31 With no current single-item burnout measures meeting this criterion, a new, single item to assess burnout was created out of necessity. This new burnout question was piloted in 2019 and included in every annual Health Sciences employee engagement survey through 2022. It was also included on every quarterly engagement survey for University of Utah Health Hospitals and Clinics employees, using a different survey platform, from fall 2020 through summer 2023. For the remainder of this article, we will refer to this new burnout item as the “Utah single-item burnout measure” or the “Utah item.” This item was developed independently for survey integration and was subsequently validated against, rather than derived from, the Mini-Z EE and DP single items.
The Utah single-item burnout measure was rated on the following Likert scale: (0) strongly agree, (1) agree, (2) neither agree nor disagree, (3) disagree, (4) strongly disagree. This item was intentionally phrased in the positive direction (“Burnout is not a problem for me”) to conform with institutional survey design requirements, which mandated positively worded agreement-scale items. Such formats are common in organizational assessments and, when validated appropriately, do not necessarily compromise the construct validity of the measure. 32 The scale was then reversed in the scoring process to have higher scores be more indicative of burnout. This item was also dichotomized into a binary indication by classifying scores ⩽2 as “no burnout” and scores ⩾3 as “burnout.” Because positive wording can lower endorsement, we assessed its impact via sensitivity analyses (e.g. reclassifying “neither agree nor disagree”), and report resulting shifts in prevalence and agreement. Supplemental analyses adjusted the dichotomization to have “neither agree nor disagree” as part of burnout, so scores ⩽1 were classified as “no burnout” and scores ⩾2 were classified as “burnout.” For clarity, the Mini-Z EE/DP items served solely as comparators in validation analyses; the Utah item’s wording and content were not adapted from those instruments.
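The scoring and dichotomization rules described above can be sketched in a few lines of Python. This is only an illustration of the stated cut-offs, not the study's analysis code (which was written in R); the function names, the `neutral_as_burnout` flag, and the sample responses are hypothetical.

```python
# Scores as listed above: higher values indicate more burnout.
UTAH_SCORES = {
    "strongly agree": 0,
    "agree": 1,
    "neither agree nor disagree": 2,
    "disagree": 3,
    "strongly disagree": 4,
}


def utah_burnout(response, neutral_as_burnout=False):
    """Classify one response to "Burnout is not a problem for me".

    Primary rule: scores >= 3 count as burnout.
    Supplemental rule (neutral_as_burnout=True): scores >= 2 count as burnout.
    """
    score = UTAH_SCORES[response.strip().lower()]
    cutoff = 2 if neutral_as_burnout else 3
    return score >= cutoff


def prevalence(responses, neutral_as_burnout=False):
    """Proportion of responses classified as burnout."""
    flags = [utah_burnout(r, neutral_as_burnout) for r in responses]
    return sum(flags) / len(flags)


# Made-up responses, for illustration only (not study data).
sample = [
    "agree",
    "neither agree nor disagree",
    "disagree",
    "strongly agree",
    "strongly disagree",
]
print(prevalence(sample))                           # 0.4
print(prevalence(sample, neutral_as_burnout=True))  # 0.6
```

Moving the neutral category across the cut-off changes only the classification of respondents scoring 2, which is why the sensitivity analysis isolates the effect of that single response option.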
Because this is a single item intended to capture the global construct directly, traditional scale-level content-validity indices were not applicable; wording was developed and refined via expert consensus review to ensure conceptual representativeness and face/content relevance.
Validated burnout measures
The validated, single-item measures consisted of EE33–35 and DP,21,22,24,27,36 which are the most foundational elements of burnout and correlate strongly with the respective MBI domains.20,36 EE is measured by asking the following, “Overall, based on your definition of burnout, how would you rate your level of burnout?” with Likert-scale responses of (1) “I enjoy my work. I have no symptoms of burnout,” (2) “Occasionally I am under stress, and I don’t always have as much energy as I once did, but I don’t feel burned out,” (3) “I am definitely burning out and have one or more symptoms of burnout, such as physical and emotional exhaustion,” (4) “The symptoms of burnout that I’m experiencing won’t go away. I think about frustration at work a lot,” and (5) “I feel completely burned out and often wonder if I can go on. I am at the point where I may need some changes or may need to seek some sort of help.” The item was also dichotomized by classifying scores ⩽2 as “no burnout” and scores ⩾3 as “burnout.” DP is measured by asking, “How often do you feel you’ve become more callous toward people since you took this job?” with Likert-scale responses of (1) never, (2) a few times a year or less, (3) once a month or less, (4) a few times a month, (5) once a week, (6) a few times a week, and (7) every day. The item was also dichotomized by classifying scores ⩽4 as “no burnout” and scores ⩾5 as “burnout.” The English version of the questionnaire used in this study is provided as Appendix A in the Supplementary Material.
Demographic information
Demographic characteristics measured included age, years since completing training, sex, race/ethnicity, Veteran status, rural background, disadvantaged background, and job type (clinical or nonclinical).
Statistical analysis
Characteristics were presented overall, as well as stratified by hospital. When comparing between hospitals, Fisher–Freeman–Halton exact tests were used for categorical variables, one-way analysis of variance for normally distributed continuous variables, and Kruskal–Wallis tests for non-normally distributed continuous variables.
The Utah single-item burnout measure was compared against EE and DP measures in a dichotomous and continuous fashion. For dichotomous outcomes, agreement between measures was calculated using Cohen’s kappa (k) statistic (and 95% confidence intervals (CIs)), 37 which measures the level of agreement between two methods beyond what would be expected by chance alone. Scores above 0 indicate agreement beyond chance, and scores of 1 indicate perfect agreement. Additionally, associations between measures were tested with McNemar’s test, Cohen’s g effect sizes, and logistic regressions. Discriminative validity was tested with sensitivity, specificity, positive predictive value, negative predictive value, and the area under the receiver operating characteristic curve (AUROC). Supplemental analyses classified Utah item scores of “neither agree nor disagree” as “burnout” instead of “no burnout” to assess how results changed.
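As an illustration of the dichotomous agreement statistics named above, the following Python sketch computes Cohen’s kappa, Cohen’s g, and the four discrimination measures from a 2×2 table. It uses the standard textbook formulas and is not the study’s R code; the function names and the example counts are hypothetical, and the reference measure (EE or DP) is assumed to define the “true” burnout status.

```python
def two_by_two(new_item, reference):
    """Cross-tabulate two equal-length sequences of booleans (True = burnout).

    Returns (a, b, c, d): a = both yes, b = new item yes / reference no,
    c = new item no / reference yes, d = both no.
    """
    if len(new_item) != len(reference):
        raise ValueError("paired sequences must have the same length")
    pairs = list(zip(new_item, reference))
    a = sum(1 for u, r in pairs if u and r)
    b = sum(1 for u, r in pairs if u and not r)
    c = sum(1 for u, r in pairs if not u and r)
    d = sum(1 for u, r in pairs if not u and not r)
    return a, b, c, d


def cohen_kappa(a, b, c, d):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = a + b + c + d
    p_o = (a + d) / n                                       # observed agreement
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2  # chance agreement
    return (p_o - p_e) / (1 - p_e)                          # undefined if p_e == 1


def cohen_g(b, c):
    """Effect size for discordant pairs: larger discordant share minus 0.5."""
    return max(b, c) / (b + c) - 0.5


def discrimination(a, b, c, d):
    """Sensitivity, specificity, PPV, and NPV, treating the reference as truth."""
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }


# Made-up counts, for illustration only (not the study's Table 2).
a, b, c, d = 40, 10, 20, 30
print(round(cohen_kappa(a, b, c, d), 2))  # 0.4
print(discrimination(a, b, c, d))
```

Kappa depends on the marginal prevalences through the chance-agreement term, so two measures with different prevalences can show high raw agreement but a more modest kappa.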
For continuous outcomes, measures were left in their original ordinal scales. The Utah single-item burnout measure was rescaled to be on the same scale as EE and DP in separate analyses. Agreement between measures was calculated using the ICC and 95% CIs with scores <0.50 indicating “poor agreement” and scores >0.90 indicating “excellent agreement.” Additionally, Bland–Altman plots were constructed which plotted the difference between the two measures (bias) on the y-axis and the average of the two measures (magnitude) on the x-axis.38,39 These were used to determine if the two measures differed from each other more or less as participants were scored at higher and lower levels of burnout. Lines were drawn for the mean of the difference to determine if one measure was higher or lower than the other on average. Paired t-tests and 95% CIs for zero bias were provided to determine if differences were significantly different from zero. Tests for independence of bias from magnitude were conducted with Pearson (r) product-moment correlations. Lines were also drawn that reflected the limits of agreement (along with 95% CIs) to give an estimate of where 95% (two standard deviations (SDs)) of the differences should lie (assuming differences are normally distributed). In other words, it was expected that the 95% limits would include 95% of differences between two measurement methods. Additionally, associations between measures were tested using Pearson (r) product-moment correlations and linear regressions along with adjusted R2’s measuring the percentage of variation in EE/DP as determined by the Utah measure. Residual diagnostics confirmed linear associations as well as model goodness-of-fit. Supplemental analyses rescaled all measures (Utah and EE/DP) to standard normal and all previously mentioned continuous analyses were repeated.
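The continuous agreement steps can be sketched as follows. Two assumptions are made here that the text does not state. First, the rescaling is taken to be a linear min–max mapping of the 0–4 Utah score onto the 1–5 (EE) or 1–7 (DP) range. Second, the ICC is taken to be the two-way, absolute-agreement, single-measure form. Differences are computed as reference minus Utah, matching the figure captions. All function names and example scores are hypothetical.

```python
from statistics import mean, stdev


def rescale(score, old_min, old_max, new_min, new_max):
    """Linear min-max rescaling (assumed; the exact method is not stated)."""
    return new_min + (score - old_min) * (new_max - new_min) / (old_max - old_min)


def bland_altman(reference, new_item, z=1.96):
    """Bias (mean difference) and 95% limits of agreement for paired scores."""
    diffs = [r - u for r, u in zip(reference, new_item)]       # y-axis: bias
    avgs = [(r + u) / 2 for r, u in zip(reference, new_item)]  # x-axis: magnitude
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "bias": bias,
        "loa_lower": bias - z * sd,
        "loa_upper": bias + z * sd,
        "points": list(zip(avgs, diffs)),
    }


def icc_absolute_agreement(x, y):
    """Two-way, absolute-agreement, single-measure ICC for two measures (assumed form)."""
    n, k = len(x), 2
    values = list(x) + list(y)
    grand = mean(values)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]
    col_means = [mean(x), mean(y)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Made-up paired scores, for illustration only (not study data).
utah_raw = [0, 1, 2, 3, 4]
ee = [1, 2, 2, 3, 5]
utah_on_ee = [rescale(u, 0, 4, 1, 5) for u in utah_raw]
result = bland_altman(ee, utah_on_ee)
print(round(result["bias"], 2))
print(round(icc_absolute_agreement(ee, utah_on_ee), 2))
```

Under this linear mapping, a mean of 2.90 on the 1–5 scale corresponds to 3.85 on the 1–7 scale, which matches the rescaled Utah means reported in the Results; this is consistent with, but does not confirm, the assumed rescaling.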
Finally, metrics were repeated while comparing between the different hospitals. All hypothesis tests were two-sided, with a significance level of 5%. All analyses were performed in R version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria).
Alternative validation
As an alternative validation, the Utah measure was compared against a combined EE/DP measure. The combined EE/DP measure was obtained by taking a weighted average of the two single-item measure scores for each respondent. Assuming EE was more important than DP,40,41 the weighted average was calculated in the following way:
Sample size calculation
With 80% power and 5% significance, and assuming (i) the percentage of burnout in rural healthcare workers to be the lower bound of 25%, 13 (ii) the minimum acceptable kappa of 0.40, and (iii) the expected kappa of 0.55, the minimum sample size required for analyses was 373 rural healthcare workers.
Results
Descriptive statistics
The preliminary sample size was 507; restricting to those with no missing answers left a final analytic cohort of 457. Of these, 59 were from Hospital 1, 63 were from Hospital 2, 221 were from Hospital 3, and 114 were from Hospital 4. Participants had an average (SD) age of 41.8 (13.0) years. The majority of participants were female (77.5%), non-Hispanic White (87.3%), and from a rural background (65.2%). Of participants, 3.7% were American Indian and Alaska Native, 3.1% were Hispanic, and 1.3% were of mixed race. Veterans made up 2.8% of participants, and 24.9% came from a disadvantaged background. In total, 42.9% of participants worked in clinical care. The median (first quartile (Q1) and third quartile (Q3)) years since training completion was 2 (Q1 = 1 and Q3 = 4). When comparing across hospitals, significant differences were found in race/ethnicity, Veteran status, rural background, disadvantaged background, clinical care, and years since training completion (Table 1).
Characteristics of participants of affiliates survey.
ANOVA: analysis of variance; SD: standard deviation.
Participants removed if not answering any demographics or survey items.
Column %’s.
Fisher–Freeman–Halton test (unless otherwise noted).
Mean (SD).
One-way ANOVA.
Combination of separate fields of races and ethnicity, positive indications to “Asian,” “AIAN,” “Black,” and “Native Hawaiian or other Pacific Islander” races took precedence in classification regardless of ethnicity, “Hispanic” classified by those specifically indicating Hispanic race or indicating Hispanic ethnicity and White race, White classified by those indicating White race and no Hispanic ethnicity, because respondents could answer to multiple races, if multiple races indicated then classified as “mixed,” those indicating “prefer not to answer” on race or everyone else indicating “prefer not to answer” on ethnicity classified as such.
Combined into one field.
“Direct patient care” classified as “yes,” everything else as “no.”
Median (Q1, Q3).
Kruskal–Wallis test.
Dichotomous agreement
Results are presented in Tables 2 and 3 for measures dichotomized to “yes” or “no” burnout. In the primary analysis, the “neither agree nor disagree” category of the Utah single-item burnout measure was classified as “no burnout.” As measured by the Utah item, the prevalence of burnout was 30.4%, whereas the prevalence was 39.4% and 19.3% as measured by EE and DP, respectively. When comparing the inter-measure agreement between the Utah item and EE, the kappa statistic (95% CI) was 0.61 (0.54–0.69), meaning that the two measures achieved 61% of the maximum possible agreement beyond that expected by chance alone (p < 0.001). This indicated substantial agreement. When comparing inter-measure agreement between the Utah item and DP, the kappa statistic (95% CI) was lower at 0.36 (0.27–0.45). While indicating only fair agreement, it was still significantly greater than chance alone (p < 0.001). Comparing the Utah item to EE, the AUROC was 0.79, while comparing the Utah item to DP yielded an AUROC of 0.72. When analyses were repeated with “neither agree nor disagree” classified as “yes burnout” in the Utah single item, burnout prevalence approximately doubled to 61.7%. Kappa statistics (95% CIs) and AUROCs fell when comparing to EE (kappa: 0.47 (0.40–0.54); AUROC: 0.76; Supplemental Tables 1(a) and (b)) and to DP (kappa: 0.20 (0.14–0.25); AUROC: 0.68; Supplemental Tables 1(a) and (b)).
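To make the kappa interpretation concrete, the definition can be worked through with the rounded prevalences above. This is an approximate, illustrative calculation only: the chance-agreement term is computed from the rounded marginals (Utah 30.4%, EE 39.4%), and the implied observed agreement is back-calculated rather than taken from a reported table.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_e = p_{U}\,p_{EE} + (1 - p_{U})(1 - p_{EE})

p_e \approx (0.304)(0.394) + (0.696)(0.606) \approx 0.542

p_o = p_e + \kappa\,(1 - p_e) \approx 0.542 + 0.61 \times 0.458 \approx 0.82
```

Here p_o is the observed proportion of respondents classified identically by both measures, p_e is the proportion expected to agree by chance, and p_U and p_EE are the burnout prevalences under the Utah and EE items. A kappa of 0.61 therefore corresponds to roughly 82% raw agreement for these marginals, against about 54% expected by chance.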
Dichotomous association, agreement, and discrimination of Utah single-item burnout with emotional exhaustion and depersonalization.
CI: confidence intervals; NPV: negative predictive value; PPV: positive predictive value.
Yes: strongly disagree/disagree. No: neither agree nor disagree, agree, strongly agree (to question: “Burnout is not a problem for me”).
Four out of total (457).
McNemar’s test.
Cohen’s g, g = 0.05–<0.15: “small effect size”; g = 0.15–<0.25: “medium effect size”; and g ⩾ 0.25: “large.”
Cohen’s kappa (k) ranging from −1 (disagreement) to 1 (perfect agreement) with k = 0 meaning agreement is not better than what we would get by chance. Strength of agreement: <0 “poor”; 0.01–0.20 “slight”; 0.21–0.40 “fair”; 0.41–0.60 “moderate”; 0.61–0.80 “substantial”; 0.81–1 “almost perfect.”
PPV.
NPV.
Yes: I am definitely burning out and have one or more symptoms of burnout, for example, emotional exhaustion. The symptoms of burnout that I’m experiencing won’t go away. I think about work frustrations a lot. I feel completely burned out. I am at the point where I may need to seek help. No: I enjoy my work. I have no symptoms of burnout. I am under stress and don’t always have as much energy as I did, but I don’t feel burned out.
Yes: every day/a few times a week/once a week. No: a few times a month/once a month or less/a few times a year or less/never (to question: “How often do you feel you’ve become more callous toward people since you took this job?”).
Dichotomous association/discrimination of Utah single-item burnout with emotional exhaustion and depersonalization.
AUROC: area under the receiver operating characteristic curve; CI: confidence intervals; GOF: goodness-of-fit; OR: odds ratio.
OR calculated via logistic regression.
Profile-likelihood CI.
Calculated via a z-statistic.
Model GOF via Hosmer–Lemeshow significance test.
Predictive discrimination via AUROC. Discrimination strength: <0.5 “no discrimination”; 0.5–0.7 “poor discrimination”; 0.7–0.8 “acceptable discrimination”; 0.8–0.9 “excellent discrimination”; >0.9 “outstanding discrimination.”
Continuous agreement
The results are presented in Tables 4 and 5 for measures left in their original scales. Additionally, the Utah single-item burnout measure was rescaled to be on the same scale as EE and DP. The mean (SD) for comparing against EE was 2.90 (1.11) for Utah and 2.37 (1.06) for EE. The mean (SD) for comparing against DP was 3.85 (1.66) for Utah and 3.07 (1.73) for DP. The ICC (95% CI) between Utah and EE was 0.64 (0.36–0.78), indicating moderate agreement, whereas the ICC (95% CI) between Utah and DP was 0.49 (0.32–0.61), which is near the threshold between poor and moderate agreement. Adjusted R2 for comparing Utah against EE was 0.50, whereas it was 0.29 when comparing against DP. After rescaling all measures (Utah and EE/DP) to standard normal, the ICC (95% CI) was 0.71 (0.66–0.76) between Utah and EE and 0.54 (0.48–0.61) between Utah and DP, both indicating moderate agreement. R2 values were unchanged for both comparisons (Supplemental Tables 2(a) and 2(b)).
Continuous association and agreement of Utah single-item burnout with EE and DP.
CI: confidence intervals; DP: depersonalization; EE: emotional exhaustion; ICC: intraclass correlation coefficient; SD: standard deviation.
Answers reversed (i.e. strongly agree = strongly disagree, agree = disagree, etc.) to represent higher indications of burnout, answers also rescaled to be on the same scale as single-item EE and single-item DP.
Pearson product-moment correlation coefficient (r).
Test of association for Pearson (r).
ICC to measure strength of inter-rater (or inter-measure) agreement, <0.50: “poor”; 0.50–0.75: “moderate”; 0.75–0.90: “good”; >0.90: “excellent.”
F-test.
Continuous association of Utah single-item burnout with emotional exhaustion and depersonalization.
CI: confidence interval.
Beta-hat calculated via linear regression.
Wald CI.
Calculated via a t-statistic.
Adjusted R2, additional model goodness-of-fit via residual diagnostics.
Bland–Altman plots are displayed in Figures 1 and 2 for the comparison of the Utah single-item burnout measure to EE and DP, with the Utah item scaled to EE and DP. In both comparisons the mean of the differences is below 0, indicating that EE/DP scores tend to be lower on average than Utah item scores. Because the intervals did not include zero, this indicated significant bias between measures (p < 0.001). When comparing the differences between measures (bias) against the means of measures (magnitude), for both comparisons, the bias increased greatly at mid-range magnitudes. However, that bias shrank to zero as the magnitudes moved toward the low or high extremes. This means that for participants indicating very low or very high burnout, the two measures agreed almost perfectly. However, when participants indicated mid-range burnout, the measures differed markedly from each other. Due to the symmetric pattern of the data, no significant trend was found (Utah/EE Pearson r (p value): −0.06 (0.21); Utah/DP Pearson r (p value): 0.05 (0.28)). When adjusting all scales to standard normal, no bias was observed between Utah and EE/DP, nor were there bias/magnitude trends detected (all p > 0.99). However, mid-range scores continued to yield higher bias than low or high scores (Supplemental Figures 1(a) and (b)). Burnout prevalence and agreement/discrimination statistics are presented overall, and by hospital, for both dichotomous and continuous measures in Table 6.

Bland–Altman plot for comparison of emotional exhaustion and Utah single-item burnout (Utah rescaled to EE). (1) Test for zero bias via paired t-test (t (456) = −13.73, p < 0.001). (2) Test of independence of bias (difference between MBI EE and Utah burnout) and magnitude (average of MBI EE and Utah burnout) via Pearson product-moment correlation (−0.06, p = 0.21).

Bland–Altman plot for comparison of depersonalization and Utah single-item burnout (Utah rescaled to DP). (1) Test for zero bias via paired t-test (t (456) = −10.27, p < 0.001). (2) Test of independence of bias (difference between MBI DP and Utah burnout) and magnitude (average of MBI DP and Utah burnout) via Pearson product-moment correlation (0.05, p = 0.28).
Burnout prevalence, and agreement and discrimination statistics, overall and by hospital.
AUROC: area under the receiver operating characteristic curve; DP: depersonalization; EE: emotional exhaustion; ICC: intraclass correlation coefficient; NPV: negative predictive value; PPV: positive predictive value.
Assuming “neither agree nor disagree” is part of “no” burnout.
On ordinal scale, assuming rescaling of Utah single-item burnout to EE and DP scales.
When comparing the Utah measure against the combined EE/DP measure, the results were nearly identical to those of comparison between the Utah measure and EE alone (Supplemental Tables 3(a) and (b) and Supplemental Figure 2). The ICC (95% CI) was 0.63 (0.34–0.78), again indicating moderate agreement, and the adjusted R2 was 0.51. The Bland–Altman plot indicated lower average combined EE/DP scores than Utah and higher bias at midrange magnitude scores than extreme scores.
Discussion
The findings of this study demonstrate a similar mapping to the gold standard MBI as has been reported in other studies,21,22,24 in that the new Utah measure agreed moderately to substantially with the single-item EE. When comparing against the combined EE and DP the agreement was nearly identical, signifying the agreement largely driven by EE. Only poor to fair agreement was observed between the Utah measure and single-item DP alone. We observed an adjusted R2 of 0.50 when comparing the Utah item against EE and an adjusted R2 of 0.29 when comparing against DP. The EE metric lies within the same range of comparability as other studies who have done similar assessments,21,22,24 as two similar studies yielded R2 values of 0.50.22,24 Thus, though we only observed borderline substantial agreement, our results are comparable with similar work in the scientific literature and our capture of variance is performing as expected. This pattern aligns with the conceptual framing of the Utah item, which, like the Mini-Z EE item, targets general perceptions of burnout. The DP item, in contrast, focuses on interpersonal detachment (callousness), a more specific facet of burnout. Therefore, the stronger alignment with EE and weaker alignment with DP is theoretically consistent and not unexpected. This agreement was also observed both when assessed on the traditional EE scale and when standardized to the standard normal distribution. Despite the item’s positive phrasing, its performance remained robust across all agreement metrics, consistent with prior psychometric guidance. 32 The lower prevalence relative to Mini-Z EE is consistent with expected effects of positive phrasing and the neutral category, particularly at mid-range scores, without undermining criterion validity given the observed κ/ICC and AUROC. 
This pattern is consistent with known effects of item polarity and acquiescent responding in Likert-type measures.42,43 Accordingly, the item’s broad self-assessment wording supports content validity by aligning with established single-item burnout measures (e.g. the Mini-Z EE item) while preserving a format suitable for system-wide surveillance. Consistent with a surveillance role, the single item is designed to flag elevated risk at scale and frequency, after which organizations can administer established multi-item instruments for diagnostic confirmation and intervention planning. This aligns with prior use of single-item burnout indicators (e.g. Mini-Z EE) to enable frequent, low-burden monitoring at scale. As a surveillance screen, the single-item format confers clear advantages: minimal respondent burden, ease of platform integration, and the ability to measure more frequently. These benefits have been recognized for well-validated single-item indicators in time-constrained settings.44,45
Additional key findings include the identification of a discrepancy, or “gray area,” in our results that we attribute to the “neither agree nor disagree” response category of the Utah single-item measure. The Utah item has an inherently neutral category (i.e. “neither agree nor disagree”), whereas the EE item has middle categories that clearly classify individuals into strict yes or no burnout categories (i.e. “I am definitely burning out and have one or more symptoms of burnout,” “I am under stress and don’t always have as much energy as I did, but I don’t feel burned out”). When looking at dichotomous outcomes, the shifting of that neutral category from “yes burnout” to “no burnout” classification resulted in the Utah item prevalence measure doubling. Comparisons to EE/DP also revealed diminished agreement metrics upon switching that category. When looking at continuous outcomes, middle-range scores (most likely inclusive of that neutral category) showed high variability in bias between the EE/DP and Utah measures, while low- or high-range scores (not inclusive of that neutral category) showed consistently low bias. We attribute these findings to the fact that the neutral category in the Utah item includes a very heterogeneous group of individuals who may be mapped in various ways on the EE/DP measures. This is evidence that the Utah item neutral category should be adjusted to improve comparability with other measures.
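The sensitivity of prevalence and agreement to the handling of the neutral category can be illustrated with a small sketch. The code below dichotomizes a 5-point item in two ways, once counting the neutral score as burnout and once not, and computes prevalence and Cohen's kappa against a binary reference each time. The response values, the coding direction (higher score meaning more burnout), and the function names are assumptions made for illustration only; they are not study data and are not meant to reproduce the direction or size of the effects reported here.

```python
def cohen_kappa(x, y):
    """Cohen's kappa for two binary (0/1) classifications of the same respondents."""
    n = len(x)
    observed = sum(a == b for a, b in zip(x, y)) / n   # observed agreement
    px, py = sum(x) / n, sum(y) / n                    # marginal "yes" rates
    expected = px * py + (1 - px) * (1 - py)           # chance agreement
    # Undefined (division by zero) if both raters put everyone in one class.
    return (observed - expected) / (1 - expected)

def dichotomize(scores, neutral_is_burnout):
    """Map 1-5 scores to 0/1, where a score of 3 is the neutral category.

    Assumes (for illustration) that higher scores indicate more burnout.
    """
    cutoff = 3 if neutral_is_burnout else 4
    return [1 if s >= cutoff else 0 for s in scores]

# Hypothetical responses for ten respondents (NOT study data).
utah_scores = [1, 2, 3, 3, 3, 4, 5, 2, 3, 1]
ee_burnout = [0, 0, 1, 0, 1, 1, 1, 0, 0, 0]   # binary reference classification

results = {}
for neutral_is_burnout in (False, True):
    utah_binary = dichotomize(utah_scores, neutral_is_burnout)
    prevalence = sum(utah_binary) / len(utah_binary)
    kappa = cohen_kappa(utah_binary, ee_burnout)
    results[neutral_is_burnout] = (prevalence, kappa)
    print(f"neutral counted as burnout={neutral_is_burnout}: "
          f"prevalence={prevalence:.2f}, kappa={kappa:.2f}")
```

Because a large share of the hypothetical respondents sit in the neutral category, moving that single category across the cut point changes the estimated prevalence substantially, which is the mechanism described in the paragraph above.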
Using the overall new Utah single-item burnout measure, we observed a burnout prevalence of 30.4%. For the single-item EE and DP measures, we observed prevalences of 39.4% and 19.3%, respectively. Thus, we observed a burnout prevalence similar to that reported in studies of rural medical professionals, and lower than that reported in studies with larger sample sizes that included urban settings, which found prevalences of 45.8% and over 50%.8,46 These findings are consistent with previous research comparing burnout among rural and urban medical professionals.13 However, this discrepancy could also be due to the dichotomization phenomenon described above and may differ when breaking up the neutral category into distinct binary responses. Thus, the findings from this study suggest that future efforts can be targeted toward improving the measurement capability of the neutral, or middle, portion of the Utah item.
Limitations
This study has three primary limitations. First, the analysis compares the Utah single-item burnout measure to validated single-item proxies (Mini-Z EE and DP) rather than the full MBI. As a result, it is possible that we could have observed different results if we had utilized the full MBI; however, this would have required administration outside of the annual human resources survey, incurred licensing fees, and increased the number of items to complete. These factors would likely have decreased participation and limited our ability to compare burnout data with other engagement and communication questions. Given these practical constraints, we followed an approach supported in the literature by treating the Mini-Z item as a practical reference standard, as it has been previously validated against the full MBI.21,22,24 Second, our results may suffer from nonresponse bias and recall bias because these data were drawn from a survey. Positive wording and treatment of the neutral option may bias estimates downward; while suitable for system-wide surveillance, the single item should trigger follow-up with established multi-item tools when elevated risk is suspected. As a single item, the measure necessarily offers limited dimensional coverage and cannot estimate internal consistency; applications requiring diagnostic depth should use multi-item instruments, with the single item serving to flag where such follow-up is warranted.44,45 In addition, we did not conduct a formal Content Validity Index (CVI)/Delphi procedure in this initial single-item validation; future work can apply those methods47–49 when multi-item development is undertaken. Finally, our data were limited to rural healthcare professionals in Utah and eastern Idaho, a regional fraction of all health systems, which may limit the representativeness of our findings on the national scale.
Future research can assess the validity of this tool when utilized in different populations, including other national and international samples in both rural and urban settings. Furthermore, our study population was drawn from regions with a relatively high proportion of members of The Church of Jesus Christ of Latter-day Saints. Prior literature suggests that high religiosity may be protective against burnout.50 Because religious affiliation was not collected, we could not examine its potential influence on burnout prevalence or measure performance. This factor should be considered when interpreting generalizability.
Conclusion
Assessing burnout among healthcare professionals is a critical task within medical research, as its measurement and evaluation aid in the process of identification, tailoring of interventions, and evaluation of intervention methods. This study demonstrates substantial agreement between the single-item measure of burnout used at University of Utah Health and the single-item Mini-Z EE measure, which has been validated against the MBI EE subscale. Thus, initial observations from our study suggest that the measure from Utah is a reliable and valid measure when compared to the validated EE measure, pending future studies on its predictive validity across diverse populations and outcomes such as provider attrition or disengagement. The implementation of a single-item assessment makes burnout easier to measure within medical contexts and enables a quicker, more practical assessment that can be implemented routinely through brief clinical staff surveys to assess and track burnout within healthcare systems. Surveying healthcare professionals on the single-item measures of EE and DP can be vital to preventing and lessening the burden of burnout on this population. For survey instruments that do not accommodate the Mini-Z EE question, the Utah item can allow a look at the professional well-being of a population without requiring a separate survey. This has the potential to increase response rates, acceptability to the organization, and integration into existing action planning processes. While there is inter-question variability in rates of burnout, a single question allows comparison both between units and within a single unit when used over time.
Supplemental Material
Supplemental material, sj-docx-1-smo-10.1177_20503121251393441 for Initial validation of a single-item burnout measure among rural healthcare professionals by Fares Qeadan, Amy Locke, Benjamin Tingey, Jamie Egbert, Ellen Morrow, Aisha Arshad, Mindy J. Vanderloo and Megan Call in SAGE Open Medicine
Footnotes
Acknowledgements
Data used in this study were collected for a grant funded by the Health Resources and Services Administration (HRSA) of the U.S. Department of Health and Human Services (HHS) as part of award #U3MHP45387. The contents are those of the author(s) and do not necessarily represent the official views of, nor an endorsement, by HRSA, HHS, or the U.S. Government.
Ethical Considerations
The University of Utah Institutional Review Board (IRB) reviewed the umbrella HRSA protocol and determined this project to be nonhuman subjects research (quality improvement; IRB #00151642).
Consent to participate
The IRB waived documentation of informed consent. Participants received an information sheet in the survey invitation; participation was voluntary, and survey completion constituted implied consent.
Author contributions
A.L. and M.C. conceptualized the idea of the study. F.Q. developed the methodology and designed the study. B.T. and J.E. handled the data and conducted statistical analysis. A.A., B.T., and J.E. drafted the initial article. F.Q., A.L., B.T., J.E., E.M., A.A., M.J.V., and M.C. reviewed and edited multiple versions of the draft. F.Q. supervised the study. All authors revised the article content and approved the final article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the U.S. Health Resources and Services Administration (HRSA) under grant number U3MHP45387.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
References
