The “Twilight Zone” Is a Danger Zone: Why the Occupational-Clinical Divide in Burnout Assessment Is a False Dichotomy

Abstract

De Beer (2026) argues that burnout’s ICD-11 classification as an “occupational phenomenon” necessitates a stricter separation between organizational screening and clinical diagnosis. This commentary challenges that “dual approach” as a false dichotomy that underestimates the biological reality of chronic stress. Drawing on biomarker research from 2024 and 2025 and the validation logic of the Burnout Assessment Tool, on which De Beer has published extensively, I argue that the occupational-clinical split risks functioning as a liability shield rather than a care pathway. When we deny the diagnostic validity of workplace screening, we create a “Twilight Zone” where employees with severe symptoms can be given organizational “risk” labels without corresponding clinical recognition or care. I propose an Integrated Continuum Model where workplace screening functions as valid Stage 1 assessment within a coordinated care pathway, ensuring that high-risk scores trigger immediate clinical triage rather than administrative limbo. The failure to integrate these approaches does not protect employees. It leaves them without a clear path to appropriate care.

Keywords

burnout, Burnout Assessment Tool, ICD-11, diagnostic validity, occupational health, psychosocial risks

Burnout remains one of the most contested constructs in occupational health psychology (Nadon et al., 2022). Despite over fifty years of research, we still lack consensus on whether burnout constitutes a distinct clinical entity or a constellation of stress-related symptoms (Bianchi & Schonfeld, 2025; Parker & Russo, 2025). The World Health Organization’s (2019) decision to classify burnout as an “occupational phenomenon” rather than a medical condition has done little to resolve this tension. If anything, it has intensified debates about who should assess burnout, using what tools, and for what purposes.

In his commentary, De Beer (2026) attempts to navigate what he calls the “Twilight Zone” between “burning out” (complaints that may resolve over time) and “burnt out” (a more persistent syndrome). This distinction, while useful as a heuristic, raises important questions about the nature of progression between these states and whether a clear threshold separates the two (cf. Schaufeli & De Witte, 2023). De Beer’s (2026) proposed solution is a strict division of labour: organizations should screen for risk to guide management decisions, while clinicians should diagnose the condition using separate frameworks. The logic appears pragmatic. Workplace surveys are designed for population-level surveillance, not individual diagnosis. Clinicians are trained for diagnostic assessment, and conflating these roles creates confusion.

Yet this “dual approach” rests on a premise that warrants careful scrutiny: that the setting of an assessment changes the nature of the condition being assessed. While De Beer (2026) is correct that the purpose and methodology of assessment differ between screening and clinical settings (a point I return to below), the underlying condition being detected does not change depending on who administers the instrument. In other words, a thermometer reading of 40°C does not become less medically significant because it was taken in an office rather than a clinic. Similarly, a severe burnout score on a validated instrument does not become less clinically meaningful because human resources administered it rather than a physician. The instrument’s primary purpose and psychometric properties remain constant across settings, even though the appropriate response to an individual obtaining a high-risk burnout score may differ depending on the context in which it is administered (e.g., within an organization vs. within a clinic) (Schaufeli et al., 2023).

This commentary argues that the occupational-clinical divide in burnout assessment is, in fact, a false dichotomy. Moreover, I suggest that this divide may function less as a care pathway and more like a liability shield that protects organizations from the legal and ethical obligations that would follow from acknowledging the clinical significance of their screening data. The structural consequence of the proposed separation is that it creates institutional space in which organizational interests can be prioritized over employee welfare, regardless of whether this outcome is intended. What follows is a substantive challenge to a position that, while pragmatically motivated, carries consequences that the field cannot afford to overlook.

The Biological Fallacy of the “Occupational” Label

In his argument, De Beer (2026) draws on the ICD-11 classification of burnout as an “occupational phenomenon” to argue that it is a distinct experience from medical conditions. He suggests that workplace screening serves to “identify environmental factors” rather than diagnose individual health status. While De Beer’s (2026) interpretation of the ICD-11 classification is technically accurate at face value, the practical implication of this position is that “occupational” becomes a biological boundary, implying that stress incurred at work remains “occupational” until a clinician relabels it “clinical.”

Recent research demonstrates that this distinction is administrative rather than physiological (Vandenabeele et al., 2025). A comprehensive integrative scoping review published in 2025 found that the “burnout complaints” De Beer categorizes as pre-clinical are frequently associated with dysregulated hypothalamic-pituitary-adrenal (HPA) axis activity and elevated cortisol levels that are difficult to distinguish from clinical depression (Vandenabeele et al., 2025). The review synthesized evidence from over 2,000 studies and concluded that chronic stress, whether labelled “occupational” or otherwise, produces consistent patterns of HPA-axis dysregulation, immune impairment, autonomic imbalance, and elevated allostatic load.

Kuzmin and colleagues (2024) provided direct evidence of this biological continuity. Examining healthcare workers with high burnout scores, they found significantly elevated concentrations of both dehydroepiandrosterone sulfate (DHEA-S) and cortisol, regardless of whether these workers had received formal diagnoses. While no single biomarker is yet sufficient for individual diagnosis of burnout (Vandenabeele et al., 2025), the pattern of findings suggests that “risk” scores on screening tools correlate directly with objective neuroendocrine dysfunction. The body does not distinguish between stress that human resources calls “occupational” and stress that a psychiatrist calls “clinical.”

A 2025 systematic review examining circadian and endocrine disruption in burnout reached similar conclusions (Ungurianu & Marina, 2025). Burnout was frequently associated with blunted diurnal cortisol variation and irregular melatonin secretion, patterns that represent objective physiological dysregulation rather than subjective complaint. If a workplace diagnostic assessment detects “severe exhaustion” (a BAT “Red” score), it is detecting a biological reality, and not just a management metric. This seems to be at odds with De Beer’s (2026) argument that workplace-related screenings of burnout should not establish clinical diagnostic standards. Treating these scores as merely “informational” denies the employee recognition of their condition. As Bianchi and Schonfeld (2025) have argued, refusing to acknowledge the clinical magnitude of severe burnout symptoms often leads to misdiagnosis and inadequate treatment. The “occupational” label in ICD-11 describes context, not a ceiling on severity.

The BAT Paradox: When Instruments Outpace Theory

The most striking tension in De Beer’s (2026) position is the disconnect between his current argument and the validation logic of the Burnout Assessment Tool (BAT), on which he has published extensively (De Beer et al., 2024; Schaufeli et al., 2020). This tension is not peripheral. It sits at the centre of the debate about what workplace burnout instruments are for and what obligations their outputs create.

De Beer (2026) asserts that workplace surveys are not normally designed to deliver individual clinical diagnoses and that applying clinical expectations to them creates a methodological impasse. Yet the primary justification for developing the BAT was precisely that existing tools like the Maslach Burnout Inventory lacked diagnostic validity (Schaufeli et al., 2020). Schaufeli et al. (2023) explicitly developed traffic light cutoff scores for the BAT to discriminate between “healthy” (Green), “at risk” (Orange), and “clinically burnt out” (Red) populations. These cutoffs were established using receiver operating characteristic analyses against samples of employees who had received clinical burnout diagnoses.

The BAT validation study reported diagnostic accuracy (area under the curve) ranging from good to excellent (AUC >.85) for identifying severe burnout (cf. Schaufeli et al., 2023). An important distinction in measurement theory holds that screening instruments are typically designed for high sensitivity, casting a wide net to minimize false negatives, whereas clinical diagnosis demands high specificity to minimize false positives (Schaufeli & De Witte, 2023). If the BAT were a conventional screening tool, De Beer’s position would be straightforward: high scores would flag candidates for separate clinical evaluation, and the screening result itself would carry no diagnostic weight. But the BAT was not validated as a conventional screener. Its cutoff development explicitly optimized for both sensitivity and specificity against clinically diagnosed benchmark samples. The instrument was designed to bridge the gap between screening and clinical identification, not to reinforce it. De Beer et al.’s (2024) own validation study of the BAT in Norway confirmed these properties by demonstrating that the instrument has “exceptional strength” in its global burnout factor and achieves full measurement invariance across genders.
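To make the sensitivity/specificity distinction concrete, the following minimal Python sketch computes both quantities at two cutoffs, together with the AUC, for a small invented sample. The scores, diagnoses, cutoff values, and function names are hypothetical illustrations introduced here; none of them are taken from the BAT validation studies.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], diagnosed: Sequence[bool], cutoff: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity), treating scores >= cutoff as positive."""
    pairs = list(zip(scores, diagnosed))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)       # cases flagged
    fn = sum(1 for s, d in pairs if d and s < cutoff)        # cases missed
    tn = sum(1 for s, d in pairs if not d and s < cutoff)    # non-cases cleared
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)   # non-cases flagged
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[float], diagnosed: Sequence[bool]) -> float:
    """AUC as the probability that a randomly chosen diagnosed case scores
    higher than a randomly chosen non-case (ties count as one half)."""
    pos = [s for s, d in zip(scores, diagnosed) if d]
    neg = [s for s, d in zip(scores, diagnosed) if not d]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# HYPOTHETICAL toy data: ten screening scores and whether each person
# holds a clinical burnout diagnosis. Invented purely for illustration.
SCORES = [1.4, 1.9, 2.2, 2.6, 2.8, 3.1, 3.3, 3.6, 4.0, 4.4]
DIAGNOSED = [False, False, False, False, True, False, True, True, True, True]

if __name__ == "__main__":
    for cutoff in (2.5, 3.2):  # hypothetical cutoffs
        sens, spec = sensitivity_specificity(SCORES, DIAGNOSED, cutoff)
        print(f"cutoff {cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
    print(f"AUC={auc(SCORES, DIAGNOSED):.2f}")
```

With these invented numbers, lowering the cutoff from 3.2 to 2.5 raises sensitivity from 0.80 to 1.00 while specificity falls from 1.00 to 0.60. This is the trade-off that separates a conventional screener from a cutoff chosen to balance both properties.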

So my question is: If workplace tools should not function as proxies for diagnosis, why does the BAT Manual prioritize sensitivity and specificity analysis against clinical interviews?

Table 1 summarizes this tension, contrasting De Beer’s (2026) current theoretical position with the empirical validation approach that underpins the BAT. The comparison reveals a gap between the instrument’s demonstrated capabilities and its recommended use under the dual approach framework.

Table 1

The BAT Tension: Key Differences Between De Beer’s (2026) Commentary and BAT Validation Logic

Dimension	De Beer (2026) commentary	BAT validation logic (Schaufeli et al., 2023)
Purpose of workplace tools	Provide “management information” only; not designed for clinical diagnosis	Developed to overcome diagnostic limitations of prior measures (MBI)
Cutoff scores	Should identify “risk” for management purposes, not diagnostic categories	“Traffic light” cutoffs discriminate between healthy, at-risk, and clinically burnt out populations
Validation criteria	Applying clinical expectations to workplace tools creates methodological problems	Prioritizes sensitivity and specificity against clinical interviews (AUC >.85)
Relationship to diagnosis	Diagnosis belongs solely in clinical settings	High BAT scores function as diagnostic indicators with demonstrated predictive validity
Implied function	Organizational screening informs workplace interventions only	BAT scores predict clinical status and should inform care pathways

Note. AUC = area under the curve; MBI = Maslach Burnout Inventory; BAT = Burnout Assessment Tool.

By separating occupational screening from clinical reality, De Beer (2026) retreats to the very ambiguity the BAT was designed to resolve. The MBI era was characterized by uncertain cutoffs, inconsistent application, and the inability to distinguish “at risk” from “clinically significant.” The BAT was developed to address precisely this problem. Arguing that its diagnostic properties should be set aside in organizational contexts risks returning the field to the measurement limitations the instrument was built to overcome.

The Real Issue: Validity Versus Liability

Separating workplace burnout screening from clinical practice has a further implication: it may serve a specific institutional function within organizations, acting as a liability shield that protects them from their duty of care. De Beer (2026) notes that managers are “apprehensive to potential legal consequences if workplace surveys are used to diagnose.” This observation deserves more attention than it receives in his commentary.

Acknowledging that a Red score on a workplace survey constitutes a clinically significant finding triggers significant employer obligations. Under the European Union’s Framework Directive on Safety and Health at Work (89/391/EEC), employers bear a legal responsibility to ensure that workplace risks are properly assessed and controlled (European Agency for Safety and Health at Work, 2023). Recent policy developments have intensified this pressure. In 2024, the European Commission conducted peer reviews specifically examining legislative and enforcement approaches to psychosocial risks at work (European Commission, 2024). The European Trade Union Confederation has called for a dedicated directive on psychosocial risks (European Trade Union Confederation, 2024).¹

In a legal analysis of mental health concepts in workplace regulation, Lerouge (2025) notes that employers face increasing exposure to liability when they possess knowledge of psychosocial risks and fail to act. Several EU member states, including Belgium, Finland, Ireland, and Spain, now permit proceedings based on breaches of occupational safety and health legislation. If a workplace survey is accepted as having diagnostic validity, an employer who surveys a team, identifies substantial distress, and fails to respond may face legal consequences. This creates understandable institutional anxiety about the implications of valid screening data (Lerouge, 2025).

For clarity, De Beer’s (2026) argument is not focused on liability concerns but is rather grounded in measurement theory and clinical practice considerations. However, the structural effect of his “dual approach” is that it allows organizations to measure burnout for “management information” without accepting the obligations that would follow from treating high scores as clinically meaningful. Screening data remains advisory rather than actionable. Whatever the intention, the practical consequence is that employees may be left in the “Twilight Zone,” told they are “at risk” by human resources but without a clear pathway to clinical support.

A further concern is that this approach may ultimately increase rather than decrease organizational risk. Employees who receive a vague “at risk” classification without the appropriate intervention or support may develop more severe symptoms, require longer recovery periods, and become more likely to pursue compensation claims. Research on mental health more broadly suggests that delays in appropriate intervention are associated with symptom escalation and prolonged recovery trajectories (cf. Altmann et al., 2024). While direct evidence specific to burnout is limited, the pattern is consistent with what we know about chronic stress-related conditions: early identification paired with appropriate response produces better outcomes than administrative ambiguity (Vandenabeele et al., 2025). As such, the short-term reduction in potential liability by disclaiming the diagnostic validity of these instruments may be offset by longer-term costs of untreated pathology.

Toward an Integrated Continuum Model

Rather than reinforcing the separation between workplace screening and clinical diagnosis, I argue that we need a unified pathway to treatment and care. The “Twilight Zone” which De Beer (2026) describes exists because we treat “occupational risk” and “clinical disorder” as binary states rather than points on a trajectory. This binary thinking contradicts everything we know about the development of stress-related pathology, which progresses along a continuum from mild symptoms through moderate impairment to severe dysfunction (Vandenabeele et al., 2025).

As such, I propose an Integrated Continuum Model (see Figure 1) that reconceptualizes the relationship between screening and diagnosis. In this model, workplace screening is not “distinct” from clinical diagnosis but rather functions as the first stage of assessment within a multi-actor, coordinated pathway toward care. Critically, the model does not collapse screening and diagnosis into a single process. Screening identifies individuals who may require further evaluation, whereas clinical diagnosis involves the multi-modal, differential assessment process needed to rule out comorbid conditions, establish severity, and determine appropriate intervention. These are different activities requiring different competencies. What the model changes is not the distinction between them but the connection. It structures the transition from one to the other rather than leaving employees to navigate that transition alone, if they navigate it at all.

Stage 1 (Workplace Screening). Validated tools such as the BAT identify employees across the severity continuum. Green scores indicate healthy functioning requiring no intervention. Orange scores identify individuals at risk who may benefit from preventive support and monitoring. Red scores represent probable cases warranting prompt clinical evaluation.

Stage 2 (Triage, Not Referral). Instead of vague “management information,” Orange and Red scores trigger mandatory, confidential clinical triage. This is not bureaucratic referral to wait for an appointment in three months. It is immediate resource allocation. Occupational health professionals or embedded mental health consultants assess the employee within days, not weeks, and determine appropriate intervention intensity. In jurisdictions where occupational health infrastructure is less developed, Stage 2 could take the form of structured referral protocols with defined response timelines.

Stage 3 (Clinical Assessment and Treatment). Qualified clinicians conduct the differential diagnostic process, ruling out comorbid conditions such as major depressive disorder or generalized anxiety disorder. Employees meeting clinical thresholds receive appropriate treatment, which may include psychotherapy, pharmacological intervention, or medical leave. Critically, clinicians simultaneously feed anonymized data back to the organization to address systemic causes. This feedback loop ensures that individual treatment does not become a substitute for organizational change.

Figure 1

Comparison of the Dual Approach Model and the Proposed Integrated Continuum Model.

The Integrated Continuum Model bridges the “Twilight Zone” by structuring the transition from screening to diagnosis rather than collapsing them into a single process. It acknowledges that workplace tools with demonstrated diagnostic accuracy function as valid first-stage assessments regardless of who administers them, while preserving the clinical rigor of formal diagnosis as a distinct subsequent step. It recognizes that employers who possess valid information about an employee’s mental health status have ethical and legal obligations to act on that information. And it ensures that the data organizations collect serves employees rather than merely documenting their distress.
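The routing logic of Stages 1 and 2 can be sketched in a few lines of Python. This is an illustrative sketch only: the cutoff values, class names, and function names are hypothetical and introduced here, and real cutoffs would have to come from the published norms for the instrument and population in use. Stage 3 is deliberately absent from the code, because differential diagnosis remains a clinical activity rather than a computation.

```python
from dataclasses import dataclass
from enum import Enum


class Band(Enum):
    GREEN = "green"
    ORANGE = "orange"
    RED = "red"


# HYPOTHETICAL cutoffs for illustration only; not the published BAT values.
ORANGE_CUTOFF = 2.5
RED_CUTOFF = 3.0

# Stage 1 readings, paraphrasing the model description.
_READINGS = {
    Band.GREEN: "healthy functioning; no intervention",
    Band.ORANGE: "at risk; preventive support and monitoring",
    Band.RED: "probable case; prompt clinical evaluation",
}


@dataclass(frozen=True)
class Pathway:
    band: Band
    triage_required: bool  # Stage 2: mandatory, confidential clinical triage
    stage1_reading: str


def classify(score: float) -> Band:
    """Stage 1: map a screening score onto a traffic-light band."""
    if score >= RED_CUTOFF:
        return Band.RED
    if score >= ORANGE_CUTOFF:
        return Band.ORANGE
    return Band.GREEN


def route(score: float) -> Pathway:
    """Stage 2: Orange and Red scores trigger triage; Green scores do not."""
    band = classify(score)
    return Pathway(band, band is not Band.GREEN, _READINGS[band])


if __name__ == "__main__":
    for score in (1.8, 2.7, 3.4):
        print(score, route(score))
```

The point of the sketch is structural: the score never produces a diagnosis, but every non-Green band produces a defined next step rather than a label alone.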

Implications for Evaluators and Practitioners

The arguments presented above carry concrete implications for those who design, administer, and interpret burnout assessments in organizational settings. First, it might be argued in some contexts that when an organization administers a validated burnout instrument at the individual level, it enters into an implicit contract with its employees: it asks individuals to disclose their psychological state on the assumption that the information will be used to their benefit. Administering a tool with clinical-grade cutoffs without establishing response protocols is not neutral data collection. It is an extraction of vulnerable self-disclosure without reciprocal accountability. Organizations considering the use of instruments such as the BAT should therefore establish what I term pre-commitment protocols before data collection begins: written agreements specifying what actions follow from each score range, who initiates those actions, and within what timeframe. If the answer to “what will we do with a Red score?” is “nothing beyond aggregate reporting,” the survey should not be administered at the individual level.
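One way to make a pre-commitment protocol checkable before data collection begins is to record it as structured data and test it for gaps. The sketch below is a hypothetical illustration: the field names, the example entries, and the `protocol_gaps` function are assumptions introduced here, not part of any published BAT guidance.

```python
from dataclasses import dataclass
from typing import Dict, List

BANDS = ("green", "orange", "red")


@dataclass(frozen=True)
class Commitment:
    action: str             # what follows from this score range
    owner: str              # who initiates the action
    max_days: int           # response timeframe, in working days
    individual_level: bool  # False if the response is aggregate reporting only


def protocol_gaps(protocol: Dict[str, Commitment]) -> List[str]:
    """List the reasons a protocol is not ready for individual-level use."""
    gaps: List[str] = []
    for band in BANDS:
        commitment = protocol.get(band)
        if commitment is None:
            gaps.append(f"{band}: no commitment recorded")
            continue
        if not commitment.action.strip() or not commitment.owner.strip():
            gaps.append(f"{band}: action or owner missing")
        if commitment.max_days <= 0:
            gaps.append(f"{band}: no response timeframe")
    red = protocol.get("red")
    if red is not None and not red.individual_level:
        gaps.append("red: aggregate reporting only; do not administer at the individual level")
    return gaps


# HYPOTHETICAL example protocol; every entry is invented for illustration.
EXAMPLE_PROTOCOL = {
    "green": Commitment("share individual feedback report", "survey administrator", 10, True),
    "orange": Commitment("confidential triage conversation", "occupational health nurse", 5, True),
    "red": Commitment("confidential clinical triage", "occupational health physician", 2, True),
}

if __name__ == "__main__":
    print(protocol_gaps(EXAMPLE_PROTOCOL) or "protocol complete")
```

An empty list means each score range has an action, an owner, and a timeframe. A Red entry limited to aggregate reporting is flagged explicitly, mirroring the argument that such a survey should not be administered at the individual level.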

Second, instrument developers bear a responsibility that extends beyond measurement precision. When a research team establishes clinically validated cutoffs, it creates a tool that generates obligations for those who use it. A cutoff score is not merely a statistical threshold. It is a decision point that creates a duty of response and an obligation for care. Researchers who publish cutoffs without accompanying guidance on response protocols are, in effect, handing organizations a fire alarm without an evacuation plan.

Finally, the Integrated Continuum Model must accommodate the reality that regulatory frameworks and occupational health infrastructure vary across jurisdictions. In contexts where occupational health physicians are embedded in organizational structures, triage can be immediate and internal. In systems where occupational health is less integrated, structured referral networks with defined timelines may be required. The model’s architecture is transferable. Its specific protocols must be locally adapted.

Conclusion

De Beer’s (2026) attempt to navigate the “Twilight Zone” by reinforcing the walls between occupational and clinical approaches reflects a pragmatic instinct, but it carries consequences that deserve serious scrutiny. It underestimates the biological reality of stress that produces measurable neuroendocrine dysfunction regardless of administrative labels. It stands in tension with the psychometric advances of his own research, which established the BAT’s capacity to discriminate clinical severity. And it overlooks the idea that identifying distress is itself a form of recognition that creates an obligation to respond (cf. Altmann et al., 2024).

The “Twilight Zone” is not a conceptual mystery that requires a philosophical resolution. It is a structural consequence of treating screening and diagnosis as separate enterprises rather than stages in a coordinated process. When valid diagnostic information is generated but institutional structures do not require a response, the result is administrative ambiguity rather than care. We do not need to protect managers from the implications of screening data. We need to equip them with clear protocols that specify appropriate responses.

It is worth acknowledging that De Beer’s position reflects legitimate concerns about diagnostic overreach in non-clinical settings, about the qualifications needed to establish a formal diagnosis, and about the regulatory complexity of different jurisdictions. These are serious considerations. But the solution is not to sever the link between screening and clinical care. It is to build bridges between them.

If a workplace survey indicates severe burnout, that finding carries clinical weight regardless of the setting. The regulatory question of who is authorized to make a formal diagnosis is separate from the empirical question of whether the instrument has detected a condition that warrants clinical attention. To argue otherwise is to suggest that a blood pressure reading of 180/120 becomes less urgent because it was taken at a pharmacy kiosk rather than a cardiology suite. The reading does not constitute a diagnosis of hypertension, nor does it authorize medication, but it creates an obligation to seek clinical evaluation. This is precisely the relationship between screening and diagnosis that the Integrated Continuum Model proposes for burnout.

The deeper issue this debate reveals is that we have built measurement systems that are better at documenting distress than responding to it. The real question before us is not whether workplace screening can contribute to diagnosis. The validation research has substantially addressed that question. The real question is whether we possess the institutional will to build systems that act on what our instruments reveal. If a workplace survey identifies severe burnout, we should not debate whether it is an “occupational finding” or a “clinical finding.” We should ensure that employees receive the care they need.

Footnotes

ORCID iD

Llewellyn Ellardus van Zyl

Ethical Considerations

Ethical clearance was not necessary for this manuscript.

Consent to Participate

Obtaining informed consent was not applicable to this manuscript.

Author Contributions

LEVZ conceptualised and wrote the first draft of the manuscript.

Data Availability Statement

No data was generated for this article.

Generative AI

I used Claude 4 (Anthropic) and Grammarly for language editing at the sentence and paragraph level. Their suggestions addressed grammar, punctuation, concision, clarity, and flow, highlighted inconsistencies in tone, and pointed out minor APA style phrasing mistakes. No text or references were generated by these tools. All conceptual framing and writing was done by the author. The author reviewed, verified, and accepted or rejected every suggested wording change. No sensitive participant data, raw datasets, or confidential materials were uploaded to external systems. The author accepts full responsibility for the integrity and originality of the manuscript.

Note

References

Altmann, Fleischer, Tse, & Haslam (2024). Effects of diagnostic labels on perceptions of marginal cases of mental ill-health. PLOS Mental Health, 1(3), Article e0000096. https://doi.org/10.1371/journal.pmen.0000096

Bianchi & Schonfeld, I. S. (2025). Beliefs about burnout. Work & Stress, 39(2), 116–134. https://doi.org/10.1080/02678373.2024.2364590

De Beer, L. T. (2026). Burnout as an occupational phenomenon: Navigating the Twilight Zone. Psychiatry Research. Advance online publication. https://www.sciencedirect.com/science/article/abs/pii/S0165178126000089

De Beer, L. T., Christensen, Sørengaard, T. A., Innstrand, S. T., & Schaufeli, W. B. (2024). The psychometric properties of the Burnout Assessment Tool in Norway: A thorough investigation into construct-relevant multidimensionality. Scandinavian Journal of Psychology, 65(3), 479–489. https://doi.org/10.1111/sjop.12996

European Agency for Safety and Health at Work. (2023). The role of sanctions in European labour inspection policy and practice. OSHwiki. https://oshwiki.osha.europa.eu/en/themes/role-sanctions-european-labour-inspection-policy-and-practice

European Commission. (2024). Peer review on legislative and enforcement approaches to address psychosocial risks at work. Directorate-General for Employment, Social Affairs and Inclusion.

European Trade Union Confederation. (2024). ETUC resolution on specific demands for a European directive on the prevention of psychosocial risks at work. Executive Committee.

Kuzmin, M. Y., Sholokhov, L. F., & Akhmedzyanova, M. R. (2024). Biomarkers of burnout and their relationship with psychological characteristics in healthcare practitioners. Russian Open Medical Journal, 13(4), Article e0402. https://doi.org/10.15275/rusomj.2024.0402

Lerouge (2025). The concepts of ‘mental health in the workplace’ and ‘psychosocial risks’: A clarification from a legal perspective. European Labour Law Journal, 16(3), 377–383. https://doi.org/10.1177/20319525251336018

Nadon, De Beer, L. T., & Morin, A. J. S. (2022). Should burnout be conceptualized as a mental disorder? Behavioral Sciences, 12(3), 82. https://doi.org/10.3390/bs12030082

Parker & Russo (2025). Current issues in relation to burnout’s definition, measurement, prevalence and management: A narrative review. Psychiatry Research, 352, Article 116709. https://doi.org/10.1016/j.psychres.2025.116709

Schaufeli, W. B., Desart, & De Witte (2020). Burnout Assessment Tool (BAT)—development, validity, and reliability. International Journal of Environmental Research and Public Health, 17(24), 9495. https://doi.org/10.3390/ijerph17249495

Schaufeli, W. B., & De Witte (2023). A fresh look at burnout: The Burnout Assessment Tool (BAT). In C. U. Krägeloh, Alyami, & O. N. Medvedev (Eds.), International handbook of behavioral health assessment (pp. 1–24). Springer. https://doi.org/10.1007/978-3-030-89738-3_54-1

Schaufeli, W. B., De Witte, Hakanen, J. J., Kaltiainen, & Kok (2023). How to assess severe burnout? Cutoff points for the Burnout Assessment Tool (BAT) based on three European samples. Scandinavian Journal of Work, Environment & Health, 49(4), 293–302. https://doi.org/10.5271/sjweh.4093

Ungurianu & Marina (2025). The biological clock influenced by burnout, hormonal dysregulation and circadian misalignment: A systematic review. Clocks & Sleep, 7(4), 63. https://doi.org/10.3390/clockssleep7040063

Vandenabeele, Joosen, M. C. W., & van Dam (2025). Chronic stress in relation to clinical burnout: An integrative scoping review of definitions and measurement approaches. Frontiers in Psychology, 16(1), Article 1712340. https://doi.org/10.3389/fpsyg.2025.1712340

World Health Organization. (2019). Burn-out an “occupational phenomenon”: International classification of diseases. https://www.who.int/news/item/28-05-2019-burn-out-an-occupational-phenomenon-international-classification-of-diseases