Abstract
Study Design
Literature review.
Objective
To describe whether practice variation studies on surgery in patients with lumbar degenerative disc disease used adequate study methodology to identify unwarranted variation and to inform quality improvement in clinical practice. The secondary aim was to describe whether variation changed over time.
Methods
Literature databases were searched up to May 4th, 2021. To define whether study design was appropriate to identify unwarranted variation, we extracted data on level of aggregation, study population, and case-mix correction. To define whether studies were appropriate to achieve quality improvement, data were extracted on outcomes, explanatory variables, description of scientific basis, and given recommendations. Spearman’s rho was used to determine the association between the Extreme Quotient (EQ) and year of publication.
Results
We identified 34 articles published between 1990 and 2020. Twenty-six articles (76%) defined the diagnosis. Prior surgery cases were excluded or adjusted for in 5 articles (15%). Twenty-three articles (68%) adjusted for case-mix. Variation in outcomes was analyzed in 7 articles (21%). Fourteen articles (41%) identified explanatory variables. Twenty-six articles (76%) described the evidence on effectiveness. Recommendations for clinical practice were given in 9 articles (26%). Extreme Quotients ranged between 1-fold and 15-fold variation and did not show a significant change over time (rho= −.33, P= .09).
Conclusions
Practice variation research on surgery in patients with degenerative disc disease showed important limitations to identify unwarranted variation and to achieve quality improvement by public reporting. Despite the availability of new evidence, we could not observe a significant decrease in variation over time.
Introduction
Degenerative lumbar spine disorders lead to disability, sick leave, and high societal and healthcare costs.1,2 The most frequent disorder is degenerative disc disease leading to a herniated disc. 3 In 1934, Mixter and Barr’s paper on 19 surgically treated patients with root compression officially opened the era of spine surgery. 4 In the last decades, the number of lumbar spine procedures for low back pain and leg pain increased substantially, and large variation in surgical rates was observed between and within regions.5-8
Unwarranted variation in surgical rates is variation that cannot be explained by differences in patient needs and preferences. 9 Hence, it can be driven by the lack of high-quality evidence on indications for surgery or differences in surgeons’ beliefs about the effectiveness of procedures. High-quality research on the effectiveness of surgical treatment in degenerative lumbar disc disease is still lacking for some procedures, but has increased significantly in the last decades. 10 Implementation of Evidence-Based Medicine (EBM) to improve healthcare can be achieved by the development of clinical guidelines. 11 Subsequently, appropriate studies on unwarranted variation can be used as feedback to clinicians and policy makers to improve implementation of these guidelines. Public reporting of these studies can be a first step toward change and might close the loop between EBM and clinical practice.12,13 However, analyzing and explaining practice variation is challenging because multiple factors influence variation in surgical rates. 9 Previously, it was described that clinical audits or practice variation research must: 1. Select a diagnosis, 2. Provide a scientific basis to demonstrate the gap between actual and desired practice, and 3. Define warranted and unwarranted use of the target outcome.13-16 Ideally, the article should investigate the causes of practice variation to identify specific areas for improvement and provide recommendations for clinical practice.13,17
We aimed to describe whether practice variation studies on lumbar disc surgery in patients with lumbar degenerative disc disease used adequate study methodology to identify unwarranted variation and to inform quality improvement in clinical practice. Secondarily, we were interested in whether variation changed over time. We hypothesized that the availability of new evidence and guidelines, combined with the attention for unwarranted variation in spine surgery of the past decades, led to lower variation in more recent years. 18
Methods
This literature review was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. 19 We did not register a study protocol.
Databases and Selection Process of Studies
A literature search was performed by a trained librarian in PubMed, Embase, and Web of Science on May 4th, 2021 to identify articles on practice variation in lumbar disc surgery for degenerative disc disease in adults. Title and abstract screening as well as full-text screening were performed independently by two reviewers (JM, VW). A third and fourth reviewer (FA, WP) were consulted in case of conflicts between the two reviewers. Records were managed in Rayyan, a software tool for managing bibliographies. 20
Search Strategy and Inclusion Criteria
We focused on peer-reviewed studies on practice variation in lumbar disc surgery for degenerative disease. Our search strategy consisted of three main concepts and variations thereof: “spinal diseases or low back pain,” “spine surgery,” and “practice variation or small area analysis.” The full search strategy can be found in the supplemental files. Articles that focused on lumbar disc surgery without specification of the disease in the methods were also included. In doing so, we aimed to identify which diagnoses these articles focused on and which diagnosis and procedure codes they included. We excluded articles on lumbar disc surgery for malignancies, traumatic fractures, spinal deformities, and congenital diseases, as the pathophysiological mechanisms differ from degenerative lumbar disc disease. Furthermore, we excluded articles that analyzed practice variation in cervical or thoracic spine surgery only. Survey studies using case scenarios were excluded as well. Lastly, articles were excluded if no full text was available or if they were not written in English.
Data-Extraction
Two reviewers (JM and VW) extracted data on the study characteristics. Indicators for appropriate study design to identify unwarranted variation and optimal study design to achieve quality improvement were based on previous research and frameworks.13,14,17,21-23 First, characteristics were extracted to describe whether study design was appropriate to identify unwarranted variation: the level of aggregation, study population (inclusion and exclusion criteria), description and selection of the diagnosis or diagnosis group, and variables used for case-mix correction. Second, characteristics were extracted to describe whether study design was optimal to achieve quality improvement: variation in clinical outcomes, analyzed explanatory variables (other than the variables adjusted for), scientific basis for treatment effectiveness described (i.e., practices compared against clinical guidelines), and if recommendations for clinical practice or future research were given. Third, we described coding used for the procedures and the diagnoses. Lastly, we described whether significant variation was concluded by the authors (yes vs. no) and the Extreme Quotient (EQ, highest/lowest surgical rate).
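The Extreme Quotient defined above (highest surgical rate divided by lowest surgical rate) can be illustrated with a minimal sketch. The function name `extreme_quotient`, the region labels, and the rates are hypothetical and chosen for illustration only; they are not data from the included articles.

```python
def extreme_quotient(rates):
    """Return the highest regional rate divided by the lowest regional rate."""
    values = list(rates.values())
    if not values:
        raise ValueError("at least one regional rate is required")
    lowest = min(values)
    if lowest <= 0:
        # A zero or negative rate makes the quotient undefined.
        raise ValueError("rates must be positive to form a quotient")
    return max(values) / lowest


# Hypothetical surgical rates per 100,000 inhabitants, by region (illustrative only).
example_rates = {"Region A": 30.0, "Region B": 75.0, "Region C": 120.0}
eq = extreme_quotient(example_rates)
print(f"EQ = {eq:.1f}-fold")
```

Because the EQ depends only on the two most extreme regions, it is sensitive to outliers and to regions with small populations, which is one reason it is usually reported alongside other information on the study design.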
Analysis
Spearman’s rho was used to determine the association between time (year of publication) and study outcome (EQ). We hypothesized lower variation in more recent years due to the increased number of studies on effectiveness of surgical procedures. For this analysis, studies that did not describe the highest and lowest population-based surgical rates or EQ and studies that focused on fusion surgery only were excluded.
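As a sketch of the analysis described above, Spearman's rho can be computed as the Pearson correlation of the ranks of both variables, with tied values receiving their average rank. The function names and the year and EQ values below are hypothetical placeholders rather than the extracted study data, and the P value calculation is omitted.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sd_x * sd_y)


# Hypothetical publication years and Extreme Quotients (illustrative only).
years = [1990, 1995, 2000, 2005, 2010, 2015, 2020]
eqs = [15.0, 7.0, 8.0, 4.0, 5.0, 2.0, 3.0]
print(f"rho = {spearman_rho(years, eqs):.2f}")
```

A negative rho would indicate that EQ tends to be lower in more recently published articles. A rank-based measure fits this setting because EQ values are skewed and the relation with time need not be linear.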
Results
The process of study identification and selection is presented in Figure 1. In total, 34 articles published between 1990 and 2020 were included. Nineteen articles (56%) investigated practice variation in the USA. Most articles (n = 30) used administrative healthcare databases for the analyses. Three articles used national spine registries for data collection,24-26 and one article used hospital records. 27
In all but three articles,25,28,29 significant variation of surgical rates was concluded by the authors. Highest and lowest population-based surgical rates were described in 26 articles (76%). The EQ ranged from 1-fold to 15-fold in surgical rates. We observed a median EQ of 4-fold variation (Interquartile Range 2.0–7.3). We did not observe a significant decrease in EQ over time (Figure 2, rho = −.24, P = .2). Articles that are depicted twice in Figure 2 analyzed practice variation in different types of procedures.
Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses flow diagram on study selection.
Figure 2. Numbers of publications and Extreme Quotient over time by type of surgery.

Identifying Unwarranted Variation
Table 1. Article characteristics important to identify unwarranted variation. Footnotes: a, Extreme Quotient (highest/lowest surgical rate); b, not applicable.
Table 2. Investigated population and included diagnosis codes. Footnotes: a, not applicable (no description of included codes); b, ICD-10 coding; c, years; d, low back pain; e, national coding Norway and Canada; f, 739; g, 996; h, lumbar disc herniation; i, degenerative disease of the lumbar spine; j, 772; k, lumbar spinal stenosis; l, data from national spine registries.
Case-mix correction was performed in 23 articles (68%)6,25,29,30,32,35-51 (Figure 3). No article investigated timing of surgery or adjusted for disease severity to define unwarranted variation in treatment choice. In thirteen articles (38%), adjustment for referral cases was accomplished by analyzing practice variation on the level of Hospital Service Areas (HSAs). However, HSAs were defined in different ways. For example, Keller et al. defined spine service areas using discharges for spine problems only, 36 whereas other articles based the HSAs on neurosurgery and major cardiovascular procedures, or on all discharges.
Figure 3. Adjustment for case-mix in 34 articles analyzing practice variation in degenerative lumbar spine surgery.
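To make the idea of case-mix correction concrete, the sketch below uses indirect standardization by age group, which is one common approach. It is an assumption for illustration and not necessarily the method used in any of the included articles. The function names, age strata, reference rates, and counts are all hypothetical.

```python
def expected_count(population_by_stratum, reference_rates):
    """Number of procedures expected if the region followed the reference rates."""
    return sum(
        population_by_stratum[stratum] * reference_rates[stratum]
        for stratum in population_by_stratum
    )


def standardized_ratio(observed, population_by_stratum, reference_rates):
    """Observed procedures divided by the case-mix-adjusted expected number."""
    expected = expected_count(population_by_stratum, reference_rates)
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


# Hypothetical reference rates (procedures per person per year) by age group.
reference_rates = {"18-44": 0.0005, "45-64": 0.0010, "65+": 0.0008}
# Hypothetical population of one region, using the same age groups.
region_population = {"18-44": 40_000, "45-64": 30_000, "65+": 10_000}
# Hypothetical number of procedures actually performed in that region.
observed_procedures = 87

ratio = standardized_ratio(observed_procedures, region_population, reference_rates)
print(f"observed/expected = {ratio:.2f}")
```

A ratio above 1 means the region performs more procedures than its age structure alone would predict. Variation in such ratios between regions is closer to unwarranted variation than variation in crude rates, although factors such as disease severity and timing of surgery remain unadjusted in this sketch.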
Coding of Procedures and Diagnoses
Table 3. Included procedure codes. Footnotes: a, not applicable (no description); b, national coding Canada and Norway; c, Loeser–Volinn algorithm used in Australia; d, Canadian Classification of Procedures; e, Belgian nomenclature; f, NOMESCO classification of surgical procedures; g, Current Procedural Terminology; h, CHOP treatment classification; i, operation and procedure codes; j, ICD-10; k, Medicare Australia codes; l, data from national spine registries.
The differences in definition of the diagnosis could partly be explained by focusing on different diagnosis groups. However, if similar coding for the diagnoses was used, variation in definition of the disease occurred. For example, Nilasena et al. 40 described a long list of problems including “nonspecific backache” and “instability” for the same codes that Birkmeyer et al. 42 defined as “spinal stenosis or lumbar disc herniation.” Similarly, if approximately the same definition of the disease (i.e., lumbar degenerative disease) was used, coding varied widely. Ten articles (29%) included diagnoses for low back pain without sciatica.29,36,38,40,42,46,48,51,52,55 Investigated procedures and procedure codes varied as well between articles (Table 3). Twenty-three articles (68%) investigated practice variation in discectomies, twenty articles (59%) investigated laminectomies, and twenty-seven articles (79%) investigated fusion. Furthermore, four articles included the code “Lysis of adhesions of spinal cord”38,42,46,50; three articles included the code “Internal fixation of bone”38,42,46; two articles included the code for “Insertion of spinal disc prosthesis”50,56; and two articles included codes for refusion.50,52
Study Design Optimal for Quality Improvement
Table 4. Article characteristics important to achieve quality improvement. Footnotes: a, other than the variables adjusted for; b, not applicable (article described lack of evidence on effectiveness).
Discussion
Although we observed large differences in variation between studies, we did not observe a decrease in practice variation in surgical rates for lumbar degenerative disc disease over time between 1990 and 2020. The most recent studies on variation implied that practice variation is still problematic, despite the fact that evidence on effectiveness and timing of surgery improved in the last decades.57-60 However, the largest variations in surgical rates were described in one of the very first articles for both general spine surgery and fusion.46,61 Furthermore, we observed important limitations in the study design of practice variation research on lumbar disc surgery in patients with degenerative disc disease. Not all articles used an adequate study design to identify unwarranted variation and to inform quality improvement. Moreover, we observed substantial heterogeneity in study methodology, hampering the comparison of practice variation studies over time and between regions and countries.
Not all study designs were appropriate to identify unwarranted variation. First, the diagnosis group of interest was not clearly defined in all studies. Moreover, most articles that did define the diagnosis group included multiple diagnoses. We were surprised by these findings since the indication for surgical treatment depends on the patient’s diagnosis. Second, none of the studies included timing of surgical treatment in the analysis, while this is an important indication for surgery in patients suffering from sciatica due to lumbar disc herniation. 57 Lastly, not all studies adjusted for relevant case-mix factors. Future practice variation studies should specify both the diagnosis and procedure, exclude repeat surgery, and adjust for relevant case-mix, including timing of surgery and severity of the disease if possible.
The limitations in methodology might partly be caused by the limitations of administrative healthcare databases, which have important advantages and disadvantages. They enable investigation of large geographic areas and coverage over multiple years, which is an advantage for measuring practice variation. 62 Although the quality of administrative databases has improved over the last decades, they also have some drawbacks for measuring practice variation in spine surgery. To analyze timing of surgery, linkage between primary care and hospital databases is necessary, which is not always possible. Linking these databases would enable analysis of the full care path, including nonoperative guidance by the general practitioner and physiotherapist, and the use of timing of surgery as a quality indicator. Additionally, not all drivers of practice variation can be measured as case-mix variables because they are not available in databases.
The differences in coding within similar diagnosis and procedure groups hamper comparison between articles. This might be caused by the focus of the research question. For example, some articles specifically focused on variation in fusion procedures for degenerative spondylolisthesis. It is no surprise that different diagnosis codes were included in these studies compared to studies that focus on degenerative disc disease. However, methodology differed within studies investigating similar diagnosis groups as well. Another reason for this finding might be the differences in available codes between administrative databases. Moreover, coding will depend on provider registration and interpretation of medical coders. Standardized terminology and coding based on ICD codes within all countries can also improve the quality of comparisons within and between international databases.
Lastly, most administrative healthcare databases do not include clinical outcomes. Reporting on variation in clinical outcomes as a result of variation in clinical practice can contribute to the intrinsic motivation of physicians to deliver the best care for their patients and thereby facilitate the quality improvement process. 17 Not all studies used optimal study design to identify areas for quality improvement and close the loop between EBM and clinical practice. Although public reporting might be the first step towards change, 12 optimal study design can improve the impact of an article. 17 Future research should ideally compare practices against clinical practice guidelines, include outcome variables, and give recommendations for clinical practice. Advancements in the quality and comprehensiveness of administrative databases and linking between clinical outcome databases and administrative databases will facilitate the possibility to use methodology important for quality improvement.
To the best of our knowledge, this is the first review that described differences and limitations in methods for practice variation research in lumbar degenerative disc disease. A strength of this review was the systematic search and data selection providing a comprehensive overview of all relevant methodological and clinical aspects regarding practice variation in lumbar degenerative spine surgery. Furthermore, not only spinal neurosurgeons, but also a neurologist and methodological experts were involved in our team giving knowledge on clinical and methodological features.
Our study has limitations as well. First, we focused on variation in surgical treatment, whereas variation in conservative treatments, such as physiotherapy or the use of opioids, and variation in outcomes are important areas for quality improvement as well. Second, the number of papers was too small to make a proper comparison between the EQ and methodological features. For example, case-mix correction adjusts for the effect of patient characteristics on treatment choice, potentially leading to lower variation in surgical rates. Third, we only included peer-reviewed articles, missing articles published by national institutes. This includes publications from the Dartmouth Institute of Healthcare, 63 although data were described in peer-reviewed articles as well.42,44-46 Lastly, we were unable to describe all country-specific regulations of spine care, such as specialized spine clinics and the presence or absence of health care insurance, which might contribute to regional differences and must be included in the analysis in order to identify unwarranted variation. Therefore, the non-significant change in practice variation over time is only an indication and must be interpreted with caution.
Conclusions
Practice variation research on lumbar disc surgery in patients with degenerative disc disease showed important limitations in the methodologies used, which restrict the possibility of identifying unwarranted variation and improving quality in clinical practice. Furthermore, significant heterogeneity in study designs was observed. This finding could not fully be explained by differences in investigated diagnosis groups. Despite the availability of new evidence, we did not find evidence of a clear decrease in variation over time. However, questions might be raised about the comparability of these studies. Future practice variation studies should specify both the diagnosis and procedure, exclude repeat surgery, and adjust for relevant case-mix, including timing of surgery and severity of the disease if possible. Furthermore, specific national regulations of spine care should be included in the analysis. Lastly, future research should ideally compare practices against clinical practice guidelines, include outcome variables, and give recommendations for clinical practice. Hopefully, future practice variation studies will identify areas for quality improvement and close the loop between EBM and clinical practice to improve patient outcomes.
Supplemental Material
sj-pdf-1-gsj-10.1177_21925682211064855 – Practice Variation Research in Degenerative Lumbar Disc Surgery: A Literature Review on Design Characteristics and Outcomes
Supplemental material, sj-pdf-1-gsj-10.1177_21925682211064855 for Practice Variation Research in Degenerative Lumbar Disc Surgery: A Literature Review on Design Characteristics and Outcomes by Juliëtte J. C. M. Van Munster, Vera de Weerdt, Ilan J.Y. Halperin, Amir H. ZamanipoorNajafabadi, Peter Paul G. van Benthem, Guus G. Schoonman, Wouter A. Moojen, Wilbert B. van den Hout, Femke Atsma, and Wilco C. Peul in Global Spine Journal
Footnotes
Acknowledgements
We thank Jan Schoones from the Walaeus Library (Leiden University Medical Center) for designing and performing the literature search.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
References
