Abstract

Study Design

Systematic Review.

Introduction

Randomized controlled trials (RCTs) on lumbar endoscopic decompression inform treatment decisions for disk disease, radiculopathy, and lumbar spinal stenosis. This study assessed the fragility of statistical outcomes in these RCTs.

Methods

PubMed, Embase, and MEDLINE were queried for RCTs reporting dichotomous outcomes with at least 1 endoscopic decompression arm. The fragility index (FI) and reverse FI (rFI) represented the number of event reversals needed to change significance for significant and nonsignificant outcomes, respectively. The fragility quotient (FQ) was calculated by dividing FI by sample size. Subgroup analysis was performed by outcome type.

Results

Thirty-seven RCTs met the inclusion criteria, contributing 160 outcomes. The median FI was 4 (IQR: 3-5) and the median FQ was 0.038 (IQR: 0.017-0.067). Significant outcomes (n = 23) had a median FI of 7 (IQR: 2-13) and FQ of 0.024 (IQR: 0.012-0.056); nonsignificant outcomes (n = 137) had a median FI of 4 (IQR: 3-5) and FQ of 0.041 (IQR: 0.020-0.068). Revisions/reoperations were most robust (FI: 5, FQ: 0.037); microscopic outcomes were most fragile (FI: 4, FQ: 0.022). Pain outcomes had an FI of 4 (FQ: 0.051) and complications an FI of 4 (FQ: 0.038). In 47.5% of outcomes, the number of patients lost to follow-up exceeded the FI.

Conclusions

Findings from RCTs on lumbar endoscopic decompression are vulnerable to small changes in outcome events. In nearly half of outcomes, patients lost to follow-up outnumbered the FI. Reporting FI and FQ with P-values may improve interpretation and reliability of trial results.

Keywords

endoscopic decompression statistical fragility reoperations fragility index lumbar

Introduction

Endoscopic lumbar decompression is a minimally invasive surgical technique used to alleviate pressure on spinal nerves caused by conditions like spinal stenosis, herniated discs, or degenerative changes.¹ This procedure enables the removal of a herniated disc, hypertrophic ligamentum flavum, and bony overgrowths through small incisions using an endoscope, minimizing tissue damage and preserving spinal stability. Depending on the condition and location of nerve compression, surgeons may utilize the transforaminal approach to access herniated discs or the interlaminar approach to address central or lateral recess stenosis.² Compared to traditional open surgery, endoscopic lumbar decompression offers numerous advantages, including reduced postoperative pain, shorter recovery time, and shorter hospital stays, by minimizing muscle dissection and tissue damage.³ This technique has seen increasing adoption among surgeons for patients seeking effective relief from symptoms like back pain, leg numbness, and weakness while minimizing the risks associated with more invasive surgeries.^1,4

Randomized controlled trials (RCTs) represent the highest level of evidence for evaluating clinical outcomes and guiding clinical decision-making in spine surgery due to their rigorous methodology and controlled study design. However, conclusions drawn from these trials rely heavily on P-values, which have been criticized for overlooking important factors such as patient loss to follow-up and study design.⁵ The concept of the fragility index (FI) was first introduced by Feinstein et al to complement the P-value and address its limitations. The FI quantifies the minimum number of patients whose outcome status must change to convert a statistically significant result (P < 0.05) to non-significance (P ≥ 0.05). It offers insight into the trial’s susceptibility to minor changes in data, emphasizing the potential fragility of its conclusions.⁶ The reverse fragility index (rFI) is the minimum number of outcome reversals required to convert a statistically non-significant result into a significant one, helping evaluate the stability of non-significant findings.^7,8 The fragility quotient (FQ) is the ratio of the FI or rFI to the total sample size, providing a standardized measure of result fragility relative to study size.^6,9,10 In conjunction with the P-value, the FI and FQ offer a more comprehensive assessment of a trial’s fragility. Studies with low susceptibility to fragility yield stronger, more reliable conclusions than those with high susceptibility, enabling readers to critically evaluate the literature and enhance clinical decision-making based on evidence-based principles.¹¹

The purpose of this study was to determine the overall fragility of outcomes in randomized controlled trials evaluating lumbar endoscopic decompression techniques by utilizing the FI, rFI, and FQ metrics. Furthermore, we aimed to evaluate the statistical fragility of these RCT findings according to outcome type. We hypothesized that statistical outcomes reported in the lumbar endoscopic decompression literature would be fragile, with only a few outcome-event reversals altering significance. We further hypothesized that significant outcomes would be especially fragile and that statistical fragility would be observed across the various outcome types assessed.

Methods

Systematic Search Strategy

This study systematically searched PubMed, Embase, and MEDLINE databases to identify randomized controlled trials (RCTs) published between January 1, 2010, and July 16, 2024. The review adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.¹² The search strategy employed various Boolean combinations of keywords, synonyms, and term variations, including “endoscopic,” “spine surgery,” “lumbar,” “discectomy,” “laminectomy,” “laminotomy,” and “decompression.” The comprehensive search strings are detailed in the supplemental material.

Eligible studies included RCTs that reported dichotomous outcomes and featured at least 1 treatment arm involving endoscopic lumbar decompression. Exclusion criteria encompassed studies published in non-English languages or those utilizing cadaveric, biomechanical, animal, in vitro, or non-RCT designs. Additionally, studies without full-text availability were excluded. Titles and abstracts were screened by 2 independent reviewers, followed by a full-text review, with conflicts resolved by a third independent reviewer. Reasons for exclusion were documented, and the senior author confirmed the final study selection.

The revised Cochrane Risk of Bias tool was used to evaluate bias in the included RCTs.¹³ This review focused on the statistical reporting and significance of outcomes rather than direct clinical outcomes. Thus, it did not meet the criteria for registration with the International Prospective Register of Systematic Reviews (PROSPERO). Only publicly accessible studies were analyzed, eliminating the need for institutional review board (IRB) approval.

Study Screening and Data Extraction

Key data extracted from the selected studies included the first author, year of publication, journal title, experimental and control group interventions, reported outcomes, results, number of patients lost to follow-up, and P-values where available.

The primary outcome was the fragility index and fragility quotient for outcomes in each included randomized controlled trial. The secondary outcome was the proportion of studies in which the number of patients lost to follow-up exceeded the calculated FI. Subgroup analyses were considered tertiary outcomes.

Outcome measures were categorized into subgroups of complications/adverse events, revision/readmission rates, and patient-reported pain. Additional subgrouping was applied for studies assessing microscopic endoscopic approaches, which were reported as comparator subgroup outcomes. A separate subgroup was also created for studies specifically employing a biportal endoscopic approach, which were reported as a subanalysis within endoscopic techniques. Reviewers performed data extraction independently using standardized forms to ensure consistency. Figure 1 displays a PRISMA flow chart detailing the screening process and literature search outcomes.

Figure 1.

PRISMA Flowchart of Study Selection Process

Fragility Analysis

Fragility analysis was conducted using a two-tailed Fisher’s exact test to evaluate the statistical significance of reported outcomes at a threshold of P < 0.05. The fragility index (FI) was calculated for significant outcomes by determining the minimum number of event reversals required for the P-value to rise above 0.05, rendering the results no longer statistically significant (Figure 2).

Figure 2.

Demonstration of Statistical Significance Reversal Using a 2 × 2 Contingency Table With a Resulting Fragility Index (FI) = 3. P-Values Were Calculated Using a Two-Tailed Fisher Exact Test
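The FI procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the two-tailed Fisher exact test is written out with the standard library, and the choice to flip non-events to events in the arm with the lower event rate is an assumed convention that the text does not specify. The example counts (1/20 vs 9/20) are invented for illustration and are not taken from any included trial or from Figure 2.

```python
from math import comb

ALPHA = 0.05  # significance threshold stated in the Methods


def fisher_two_sided(a, b, c, d):
    """Two-tailed Fisher exact P-value for the 2x2 table [[a, b], [c, d]].

    Rows are trial arms; columns are events and non-events.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first arm.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def fragility_index(e1, n1, e2, n2):
    """Number of non-event-to-event flips needed to reach P >= ALPHA.

    e1/n1 and e2/n2 are events and arm sizes. Flips are made in the arm
    with the lower event rate (an assumed convention). Returns None if
    the starting result is not significant.
    """
    def p_value(x1, x2):
        return fisher_two_sided(x1, n1 - x1, x2, n2 - x2)

    if p_value(e1, e2) >= ALPHA:
        return None
    first_is_low = e1 * n2 <= e2 * n1  # compare rates without division
    flips = 0
    while p_value(e1, e2) < ALPHA:
        if first_is_low and e1 < n1:
            e1 += 1
        elif not first_is_low and e2 < n2:
            e2 += 1
        else:
            return None  # no patients left to flip in that arm
        flips += 1
    return flips


# Hypothetical trial: 1/20 events vs 9/20 events.
print(round(fisher_two_sided(1, 19, 9, 11), 4))  # 0.0084
print(fragility_index(1, 20, 9, 20))             # 2
```

In this invented example, two flipped outcomes are enough to move the P-value from about 0.008 to about 0.08.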

For non-significant outcomes, the reverse Fragility Index (rFI) was calculated by manipulating event outcomes until the P-value dropped below 0.05. The Fragility Quotient (FQ) was derived by dividing the FI or rFI by the study sample size, reflecting the proportion of patients needing an outcome reversal to alter statistical significance. Subgroup analyses were performed based on outcome type and statistical significance. Fragility analysis results were summarized as medians with interquartile ranges (IQRs).
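A comparable sketch for the rFI and FQ is given below. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, events are converted to non-events in the arm with the lower event rate (one possible convention for widening the difference), and a small pure-Python Fisher helper is included so the block runs on its own. The counts (4/20 vs 9/20) are invented for illustration.

```python
from math import comb

ALPHA = 0.05  # significance threshold stated in the Methods


def fisher_two_sided(a, b, c, d):
    """Two-tailed Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def reverse_fragility_index(e1, n1, e2, n2):
    """Number of event-to-non-event flips needed to reach P < ALPHA.

    Flips are made in the arm with the lower event rate (an assumed
    convention). Returns None if the result is already significant or
    if that arm runs out of events before significance is reached.
    """
    def p_value(x1, x2):
        return fisher_two_sided(x1, n1 - x1, x2, n2 - x2)

    if p_value(e1, e2) < ALPHA:
        return None
    first_is_low = e1 * n2 <= e2 * n1  # compare rates without division
    flips = 0
    while p_value(e1, e2) >= ALPHA:
        if first_is_low and e1 > 0:
            e1 -= 1
        elif not first_is_low and e2 > 0:
            e2 -= 1
        else:
            return None
        flips += 1
    return flips


def fragility_quotient(index, total_n):
    """FI or rFI divided by the total study sample size."""
    return index / total_n


# Hypothetical trial: 4/20 events vs 9/20 events (P is about 0.18).
rfi = reverse_fragility_index(4, 20, 9, 20)
print(rfi)                          # 2
print(fragility_quotient(rfi, 40))  # 0.05
```

Here an FQ of 0.05 means that reversing the outcomes of 5% of the hypothetical sample would change the statistical conclusion.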

Results

Search Results

Study characteristics are summarized in Table 1. After removing duplicates, the initial database searches yielded 144 studies. Following the title and abstract screening, 63 studies were excluded. The remaining 82 full-text reports were assessed, and 37 randomized trials were deemed suitable for inclusion in the final analysis. Of the 37 included RCTs, 12 compared endoscopic procedures with microscopic techniques, 4 compared endoscopic with open approaches, and 4 compared variations of endoscopic approaches (eg, transforaminal vs interlaminar, unilateral vs bilateral, or dual-channel vs single-channel techniques). The remaining 17 studies evaluated endoscopic surgery with adjunctive factors such as anesthesia type, navigation methods, or device modifications.

Table 1.

Characteristics of Included Studies

Author	Year	Journal	Sample size	Lost to follow-up
Gadjradj et al	2022	British Medical Journal	488	68
Park et al	2023	The Spine Journal	64	0
Yue et al	2023	The Spine Journal	90	2
Du et al	2022	Cellular and Molecular Biology	182	0
Park et al	2020	The Spine Journal	64	5
Li et al	2020	European Spine Journal	71	0
Zhang et al	2022	Cellular and Molecular Biology	290	0
Qian et al	2024	European Spine Journal	161	11
Ran et al	2021	Pain Physician	68	2
Pan et al	2016	Medical Science Monitor	106	0
Ye et al	2020	Annals of Palliative Medicine	60	0
Chen et al	2022	Pain Physician	98	7
Kang et al	2019	Medicine	70	8
Chen et al	2018	Journal of Neurosurgery: Spine	153	16
Kotheeranurak et al	2023	European Spine Journal	60	5
Tao et al	2018	Eur Rev Med Pharmacol Sci	462	0
Ying et al	2016	Medicine	45	0
Hussein	2016	Journal of the American Academy of Orthopaedic Surgeons	80	7
Wu et al	2023	J Coll Physicians Surg Pak	80	0
Gibson et al	2017	European Spine Journal	141	2
Komp et al	2015	Pain Physician	160	25
Li et al	2019	World Neurosurg	99	0
Chen et al	2023	Spine	247	31
Kong et al	2019	Orthopade	40	1
Chen et al	2020	Spine	250	9
Park et al	2019	World Neurosurgery	64	1
Hussein	2014	European Spine Journal	200	15
Alver et al	2024	European Spine Journal	65	5
Liu et al	2023	Neurospine	28	0
Wang et al	2023	Global Spine Journal	230	0
Fan et al	2022	BMC Musculoskeletal Disorders	344	0
Kim et al	2023	Global Spine Journal	52	7
Gadjradj et al	2022	Neurospine	613	93
Aygun et al	2021	Clinical Spine Surgery	154	0
Wu et al	2020	Annals of Translational Medicine	86	1
Mo et al	2019	World Neurosurgery	80	0
Ahmed et al	2019	Medical Forum Monthly	172	0

Across all 37 RCTs, we identified 160 dichotomous outcomes (Table 2). Of these, 23 outcomes were classified as statistically significant and 137 as statistically nonsignificant. For the 160 total outcomes, the median FI was 4 (IQR: 3-5) and the median FQ was 0.038 (IQR: 0.017-0.067), indicating that reversing the outcomes of only 3.8% of patients would have been enough to alter statistical significance. For the 23 significant outcomes, the median FI was 7 (IQR: 2-13) and the median FQ was 0.024 (IQR: 0.012-0.056). For the 137 nonsignificant outcomes, the median FI was 4 (IQR: 3-5) and the median FQ was 0.041 (IQR: 0.020-0.068). Additionally, in 76 of 160 analyzed outcomes (47.50%), the number of patients lost to follow-up exceeded the FI (Table 2), highlighting that improved postoperative follow-up could potentially affect the statistical significance of nearly half of these outcomes.

Table 2.

Statistical Fragility of Overall Outcomes

	Number of outcomes	FI, median (IQR)	FQ, median (IQR)	Outcomes where number of patients lost to follow-up was greater than FI, n (%)
All RCT outcomes	160	4 (3-5)	0.038 (0.017-0.067)	76 (47.50%)
Significant outcomes (P < 0.05)	23	7 (2-13)	0.024 (0.012-0.056)	9 (39.13%)
Nonsignificant outcomes (P ≥ 0.05)	137	4 (3-5)	0.041 (0.020-0.068)	67 (48.91%)
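The percentages in the last column of Table 2 follow directly from the counts. As a quick arithmetic check, the short sketch below recomputes them from the values reported in the table; the variable and function names are illustrative only.

```python
# Counts taken from Table 2: (outcomes with lost to follow-up > FI, total outcomes).
table2 = {
    "All RCT outcomes": (76, 160),
    "Significant outcomes": (9, 23),
    "Nonsignificant outcomes": (67, 137),
}


def percent(count, total):
    """Share of outcomes, as a percentage rounded to 2 decimals."""
    return round(100 * count / total, 2)


for label, (count, total) in table2.items():
    print(f"{label}: {count}/{total} = {percent(count, total):.2f}%")

# The significant and nonsignificant rows should add up to the overall row.
assert 9 + 67 == 76 and 23 + 137 == 160
```

Running it prints 47.50%, 39.13%, and 48.91%, matching the table.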

Among the reported outcomes, complications were the most frequently analyzed subgroup (N = 85), whereas revisions/reoperations were the least reported (N = 18) (Table 3). The microscopic technique subgroup was the most fragile, with a median FI of 4 (IQR: 3-5) and an associated FQ of 0.022 (IQR: 0.013-0.051), suggesting that altering outcomes in just 2.2% of patients would reverse statistical significance. The biportal and revisions/reoperations subgroups exhibited similar levels of statistical robustness, each with a median FI of 5; the biportal subgroup had an FI of 5 (IQR: 2-6) with an FQ of 0.044 (IQR: 0.013-0.079), and the revisions/reoperations subgroup had a median FI of 5 (IQR: 4-5) with an FQ of 0.037 (IQR: 0.018-0.073). The complications subgroup demonstrated a median FI of 4 (IQR: 3-5) and an FQ of 0.038 (IQR: 0.017-0.067). The self-reported pain subgroup was found to have the greatest statistical stability, with a median FI of 4 (IQR: 3-6) and the highest FQ of 0.051 (IQR: 0.025-0.074). A scatter plot analyzing the fragility quotient of studies over time revealed a statistically significant but minimal negative association (β = −0.0056, R² = 0.055, P = 0.0037) (Figure 3).

Table 3.

Statistical Fragility of Subgroup Outcomes

	Number of outcomes	FI, median (IQR)	FQ, median (IQR)	Percentage of studies in which number of patients lost to follow up was greater than FI (%)
Complications	85	4 (3-5)	0.038 (0.017-0.067)	49.41
Self-reported pain	43	4 (3-6)	0.051 (0.025-0.074)	30.23
Endoscopic vs microscopic	81	4 (3-5)	0.022 (0.013-0.051)	76.54
Biportal endoscopic technique	21	5 (2-6)	0.044 (0.013-0.079)	28.57
Revisions/reoperations	18	5 (4-5)	0.037 (0.018-0.073)	50.00

A Cochrane risk of bias assessment indicated that 36 of the 37 evaluated RCTs were categorized as having an overall “low risk of bias.” The single study categorized as having “some concerns” exhibited potential bias related to inadequate concealment of the allocation sequence before participants were enrolled and assigned to their interventions (Table 4).

Figure 3.

Fragility Quotient of Significant Outcomes With Respect to Year of Publication
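For readers unfamiliar with how a slope (β) and R² are obtained from such a scatter plot, the sketch below fits an ordinary least-squares line. It is only an illustration: the years and FQ values are invented placeholders, not the study data, so the output does not reproduce the reported β or R², and the P-value calculation is omitted. The function name is hypothetical.

```python
def least_squares(xs, ys):
    """Ordinary least-squares fit of y on x; returns slope, intercept, R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


# Invented placeholder data (NOT from the included trials).
years = [2016, 2018, 2020, 2022, 2024]
fqs = [0.06, 0.05, 0.05, 0.03, 0.02]

slope, intercept, r_squared = least_squares(years, fqs)
print(f"slope per year = {slope:.4f}, R^2 = {r_squared:.3f}")
```

A negative slope indicates that the FQ tends to fall with publication year; R² indicates how much of the variation in FQ the year explains.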

Table 4.

Bias Assessment for Included Studies Evaluated Using Revised Cochrane Risk-Of-Bias Tool for Randomized Trials

Study	Domain 1: Risk of bias arising from randomization process	Domain 2: Risk of bias due to deviations from the intended interventions	Domain 3: Risk of bias due to missing outcome data	Domain 4: Risk of bias in measurement of the outcome	Domain 5: Risk of bias in selection of the reported result	Overall risk of bias
Gadjradj et al	Low	Low	Low	Low	Low	Low
Park et al	Low	Low	Low	Low	Low	Low
Yue et al	Low	Low	Low	Low	Low	Low
Du et al	Low	Low	Low	Low	Low	Low
Park et al	Low	Low	Low	Low	Low	Low
Li et al	Low	Low	Low	Low	Low	Low
Zhang et al	Low	Low	Low	Low	Low	Low
Qian et al	Low	Low	Low	Low	Low	Low
Ran et al	Low	Low	Low	Low	Low	Low
Pan et al	Low	Low	Low	Low	Low	Low
Ye et al	Low	Low	Low	Low	Low	Low
Chen et al	Low	Low	Low	Low	Low	Low
Kang et al	Low	Low	Low	Low	Low	Low
Chen et al	Low	Low	Low	Low	Low	Low
Kotheeranurak et al	Low	Low	Low	Low	Low	Low
Tao et al	Low	Low	Low	Low	Low	Low
Ying et al	Low	Low	Low	Low	Low	Low
Hussein	Low	Low	Low	Low	Low	Low
Wu et al	Low	Low	Low	Low	Low	Low
Gibson et al	Low	Low	Low	Low	Low	Low
Komp et al	Low	Low	Low	Low	Low	Low
Li et al	Low	Low	Low	Low	Low	Low
Chen et al	Low	Low	Low	Low	Low	Low
Kong et al	Low	Low	Low	Low	Low	Low
Chen et al	Low	Low	Low	Low	Low	Low
Park et al	Low	Low	Low	Low	Low	Low
Hussein et al	Some concerns	Low	Low	Low	Low	Some concerns
Alver et al	Low	Low	Low	Low	Low	Low
Liu et al	Low	Low	Low	Low	Low	Low
Wang et al	Low	Low	Low	Low	Low	Low
Fan et al	Low	Low	Low	Low	Low	Low
Kim et al	Low	Low	Low	Low	Low	Low
Gadjradj et al	Low	Low	Low	Low	Low	Low
Aygun et al	Low	Low	Low	Low	Low	Low
Wu et al	Low	Low	Low	Low	Low	Low
Mo et al	Low	Low	Low	Low	Low	Low
Ahmed et al	Low	Low	Low	Low	Low	Low

Discussion

This study analyzed the fragility of RCTs investigating endoscopic lumbar decompression. Subgroup analysis of each outcome type revealed variable robustness of study findings, with complications, microscopic technique, and revisions and reoperations identified as the most fragile. In contrast, self-reported pain and biportal technique demonstrated the greatest robustness. While progress has been made in examining the statistical fragility in other orthopedic subspecialties,^10,14,15 the spine literature has not received comparable attention. Examining the statistical fragility of lumbar endoscopic decompression clinical trials provides insight into the robustness of the outcomes assessed in pertinent literature.

This study’s median FQ was 0.038 across all outcomes, indicating that in a sample of 100 patients, approximately 4 patient outcomes would need to be reversed to flip the statistical significance. This result demonstrates that RCTs on lumbar endoscopic decompression are statistically fragile. While no established FQ threshold defines fragility, the literature suggests that our result falls under fragile findings, with some studies deeming FQs as high as 0.080 (8.0%) fragile.¹⁶ We observed similar results when examining other spine-related fragility studies. For example, in a comparative analysis of cervical disc arthroplasty and anterior cervical discectomy and fusion, Ortiz-Babilonia et al reported a median FQ of 0.043, slightly less fragile than the FQ reported in this study.¹⁷ Additionally, Tiao et al examined lumbar disc arthroplasty vs fusion and reported a median FI of 5 with an FQ of 0.022,¹⁸ while Yu et al analyzed vertebroplasty trials and found a median FI of 5 with an FQ of 0.053.¹⁹ Yu et al also noted that nearly 80% of outcomes had more patients lost to follow-up than the FI. These findings are consistent with our results and reinforce that spine surgery RCT outcomes are statistically fragile, as even small numbers of unreported events could overturn significance and alter trial conclusions.
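The per-100-patient reading of the FQ is simple arithmetic. Writing N for the total sample size (a symbol introduced here only for illustration), the definition of the FQ can be rearranged as follows:

```latex
\[
\mathrm{FQ} = \frac{\mathrm{FI}}{N}
\quad\Longrightarrow\quad
\mathrm{FI} = \mathrm{FQ} \times N = 0.038 \times 100 = 3.8 \approx 4 .
\]
```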

The median FQ for significant outcomes was 0.024, indicating greater fragility than the median FQ for all outcomes and for the nonsignificant outcomes. Significant outcomes tend to be more fragile, especially in RCTs with smaller sample sizes.²⁰ There are many risks associated with fragile significant outcomes, including false confidence in the effectiveness of treatment and an increased risk of falsely rejecting the null hypothesis. For instance, an RCT reviewed in this study compared the frequency of unintended durotomy between endoscopic and open discectomy procedures, reporting a P-value of 0.005. Although this may initially appear statistically significant and robust to a reader, the FQ for this specific primary outcome was found to be 0.017, suggesting fragility.¹⁷ While the result may appear statistically convincing, its practical reliability is questionable. Physicians should interpret such findings with caution, considering the fragility of the evidence before adopting one approach over another based solely on presented results. Reporting fragility metrics alongside P-values may provide readers with a better sense of the robustness of an RCT outcome, which would help guide evidence-based surgical decision-making for lumbar decompression.

Revisions and reoperations provide valuable insight into the safety and success of the initial surgery, as success rates decline with each subsequent procedure.^21,22 With revision surgeries occurring in over 13% of patients undergoing lumbar spine surgery within a 10-year follow-up,²³ the fragility of these outcomes raises concern over their validity. While existing literature often highlights lower complication rates for endoscopic procedures compared to open surgery, the observed statistical fragility suggests that the perceived safety advantages of endoscopic techniques could be overly optimistic.²¹ Complications such as infections, nerve damage, and instability can dramatically influence a patient’s quality of life (QoL), ability to recover,^24,25 and overall satisfaction with their surgical outcomes.²⁶ Although self-reported pain was the most robust outcome, the observed fragility suggests that even a small shift in patient outcomes, just 5.1% of the study sample, could negate statistical significance. Pain relief is the primary goal of endoscopic lumbar decompression surgeries, as these procedures aim to alleviate nerve compression and improve the QoL for patients.¹ Additionally, pain reduction is associated with a variety of postoperative recovery factors, such as decreased morbidity, shorter recovery time, decreased length of opioid use, and lower health-care costs.²⁷ The fragility of revision and reoperation, complications, and pain-related outcomes raises concerns regarding the accurate assessment of these critical surgical measures.

Minimally invasive spine surgery (MISS) has undergone significant evolution over the past few decades. It is becoming more popular as it offers faster recovery times, reduced complications, and improved postoperative outcomes compared to open surgery.²⁸ Microscopic spine surgery, a common MISS procedure, utilizes a microscope to guide visualization through a single portal.²⁹ However, the microscopic technique subgroup was found to be the most fragile, raising concerns about the statistical validity of these comparisons given the technique’s widespread use in clinical practice. Notably, 76.54% of the studies assessing microscopic techniques reported a loss to follow-up that exceeded the FI, and missing outcomes from these patients could alter the significance of the findings. In contrast, the more recent biportal endoscopic technique is becoming an increasingly popular alternative.³⁰ It utilizes 2 portals, one for the endoscope and the other for the surgical tools, offering physicians a direct visualization of the anatomy as they perform the procedure. While slightly more robust, the fragility of the biportal technique raises similar questions about the consistency of its reported benefits. Given that these techniques have been compared in the literature with findings showing no significant difference between the 2,^31-33 ensuring statistically robust results is crucial to validate these outcomes and confirm the clinical viability of both techniques. This is particularly relevant given that complications related to biportal endoscopic spinal surgery may be as high as 8.1%.³⁴ However, the novelty of biportal techniques may account for these limitations, indicating a need for additional comparative trial literature to evaluate clinical significance further.

One concern in fragility analysis is loss to follow-up. When the number of patients with unreported outcomes equals or exceeds the FI, those missing outcomes alone could be enough to reverse the statistical significance of a study’s outcome. This should be kept in mind when interpreting the fragility statistics obtained from this study.

Fragility is becoming an increasingly used tool to measure the reliability of study outcomes. By incorporating fragility assessments early in the study design, researchers can proactively identify areas of improvement that can increase the robustness of the study. However, various fragility studies over the years continue to suggest that the robustness of study outcomes has not significantly improved.^10,15,35,36 Key factors contributing to persistent fragility include small sample sizes and bias-influenced outcome measures. Underpowered trials are particularly vulnerable to fragile outcomes, as insufficient sample size reduces the likelihood of detecting true effects while also increasing the susceptibility of reported findings to reversal with only a few outcome changes. Thus, power analysis and fragility analysis should be viewed as complementary tools: power reflects the ability to detect an effect, whereas fragility reflects the stability of the reported result. Addressing these issues requires increasing sample sizes to enhance statistical power, adopting objective standardized outcome measures to reduce variability, and implementing stricter blinding protocols to minimize bias. Additionally, using objective criteria allows for less variability and more reliable results, improving the quality of conclusions drawn about surgical outcomes such as pain, complications, and reoperation and ultimately leading to more beneficial patient results.

Limitations

This study has limitations that should be taken into account. First, this fragility analysis is limited to dichotomous outcomes, reducing its applicability to studies with continuous measures. Alternative fragility metrics, such as the continuous fragility index, may be more appropriate. Additionally, while widely accepted methodologies and parameters are clearly defined for dichotomous fragility outcomes, there is no consensus on standardized thresholds for the FI and FQ measures. This lack of standardization introduces variability in the interpretation of these outcomes, which can then impact the comparability of our findings. Furthermore, our analysis focuses on outcome fragility and P-values, without accounting for study design limitations, sample size, inclusion and exclusion criteria, and blinding protocols. This indicates that our assessment of fragility is based solely on statistical measures without considering methodological factors that could impact the reliability of the studies included in our systematic review. As a result, the true robustness of the evidence may be underestimated, and potential biases introduced by study design limitations may be overlooked, affecting the overall validity of our conclusions.

Despite these limitations, our study highlights the importance of the fragility of outcomes associated with lumbar decompression. Clinicians and researchers should approach lumbar decompression outcomes cautiously, given the fragility of the outcomes discovered in this study. The fragility of the outcomes reveals a necessary shift to be made in RCT design, and future research on lumbar decompression should consider the potential influence of patient loss to follow-up and incorporate larger sample sizes.

Conclusion

This systematic review is the first to analyze the fragility of lumbar decompression outcomes in RCTs. The significance of the fragility of these outcomes necessitates caution in interpreting the outcomes of lumbar decompression in guiding clinical decision-making. Integrating fragility metrics into current study designs will ensure the robustness of the study and that even small changes in sample size and follow-up will not reverse the significance of the findings. This approach will lead to enhanced patient outcomes and improve the reliability of research results by highlighting the fragility of outcomes and emphasizing the importance of robust study designs. This shift will have profound implications for the field, encouraging more consistent evidence and fostering greater confidence in clinical decision-making and scientific advancements.

Supplemental Material

Supplemental Material - Statistical Fragility of Endoscopic Lumbar Decompression Outcomes: A Systematic Review of Randomized Controlled Trials

Supplemental Material for Statistical Fragility of Endoscopic Lumbar Decompression Outcomes: A Systematic Review of Randomized Controlled Trials by Kareem S. Mohamed, Alexander Yu, Yazan Alasadi, Prabhjot Singh, Luca Valdivia, Avanish Yendluri, Junho Song, Nikan K. Namiri, John Corvi, Samuel K. Cho in Global Spine Journal

Footnotes

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Disclosures

Samuel K. Cho, MD, FAAOS. AAOS: Board or committee member. American Orthopaedic Association: Board or committee member. AOSpine North America: Board or committee member. Cervical Spine Research Society: Board or committee member. Globus Medical: IP royalties and Fellowship support. North American Spine Society: Board or committee member. Scoliosis Research Society: Board or committee member. Stryker: Paid consultant. Cerapedics: Fellowship support.

References

Panjeton

Brown

Searcy

Meroney

Kumar

. Endoscopic spinal decompression: a retrospective review of pain outcomes at an academic medical center. Cureus. 2021;13(10):e19112.

Ahn

. Current techniques of endoscopic decompression in spine surgery. Ann Transl Med. 2019;7(Suppl 5):S169.

Chad

. Full-endoscopic versus microscopic spinal decompression for lumbar spinal stenosis: a systematic review & meta-analysis. Spine J. 2024;24(6):1022-1033.

Yang

Wang

. Comparative effects and safety of full-endoscopic versus microscopic spinal decompression for lumbar spinal stenosis: a meta-analysis and statistical power analysis of 6 randomized controlled trials. Neurospine. 2022;19(4):996-1005.

Leopold

Porcher

. Editorial: threshold P values in orthopaedic Research-We know the problem. What is the solution? Clin Orthop Relat Res. 2018;476(9):1689-1691.

Lin

Xing

Chu

, et al. Assessing the robustness of results from clinical trials and meta-analyses with the fragility index. Am J Obstet Gynecol. 2023;228(3):276-282.

Feinstein

. The unit fragility index: an additional appraisal of “statistical significance” for a contrast of two proportions. J Clin Epidemiol. 1990;43(2):201-209.

Ruelos

VCB

Masood

Puzzitiello

, et al. The reverse fragility index: RCTs reporting non-significant differences in failure rates between hamstring and bone-patellar tendon-bone autografts have fragile results. Knee Surg Sports Traumatol Arthrosc. 2023;31(8):3412-3419.

Proal

Moon

Kwon

. The fragility index and reverse fragility index of FDA investigational device exemption trials in spinal fusion surgery: a systematic review. Eur Spine J. 2024;33(7):2594-2603.

10.

Yendluri

Chiang

Linden

, et al. The fragility of statistical findings in the reverse total shoulder arthroplasty literature: a systematic review of randomized controlled trials. J Shoulder Elb Surg. 2024;33(7):1650-1658.

11.

Megafu

Mian

, et al. The statistical fragility of outcomes in calcaneus fractures: a systematic review of randomized controlled trials. Foot. 2023;57:102047.

12.

Page

McKenzie

Bossuyt

, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.

13.

Sterne

JAC

Savović

Page

, et al. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ. 2019;366:l4898.

14. Yendluri, Megafu, Wang, et al. The fragility of statistical findings in the femoral neck fracture literature: a systematic review of randomized controlled trials. J Orthop Trauma. 2024;38(6):e230-e237. doi:10.1097/BOT.0000000000002793

15. Locke, Koehne, Yendluri, et al. The statistical fragility of patellar resurfacing in total knee arthroplasty: a systematic review of randomized controlled trials. J Arthroplast. 2024;40:795-801. doi:10.1016/j.arth.2024.09.008. Published online September 24.

16. Imbergamo, Sequeira, Patankar, Means, Stein. The statistical fragility of studies on rotator cuff repair with graft augmentation. J Shoulder Elb Surg. 2023;32(5):1121-1125. doi:10.1016/j.jse.2022.12.017

17. Ortiz-Babilonia, Gupta, Cartagena-Reyes, et al. The statistical fragility of trials comparing cervical disc arthroplasty and anterior cervical discectomy and fusion: a meta-analysis. Spine. 2024;49(10):708-714. doi:10.1097/BRS.0000000000004756

18. Tiao, Song, Hoang, et al. The statistical fragility of lumbar disc arthroplasty vs lumbar fusion: a systematic review of randomized controlled trials. Glob Spine J. 2025;0(0):21925682251368313. doi:10.1177/21925682251368313. Published online August 9.

19. Mohamed, Kurapatti, et al. The statistical fragility of vertebroplasty outcomes: a systematic review of randomized controlled trials. J Craniovertebral Junction Spine. 2025;16(1):26-33.

20. Andrade. The use and limitations of the fragility index in the interpretation of clinical trial findings. J Clin Psychiatry. 2020;81(2):20f13334. doi:10.4088/JCP.20f13334

21. Antonacci, Zeng, Ford, Wellington, Kia, Zhou. A narrative review of endoscopic spine surgery: history, indications, uses, and future directions. J Spine Surg. 2024;10(2):295-304.

22. Cao, Zhao, Wang, Hou. Revisional endoscopic foraminal decompression via modified interlaminar approach at L5-S1 after failed posterior instrumented lumbar fusion in elderly patients. Bioengineering. 2023;10(9):1097. doi:10.3390/bioengineering10091097

23. Noh, Cho, Kim, Lee, Kim. Risk factors for reoperation after lumbar spine surgery in a 10-year Korean national health insurance service health examinee cohort. Sci Rep. 2022;12(1):1-9.

24. Stulz, Pfeiffer. Peripheral nerve injuries resulting from common surgical procedures in the lower portion of the abdomen. Arch Surg. 1982;117(3):324-327.

25. Dencker, Bonde, Troelsen, Varadarajan, Sillesen. Postoperative complications: an observational study of trends in the United States from 2012 to 2018. BMC Surg. 2021;21(1):393.

26. Woodfield, Deo, Davidson, Chen TYT, van Rij. Patient reporting of complications after surgery: what impact does documenting postoperative problems from the perspective of the patient using telephone interview and postal questionnaires have on the identification of complications after surgery? BMJ Open. 2019;9(7):e028561.

27. Gan. Poorly controlled postoperative pain: prevalence, consequences, and prevention. J Pain Res. 2017;10:2287-2298.

28. Choi, Park, Kim, Yeom. Recent updates on minimally invasive spine surgery: techniques, technologies, and indications. Asian Spine J. 2022;16(6):1013-1021.

29. Mayer. A history of endoscopic lumbar spine surgery: what have we learnt? BioMed Res Int. 2019;2019:4583943.

30. Kavishwar, Hyeun. A narrative review of current and future of unilateral biportal endoscopic (UBE) transforaminal lumbar interbody fusion. Semin Spine Surg. 2024;36(1):101084.

31. Chen, Zhou, Chen, Yao, Liu. Biportal endoscopic decompression vs. microscopic decompression for lumbar canal stenosis: a systematic review and meta-analysis. Exp Ther Med. 2020;20(3):2743-2751.

32. Park, Jang, et al. Biportal endoscopic versus microscopic lumbar decompressive laminectomy in patients with spinal stenosis: a randomized controlled trial. Spine J. 2020;20(2):156-165. doi:10.1016/j.spinee.2019.09.015

33. Librianto, Ipang, Saleh, et al. Comparison of microscopic decompression and biportal endoscopic spinal surgery in the treatment of lumbar canal stenosis and herniated disc: a one-year follow-up. Open Access Maced J Med Sci. 2022;10(B):1188-1194.

34. Chen, Zhou, Wang, Liu, Luo. Complications of unilateral biportal endoscopic spinal surgery for lumbar spinal stenosis: a meta-analysis and systematic review. World Neurosurg. 2023;170:e371-e379.

35. Parisien, Trofa, O’Connor, et al. The fragility of significance in the hip arthroscopy literature: a systematic review. JB JS Open Access. 2021;6(4):e21.00035. doi:10.2106/JBJS.OA.21.00035

36. Parisien, Constant, Saltzman, et al. The fragility of statistical significance in cartilage restoration of the knee: a systematic review of randomized controlled trials. Cartilage. 2021;13(1_suppl):147S-155S.

Supplementary Material

Please find the following supplemental material available below.
