Abstract
Study Design
Cross-sectional.
Objectives
Laminoplasty (LP) and laminectomy with fusion (LF) are surgical approaches for degenerative cervical myelopathy (DCM), but their comparative effectiveness remains controversial. Systematic reviews and meta-analyses are crucial for guiding clinical decision-making; however, spin bias, which distorts results and can mislead readers, is prevalent in the orthopaedic literature. We aim to determine the prevalence of spin bias and assess the methodological quality of systematic reviews and meta-analyses comparing LP with LF in the treatment of DCM.
Methods
We systematically searched the PubMed, Web of Science, and Embase databases, identifying systematic reviews and meta-analyses comparing LP and LF for DCM. Spin bias was assessed using the Yavchitz classification system, and methodological quality was graded using the AMSTAR-2 tool.
Results
Fourteen studies met the inclusion criteria. Spin bias was identified in 64% of the reviews, with the most common type being Type 9 (inappropriately claiming superiority of treatment despite reporting bias). Spin types 3, 4, and 6 were each present in 14% of studies. AMSTAR-2 quality assessments rated 21% of studies as critically low, 36% as low, and 43% as moderate; none achieved a high-quality rating.
Conclusion
Spin bias is prevalent in systematic reviews and meta-analyses comparing LP and LF for DCM, and the overall methodological quality remains suboptimal. Addressing spin bias and improving adherence to rigorous reporting standards are essential to enhance the reliability of evidence guiding clinical decision-making.
Introduction
Degenerative cervical myelopathy (DCM) is characterized by spinal cord compression due to degenerative changes in the cervical spine. It manifests as neurologic deficits, including gait disturbances, decreased hand dexterity, motor and sensory deficits, and bowel/bladder dysfunction. 1 A study in North America estimated the prevalence to be 605 per million. 2
Surgical intervention is often indicated for patients with moderate or severe disease, typically characterized by the modified Japanese Orthopaedic Association scale (mJOA). 3 Laminoplasty (LP) and laminectomy with fusion (LF) are two widely used surgical techniques; however, there is no consensus on which approach is superior. While surgical decision-making is multifactorial and influenced by factors such as clinical presentation, surgeon preference, and regional practice patterns, some studies have favored LP due to lower rates of postoperative C5 nerve palsy, whereas others have supported LF for its potential to better maintain cervical lordosis at long-term follow-up.4,5 Most studies, however, report no significant difference in clinical outcomes between these techniques, leaving the debate unresolved.6,7
Spine surgeons require evidence-based recommendations to guide patient care. Systematic reviews and meta-analyses are particularly valuable, as they combine data from multiple studies to provide a comprehensive summary of the existing literature.8,9 However, the reliability of these reviews depends on their accuracy in reflecting the findings of individual studies—an especially critical consideration for controversial topics like surgical approaches for DCM. One factor that can compromise the integrity of systematic reviews is spin bias, a form of bias whereby results are distorted, potentially misleading readers either intentionally or unintentionally. Evidence suggests that spin bias is prevalent in orthopaedic literature, with rates as high as 93%.10-13
Given the high prevalence of spin bias and the ongoing debate over the optimal surgical approach for DCM, the primary objective of this study is to evaluate the prevalence of spin bias in systematic reviews and meta-analyses comparing LP and LF for the treatment of cervical myelopathy. This study will use the spin classification system established by Yavchitz et al. 14 Additionally, we aim to assess the methodological quality of these studies using the AMSTAR-2 survey, a 16-item checklist designed to critically evaluate the quality of systematic reviews. 15 Based on previous findings in the orthopaedic literature, we hypothesize that at least 40% of the included reviews will exhibit spin bias.
Methods
Search Process and Inclusion Criteria
A search was conducted in PubMed, Web of Science, and Embase using the search term ((systematic review OR meta-analysis) AND (laminoplasty) AND (laminectomy)), and the results were imported into Covidence for screening. Two independent reviewers initially screened the articles by title and abstract. Articles that passed the initial screening were subsequently reviewed for final inclusion using the full texts. Any conflicts between the two reviewers were resolved by a third independent reviewer. Only articles meeting all of the following criteria were included in the ensuing analysis: 1) systematic review with or without meta-analysis; 2) cervical spondylotic myelopathy pathology and/or ossified posterior longitudinal ligament diagnosis; 3) treatment with laminectomy with fusion and laminoplasty; 4) comparison of clinical or radiographic outcomes between laminectomy with fusion and laminoplasty cohorts; 5) available in English.
Data Collection
Included articles were graded by two independent reviewers for the nine most severe types of spin, as defined by Yavchitz et al in 2016. 14 We considered these nine types to be the most objective, and restricting the grading to them was intended to minimize subjectivity and improve the reliability of the study. For each paper, the presence or absence of each of these nine types of spin was recorded, as well as the publishing journal, journal impact factor, year of publication, yearly citations, total citations, and any funding sources. These data were collected to investigate potential associations between various paper attributes and the presence of spin. These initial data were collected using an Excel spreadsheet in November 2024. Following initial data collection, studies were graded using A Measurement Tool to Assess Systematic Reviews 2 (AMSTAR 2) criteria. This tool consists of 16 items that assess the quality of a systematic review’s methodology; these data were also collected to explore potential associations with the presence of spin. The AMSTAR 2 data were collected in December 2024.
Statistical Analysis
The collected data were analyzed in three ways. First, linear regression modeling was used to test, independently for each variable, whether AMSTAR 2 category, yearly citations, or journal impact factor was associated with the number of types of spin present (0-9). Next, binary logistic regression analysis was used to test whether AMSTAR 2 category, yearly citations, or journal impact factor was associated with the presence or absence of spin. Lastly, multiple regression modeling was used to test whether the presence of spin could be predicted from AMSTAR 2 rating, journal impact factor, and yearly citations. Interrater reliability was assessed using the kappa statistic. All data were analyzed using IBM SPSS 29.
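As a rough illustration of the first analysis step described above, the following Python sketch fits a single-predictor least-squares line relating a paper attribute to the number of spin types. It is not the SPSS procedure used in this study: the function name `simple_ols` and the input values are hypothetical and chosen only to show the calculation.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variance; the slope is undefined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical values for illustration only; these are NOT the study's data.
impact_factor = [1.0, 2.0, 3.0, 4.0]
spin_count = [3, 2, 2, 1]  # number of spin types (0-9) per review

slope, intercept = simple_ols(impact_factor, spin_count)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A negative slope in such a model would indicate fewer spin types as the predictor increases; significance testing, the logistic model, and the multiple regression model are outside the scope of this sketch.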
Results
The search resulted in 302 articles, of which 14 met the inclusion criteria (Figure 1).4,5,7,16-26 The included studies were published between 2013 and 2024. Journal impact factors ranged from 1.3 to 3.8, with an average of 2.47. Of the included studies, three received funding. The number of times each article was cited by another article ranged from 0 to 105, with an average of 30.29.
Figure 1. Study selection process.
Spin Bias Prevalence
Nine Most Severe Types of Spin per Yavchitz et al.

Total types of spin by article.
Interrater reliability, assessed using the kappa statistic, demonstrated moderate to near-substantial agreement (kappa = 0.571, P = .031).
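For readers unfamiliar with the kappa statistic reported above, the following Python sketch shows how Cohen's kappa can be computed for two raters. It assumes binary spin-present/spin-absent ratings; the function name `cohens_kappa` and the example ratings are hypothetical and are not the ratings collected in this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings (1 = spin present, 0 = spin absent); NOT the study's data.
reviewer_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
reviewer_2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.3f}")
```

In this toy example the raters agree on 8 of 10 items, while 50% agreement would be expected by chance, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6.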
AMSTAR-2 Rating
Characteristics of Studies which Did and Did Not Contain Spin.
Multivariate Regression Models
In our first model, after employing linear regression, we found that impact factor
Discussion
This study aimed to evaluate the prevalence of spin bias and to assess the quality of systematic reviews and meta-analyses comparing the outcomes of cervical laminoplasty with laminectomy and fusion for DCM. We identified spin bias in 9 out of the 14 articles, with all studies receiving AMSTAR-2 quality grades ranging from critically low to moderate. None were deemed high quality. This is the first study to demonstrate both the high prevalence of spin bias and the overall quality limitations in systematic reviews and meta-analyses addressing this debated topic. These findings underscore the need for improved methodological rigor to enhance the credibility of recommendations in future studies.
In this study, we employed methodology similar to that of prior studies investigating spin bias in the orthopaedic literature, all of which have shown consistent reliability among raters.27,28 Our finding of a 64% prevalence of spin aligns with previous literature reporting spin prevalence rates ranging from 28% to 93.1%.9,10,12,13,27-30
The most common type of spin found in our analysis was Type 9, wherein authors claim the superiority of a treatment despite reporting bias. For instance, one meta-analysis concluded in the abstract that cervical laminoplasty was superior due to fewer complications and lower rates of C5 nerve palsy compared to laminectomy and fusion, even though the results section of the main text acknowledged significant publication bias influencing the findings. 19 Similarly, the abstract of another meta-analysis favored laminoplasty for its better postoperative range of motion, shorter operative time, less blood loss, and lower rates of postoperative complications and C5 nerve palsy, but the article’s results section reported statistically significant heterogeneity in its findings. 22 The abstract of a third meta-analysis concluded that laminoplasty was superior based on lower rates of C5 nerve palsy, yet the discussion section noted significant heterogeneity in the primary studies that may have biased the meta-analysis and advised cautious interpretation of these results. 18
Other prevalent spin types included Type 3, Type 4, and Type 6, each identified in 14% of the studies. Type 3 spin is the selective overemphasis on the efficacy of the experimental treatment. One review emphasized that laminectomy with fusion had better outcomes, including greater improvement in postoperative Japanese Orthopaedic Association (JOA) scores, but acknowledged in the discussion that these findings were of weak significance and suggested that surgeons should cautiously consider the marginal superiority of laminectomy with fusion. 24 Although the abstract did note the weak significance, the overemphasis on efficacy could mislead readers. Type 4 spin occurs when the conclusion claims the safety of a treatment based on non-statistically significant data with wide confidence intervals. For instance, a meta-analysis concluded that laminoplasty reduced C5 nerve palsy and axial pain risk, but the primary studies had very wide confidence intervals, including [0.23-89.62], [0.07-16.45], and [0.30-109.58]. 23 Similarly, another meta-analysis claimed that laminectomy with fusion better maintained cervical lordosis at long-term follow-up, despite non-significant differences in cervical curvature index (CCI) at long-term follow-up (95% CI −2.20 to 0.23, P = .112). 4 Type 6 spin is characterized by selective reporting that favors the safety of the experimental treatment. One systematic review highlighted lower complication rates for laminectomy with fusion in the abstract, though in the discussion the authors emphasized that this association was weak and suggested that there is no conclusive evidence to support the superiority of one treatment over the other. 24
Quality evaluation using the AMSTAR-2 survey yielded ratings of critically low, low, or moderate. While six of the fourteen studies (43%) were graded as moderate quality, none achieved a high-quality rating. This was largely due to the lack of reporting on funding sources for primary studies and the absence of a list of excluded studies, both of which are non-critical flaws that do not undermine study conclusions. However, more severe issues, such as unaddressed heterogeneity or bias influencing study outcomes, were frequent among lower-rated studies. These critical flaws prevented such studies from achieving higher quality scores. For example, studies often acknowledged significant heterogeneity or bias in their results sections but failed to explain how these factors influenced their conclusions. Such omissions not only weaken methodological rigor but also diminish the reliability of the recommendations derived from these studies. Although the relatively small number of studies containing spin (n = 9) limited our ability to conduct robust statistical analyses correlating AMSTAR-2 rating with spin bias, prior research has shown such associations. Specifically, Hwang et al 31 found that a critically low AMSTAR-2 score was associated with a higher risk of Type 9 spin, while higher AMSTAR-2 scores were associated with a lower risk of Types 4 and 5 spin. These findings suggest that reviews with lower methodological quality are more susceptible to introducing spin bias, further emphasizing the importance of critical appraisal when interpreting systematic reviews and meta-analyses.
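The proportions reported in this article can be checked with simple arithmetic, as in the Python sketch below. The counts of 9 reviews with spin and 6 reviews of moderate quality are stated in the text; the remaining counts (3, 5, and 2) are assumptions inferred from the reported percentages out of 14 reviews and were not stated directly.

```python
TOTAL_REVIEWS = 14  # number of included systematic reviews and meta-analyses

counts = {
    "any spin": 9,                  # stated in the text
    "AMSTAR-2 moderate": 6,         # stated in the text
    "AMSTAR-2 low": 5,              # assumption: inferred from the reported 36%
    "AMSTAR-2 critically low": 3,   # assumption: inferred from the reported 21%
    "spin Type 3, 4, or 6 (each)": 2,  # assumption: inferred from the reported 14%
}


def percent(count, total=TOTAL_REVIEWS):
    """Return the count as a whole-number percentage of the total."""
    return round(100 * count / total)


percentages = {label: percent(n) for label, n in counts.items()}
for label, value in percentages.items():
    print(f"{label}: {counts[label]}/{TOTAL_REVIEWS} = {value}%")
```

Under these assumed counts, the three AMSTAR-2 categories sum to all 14 reviews, which is consistent with no review receiving a high-quality rating.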
Due to the efficiency with which systematic reviews synthesize data from multiple studies, spine surgeons often rely on these reviews to inform clinical decision-making. Consequently, the presence of spin bias can significantly influence surgical choices. Boutron et al 32 demonstrated that physicians favor treatments when spin is present in the literature. In the setting of LP vs LF, such bias may impair the ability to select the most appropriate surgical approach for a given patient. For instance, a surgeon aiming to preserve spinal alignment may favor LF after reading a meta-analysis abstract that reports superior maintenance of cervical lordosis–even though the main text reveals that this difference was not statistically significant. 4 Similarly, in a patient with multiple comorbidities, a surgeon may choose LP based on an abstract citing lower intraoperative blood loss and fewer postoperative complications. However, the main text of the same study describes substantial heterogeneity in these outcomes, undermining the reliability of the reported benefits.19,22 Notably, among the articles identified with spin bias in our analysis, the majority favored LP in their abstracts, suggesting a potential bias towards this procedure–an issue that may further complicate evidence-based clinical decision-making.
Strategies have been proposed to help mitigate the presence of spin bias in the medical literature. Researchers often face pressure to publish high volumes of research, leading them to present findings in a way that increases the probability of publication. However, a randomized trial suggests that cautious claims regarding correlational findings can still attract reader interest without diminishing their impact. 33 Thus, we must promote a research culture that rewards high-quality methodology and transparent conclusions, regardless of whether the results are groundbreaking. Furthermore, Chiu et al 34 have proposed specific safeguards against spin bias. First, peer reviewers and journal editors should thoroughly examine both the abstract and main text of studies to ensure that conclusions match the results. This includes avoiding causal language and overgeneralizations that could mislead readers. Second, journal editors should request the raw data of all submitted research and scrutinize P-values, odds ratios, and confidence intervals to validate the significance of conclusions.
There are also practical strategies that spine surgeons can implement to mitigate the influence of spin bias in the literature. First, surgeons should carefully examine the results sections of the full text to ensure that the conclusions in the abstract accurately reflect the underlying data. Whenever possible, findings should be evaluated across multiple studies rather than by consulting a single review or meta-analysis. Comparing study conclusions to evidence-based guidelines from professional organizations, such as the North American Spine Society and AOSpine, can also provide additional context and validation. Engaging in journal club discussions with colleagues is another effective way to promote critical appraisal and identify potential biases. 35 Additionally, prior research has suggested that higher impact journals may be associated with lower rates of spin bias. 11 Therefore, studies published in high-impact journals may be more reliable and should be utilized when possible. Finally, when evaluating industry-sponsored studies, surgeons should exercise heightened scrutiny, carefully examining objective data and remaining cautious of potentially overstated or selectively reported conclusions. 35
This study is not without limitations. Due to the subjective nature of identifying spin bias and grading the studies, we may have overestimated or underestimated the incidence of spin. This was mitigated by utilizing methodology from prior spin studies, including independent evaluation by two reviewers and arbitration by a third in case of disagreements. Inter-rater reliability, assessed by the kappa statistic, demonstrated moderate to near-substantial agreement. However, the inherent subjectivity of the grading system still allows for human error and may limit the generalizability of our findings. For example, studies assessed by reviewers with differing clinical backgrounds, linguistic interpretations, or surgical technique preferences may yield different spin classifications. Furthermore, some studies did not differentiate between types of laminoplasty (e.g., open-door vs expansive), potentially limiting the granularity of our analysis. Additionally, although our multivariable regression analysis revealed near-significant trends suggesting a potential correlation between AMSTAR-2 ratings and the presence of spin bias, these findings may have been influenced by the limited sample size and should be interpreted with caution. Validation in larger cohorts is warranted. Moreover, the relatively small sample size of 14 articles may have influenced the reported incidence of spin and limited generalizability. Lastly, many studies on LP originate from Asian countries, and our restriction to English-language publications may have excluded relevant studies, potentially introducing selection bias.
Conclusion
Spin bias was identified in 64% of systematic reviews and meta-analyses comparing cervical laminoplasty with laminectomy and fusion for cervical myelopathy. These studies achieved AMSTAR-2 quality ratings ranging from critically low to moderate, with none achieving high quality. To address this, future research should explore strategies to mitigate spin bias in the orthopaedic literature and enhance the quality of published research.
Footnotes
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Henry Avetisian, Kevin Matthew, Annika Myers, Apurva Prasad, Jordan O. Gasho, and William Karakash have nothing to disclose. Jeffrey C. Wang has received intellectual property royalties from Zimmer Biomet, NovApproach, SeaSpine, and DePuy Synthes. Raymond J. Hah has received grant funding from SI bone, consulting fees from NuVasive, and support from the North American Spine Society to attend meetings. Ram K. Alluri has received grant funding from NIH, consulting fees from HIA Technologies, and payment from Eccential Robotics for lectures and presentations.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Data Availability Statement
The data are not publicly available but can be made available upon request.
