Abstract
Study Design
Cross-sectional study.
Objective
Spin bias, where authors distort findings to overstate efficacy, is prevalent in the medical literature. The comparative superiority of polyetheretherketone (PEEK) and titanium (Ti) cages in spinal fusion remains controversial. This study aims to assess the prevalence of spin bias in meta-analyses and systematic reviews comparing PEEK vs Ti cages in spinal fusion.
Methods
The PubMed, Embase, and Web of Science databases were searched to identify meta-analyses and systematic reviews comparing PEEK and titanium cages in spinal fusion. Included studies were assessed for the presence of the 9 most severe types of spin bias. This study also graded the quality of these articles using A Measurement Tool to Assess systematic Reviews 2 (AMSTAR 2) criteria.
Results
The search resulted in 2352 articles, of which 13 met the inclusion criteria. Spin bias was identified in 8/13 (61.54%) of the included studies, with the most prevalent types being Type 3 (38.46%) and Type 5 (30.77%). Using AMSTAR 2, 1/13 (7.69%) studies were rated as critically low quality, 4/13 (30.77%) as low, 8/13 (61.54%) as moderate, with none rated as high.
Conclusions
Spin was found in 61.54% of the reviews comparing PEEK and Ti cages in spinal fusion, with none achieving a high-quality rating. Surgeons must critically evaluate these articles for bias prior to utilizing them in clinical decision making.
Introduction
Interbody fusions are commonly employed to add foraminal height, achieve indirect decompression, and enhance spine fusion. 1 While various cervical and lumbar approaches exist, all interbody fusions involve discectomy, endplate preparation, and implant placement, typically using a graft, cage, or spacer. 1
Early attempts at vertebral fusion using bone grafts were limited by high rates of collapse and pseudarthrosis. However, the development of intervertebral cages has significantly mitigated these complications. 2 Titanium (Ti) and polyetheretherketone (PEEK) are currently the 2 most commonly used materials in interbody cage construction, each with distinct advantages and limitations. 3 Ti is valued for its biocompatibility and ability to promote bone growth through surface modifications, but its elastic modulus mismatch with bone and high radiodensity can lead to complications such as stress shielding, subsidence, and difficulty assessing fusion on imaging. 3 PEEK cages offer radiolucency, which allows better visualization on postoperative MRI, along with high durability and favorable wear properties.4,5 They also possess an elastic modulus more similar to that of native bone, which typically leads to lower rates of subsidence. 6 While Ti remains the more widely used material for interbody cage construction, PEEK’s radiolucency makes it preferable in specific indications, such as oncology and spondylodiscitis, where clear visualization of fusion is critical. 7 However, the research on clinical outcomes, especially fusion rates, for titanium vs PEEK cages shows mixed results. Various studies have reported conflicting findings, with some indicating superior fusion rates for PEEK, others favoring titanium, and some finding no significant differences between the 2 materials.8-10
Spine surgeons rely on the literature to guide clinical decision making, such as choosing the appropriate cage for their patients. Due to time constraints, they frequently consult abstracts rather than full texts. Therefore, it is essential that both the abstract and full texts remain free of bias. One such type of bias commonly found in the medical literature is spin, where authors distort the results to overstate efficacy or understate harm, particularly in the abstracts of systematic reviews and meta-analyses.
This study aims to quantify and analyze spin in the abstracts of systematic reviews and meta-analyses comparing radiographic and clinical outcomes between Ti and PEEK cages in spinal fusion. Additionally, we will assess the quality of these studies using A Measurement Tool to Assess systematic Reviews 2 (AMSTAR 2) criteria. 11 Based on prior findings of high spin rates in orthopaedic research, we hypothesize that more than 30% of the studies comparing Ti vs PEEK cages in spinal fusion will contain at least 1 form of spin.12-14
Methods
Training
This study adhered to the guidelines defined by Preferred Reporting Items for Systematic reviews and Meta-analyses (PRISMA). 15 Article reviewers (AP and KM) were trained by HA to ensure consistency in article selection and spin bias assessment. Training included instruction on inclusion criteria and spin bias evaluation. Specifically, reviewers were provided with an example of an article that met the inclusion criteria to guide them in identifying eligible studies during screening. For spin bias identification, reviewers were trained by analyzing a previously published spin study in the spine literature and replicating its findings to enhance reliability in detecting spin bias. 12 Following training, a search was conducted by HA on March 3, 2025 using PubMed, Embase, and Web of Science with a keyword search for: ((systematic review OR meta-analysis) AND (Titanium) AND (PEEK OR Polyetheretherketone) AND (Spine fusion)).
Inclusion Criteria
All systematic reviews and meta-analyses comparing PEEK and titanium cages in spinal fusion were screened for inclusion by 2 independent reviewers via Covidence. The inclusion criteria were: (1) a systematic review (with or without a meta-analysis); (2) treatment with spinal fusion surgery; (3) comparison of PEEK and titanium cages; (4) available in English. No exclusions were made based on the date of publication or country of origin.
Review Process
In 2016, Yavchitz et al defined 21 types of spin and ranked them in order of severity. 16 For this study we chose to evaluate the included articles for only the 9 highest-ranked types of spin, for 2 reasons. First, because these types of spin are ranked as the most severe, we believe them to be the most important to address in the current literature. Second, these 9 types of spin are the most well-defined, minimizing subjective interpretation by the 2 independent reviewers. The approach of assessing only these 9 most severe spin types has also been employed in previous orthopaedic spin studies.12-14 All included articles were screened for spin bias by 2 independent reviewers. Disputes were resolved through arbitration by a third reviewer.
After being evaluated for spin, each article was graded using the AMSTAR 2 (A Measurement Tool to Assess Systematic Reviews 2) criteria. 11 AMSTAR 2 measures the methodological quality of systematic reviews, assigning an overall rating that ranges from critically low to high based on the presence or absence of specific objective benchmarks. 11 Lastly, the journal, year of publication, journal impact factor, funding source, yearly citations, and the number of times cited by other studies were recorded for each article. These data were collected on March 3, 2025 and were included to explore potential associations with the presence of spin.
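As a rough illustration of how an overall AMSTAR 2 rating can be derived from item-level judgments, the sketch below follows the rating scheme in the published AMSTAR 2 guidance: more than one critical flaw gives critically low, one critical flaw gives low, more than one non-critical weakness gives moderate, and anything else gives high. The function name, the simplification of each item to met or not met, and the default set of critical items are assumptions made here for illustration. The text above does not state which items were treated as critical, and the guidance allows appraisers to adjust that set.

```python
from typing import Iterable, Set

# Critical items as listed in the published AMSTAR 2 guidance.
# Assumption: the article does not say which items its authors treated as
# critical, so this default is illustrative and can be overridden.
DEFAULT_CRITICAL_ITEMS: Set[int] = {2, 4, 7, 9, 11, 13, 15}


def amstar2_overall_rating(
    failed_items: Iterable[int],
    critical_items: Iterable[int] = DEFAULT_CRITICAL_ITEMS,
) -> str:
    """Map the AMSTAR 2 items a review failed (numbered 1-16) to a rating.

    Simplification: each item is treated as met or not met; "partial yes"
    responses are not modelled.
    """
    failed = set(failed_items)
    critical = set(critical_items)
    critical_flaws = len(failed & critical)
    noncritical_weaknesses = len(failed - critical)

    if critical_flaws > 1:
        return "critically low"
    if critical_flaws == 1:
        return "low"
    if noncritical_weaknesses > 1:
        return "moderate"
    return "high"


# Hypothetical review that failed items 3 and 10 (both non-critical by default).
print(amstar2_overall_rating([3, 10]))
```

Passing a different `critical_items` set shows how sensitive the overall rating is to that choice: the same failed items can yield a different category depending on which ones are counted as critical.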
Statistical Analysis
Covidence was used to organize, screen, and conduct a full text review of the aforementioned articles. Spin and AMSTAR 2 categorizations were collected and organized using an Excel spreadsheet. Subsequently, a linear regression model was used to independently determine the presence or absence of an association between AMSTAR 2 rating, yearly number of citations, and journal impact factor with the quantity of spin (0-9) found in abstracts. In addition, logistic regression modeling was used to determine whether AMSTAR 2 rating, yearly citations, and/or journal impact factor were associated with the presence of spin in abstracts. Finally, multiple regression modeling was used to determine whether the presence or absence of spin could be predicted using AMSTAR 2 rating, journal impact factor, and yearly citations. All data were analyzed using IBM SPSS 29.
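For readers who want to see the mechanics of the linear model described above, the following is a minimal sketch of a single-predictor ordinary least squares fit written with the Python standard library. It is not the SPSS procedure the authors ran, and it does not compute P values. The function name `simple_ols` and the example values are hypothetical placeholders, not data from the included reviews.

```python
from statistics import mean
from typing import Sequence, Tuple


def simple_ols(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared


# Placeholder values invented for illustration only; NOT the study's data.
impact_factor = [1.2, 1.8, 2.1, 2.5, 3.0, 3.8]
spin_count = [1, 0, 2, 0, 1, 0]

slope, intercept, r_squared = simple_ols(impact_factor, spin_count)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.3f}")
```

A slope near zero with a low R² would correspond to the kind of non-significant association the regression models are designed to detect or rule out. A full analysis would also need standard errors and P values from a statistics package.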
Results
The literature search resulted in 2352 articles (Figure 1). Of these, 13 met the inclusion criteria.6-10,17-24 The year of publication ranged from 2017 to 2025. The impact factors for the journals ranged from 1.17 to 3.8, with an average of 2.31. The number of times each article was cited by another study ranged from 0 to 83, with an average of 13.58. Of the 13 studies, 10 included a meta-analysis and 3 received funding (Supplemental Table 1).
Figure 1. PRISMA flowchart.
Spin Rating
Nine Most Severe Types of Spin.

Total number of spin by article.
AMSTAR 2 Rating
Characteristics of Studies Which Did and Did Not Contain Spin.
Multivariate Regression Modeling
Our first model utilized linear regression, with total types of spin serving as our dependent variable. The AMSTAR category (P = .797), citations per year (P = .791), as well as impact factor (P = .548) all fell short of significance in predicting total types of spin. In our second model, we utilized a binary logistic regression to predict presence of spin. Similarly, this model did not significantly associate AMSTAR category (P = .328), citations per year (P = .889), or impact factor (P = .550) with the presence of spin. Lastly, in our multiple regression model, we did not find AMSTAR category (P = .416), citations per year (P = .918), or impact factor (P = .629) to significantly predict presence of spin.
Discussion
As intraoperative technologies advance, and more options are brought to the market, surgeons increasingly rely on the literature to select the most appropriate devices for each patient. Therefore, it is vital to ensure that these studies maintain high quality and remain free from reporting biases such as spin. In this study, we utilized a grading process similar to those employed in prior orthopaedic spin studies, which have demonstrated strong interrater reliability.12-14 By systematically examining spin bias, we aimed to enhance the credibility of both the full texts and abstracts of systematic reviews and meta-analyses comparing PEEK vs titanium cages in spinal fusion. This is particularly important as spinal fusion has become one of the most common surgical procedures and is expected to continue to rise with the aging population. 25
We found that 8 out of the 13 (61.54%) included studies exhibited at least one of the 9 most severe forms of spin. This alarming rate aligns with previous orthopaedic literature, where spin is reported to occur in 28% to 93.1% of the studies.13,14,26-28 The most prevalent type of spin was Type 3 spin, characterized by the selective reporting of or overemphasis on efficacy outcomes or analysis favoring the beneficial effect of the experimental intervention. It was found in 5/13 (38.46%) of the included studies. For example, the abstract of one meta-analysis concluded that titanium cages were superior to PEEK cages due to higher rates of spinal fusion. However, the results section revealed no significant difference in final follow-up fusion rates between the 2 groups, while titanium cages had higher rates of subsidence. 10 Similarly, in another study, the authors concluded that titanium cages were superior due to higher fusion rates, yet at the final follow-up, there was no difference in fusion rates, complication rates, or reoperation rates between the 2 groups. 17
The second most prevalent spin type was Type 5 spin, found in 4 out of 13 (30.77%) of the studies. This type of spin involves claiming a beneficial effect of the experimental treatment despite a high risk of bias in primary studies. For instance, a meta-analysis comparing titanium and PEEK cages for anterior cervical discectomy and fusion (ACDF) favored PEEK in the abstract due to higher subsidence rates for titanium cages. 8 However, the discussion revealed that only one of the primary studies was blinded. In addition, because the meta-analysis included only 4 articles, its results were prone to being skewed by bias. Another study comparing these cages in ACDF reported higher fusion rates for PEEK in the abstract, but noted significant heterogeneity in study design in the full text, which compromised the authors' ability to perform a reliable meta-analysis. 20 Additionally, another study reported in its abstract that while fusion rates were comparable between titanium and PEEK cages, subsidence rates were higher in the titanium group. 7 However, in the full text, the authors reported that 5/6 (83.33%) of the primary studies included in their analysis were of very low quality, with significant heterogeneity in outcome measures.
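The prevalence figures quoted above follow directly from the reported counts. As a small arithmetic check, the sketch below recomputes each percentage from its numerator and the 13 included reviews. The helper name `pct` and the dictionary layout are illustrative choices, not part of the study's workflow.

```python
def pct(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to 2 decimal places."""
    return round(100 * count / total, 2)


# (count, total, percentage as reported in the text)
reported = {
    "any of the 9 spin types": (8, 13, 61.54),
    "Type 3 spin": (5, 13, 38.46),
    "Type 5 spin": (4, 13, 30.77),
}

for label, (count, total, stated) in reported.items():
    computed = pct(count, total)
    status = "matches" if computed == stated else "differs from"
    print(f"{label}: {count}/{total} = {computed}% ({status} reported {stated}%)")
```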
The AMSTAR 2 tool, widely used to assess the methodological quality of systematic reviews and meta-analyses, was applied to the included studies. In the analysis, 1/13 of the studies was rated as critically low quality, 4/13 as low quality, and 8/13 as moderate quality. None achieved a high-quality rating. Notably, no studies met the criteria for providing justifications for excluded studies or reporting on funding sources in included studies, both of which are essential for a high-quality rating. Given that industry funding is common in primary studies comparing cage types, the lack of transparency regarding funding sources represents a significant methodological weakness across all reviews. Conversely, all studies fulfilled the criteria for explaining study design selection and describing included studies in adequate detail. The only study rated as critically low quality was a systematic review, which failed to address key factors such as potential sources of bias, funding, and heterogeneity in the primary studies. 6 The generally low AMSTAR 2 scores across the studies may reflect both inherent weaknesses in the research comparing PEEK and titanium cages in spinal fusion and the tool’s stringent grading criteria. For instance, non-critical methodological shortcomings (such as failing to provide a list of excluded studies, a criterion that none of the studies met) do not necessarily undermine the overall methodological rigor but still preclude a study from receiving a high-quality rating. Conversely, more substantial flaws, such as failing to assess bias in primary studies or address heterogeneity in meta-analysis results, significantly compromise methodological integrity and weaken the reliability of study conclusions.
Several factors contribute to the high prevalence of spin in the medical literature. The immense pressure on researchers to publish frequently in high impact journals may incentivize them to present their findings in such a way that maximizes their likelihood of acceptance. Furthermore, journals currently lack explicit guidelines to detect and prevent spin bias. Authors are also often constrained by strict character limits for abstracts, which can prevent thorough discussions of limitations and biases. Regardless of the cause, spin significantly undermines research quality and poses a serious risk to clinical decision-making, particularly in spine surgery.
Due to time constraints, surgeons often rely primarily on abstracts of systematic reviews, where authors may introduce spin bias by overstating the safety and efficacy of an intervention or emphasizing non-significant findings. This can lead to misinterpretation of data and potentially impact surgical decision-making, particularly when selecting the optimal implant material. 29 Boutron et al demonstrated that spin bias in abstracts influences physician perceptions, with clinicians more likely to rate a treatment as beneficial when spin is present in the abstract. 30 In the context of cage selection, this misinterpretation may lead to suboptimal implant choices, as certain indications favor 1 cage type over another. For instance, PEEK cages are often favored in cases of spondylodiscitis and oncology due to their radiolucency. Inappropriate cage selection could contribute to higher rates of complications such as pseudarthrosis and subsidence, potentially leading to increased reoperation rates and poorer patient outcomes.
To mitigate spin bias, journals must enforce stricter adherence to reporting standards such as PRISMA, which already exists, as well as broader implementation of guidelines like ARRIVE and CONSORT, which have been shown to improve research transparency.31,32 Journal editors should receive formal training to recognize spin bias during the peer-review process and ensure that conclusions accurately reflect the study results. Requiring authors to submit raw data and statistical analyses could further enhance transparency. Chiu et al suggested that journal editors carefully compare abstract conclusions with the full manuscript to prevent causal overstatements and overgeneralization. 33 Additionally, a cultural shift in research is needed, prioritizing high-quality, transparent studies over those that emphasize only positive findings. 33 Notably, a randomized clinical trial demonstrated that cautious reporting of correlational findings does not deter reader interest. 34 From a clinical perspective, surgeons should critically evaluate full manuscripts rather than relying solely on abstracts. This includes scrutinizing the results section to ensure that the conclusions presented in the abstract are supported by statistically significant findings. Additionally, surgeons should cross-reference findings across multiple systematic reviews rather than relying on a single review for clinical decision-making. Clinicians should prioritize studies published in high-impact peer-reviewed journals. Although the current study did not find a correlation between spin bias and journal impact factor, previous research has suggested such an association. 35 Furthermore, surgeons should engage in journal club discussions focused on identifying potential biases and critically appraising study methodology. 
Lastly, surgeons should be particularly cautious when reviewing industry-sponsored studies, ensuring that, when possible, conclusions align with recommendations from professional spine societies.
This study has limitations. Because AMSTAR 2 and spin grading involve subjective judgment, there is potential for human error. This was mitigated by adopting protocols from previous spin studies and by having 2 authors independently evaluate the articles, with a third resolving disagreements. The small sample size of included articles and the possibility of missing relevant articles in the literature search may have caused us to over- or underestimate the prevalence of spin. Also, potentially because of the small sample size, we did not find a significant association between spin bias and article characteristics, including journal impact factor and year of publication; however, other studies have noted this association. 35 Furthermore, this study was underpowered, limiting our ability to identify potential associations in statistical analyses. Lastly, the analysis included studies that examined cages made not only of pure PEEK but also of surface-modified PEEK, which may have introduced a confounding variable. Surface modifications may influence fusion rates, biomechanical properties, and subsidence risk, which could, in turn, affect the way study conclusions are framed. This heterogeneity in PEEK cage composition may contribute to variability in spin bias across studies. Future studies evaluating spin bias may benefit from stratifying analyses based on the specific type of PEEK cage used to determine whether spin is more prevalent in studies evaluating pure PEEK vs surface-modified PEEK.
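To give a sense of how much uncertainty a sample of 13 reviews carries, the sketch below computes a Wilson score interval for the observed proportion of 8/13. This interval is not an analysis reported by the authors. The choice of the Wilson method, the 95% level (z = 1.96), and the function name are assumptions made here purely for illustration.

```python
from math import sqrt
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# 8 of the 13 included reviews contained at least one type of spin.
low, high = wilson_interval(8, 13)
print(f"Observed: {8 / 13:.1%}; approximate 95% interval: {low:.1%} to {high:.1%}")
```

Under these assumptions the interval spans roughly 36% to 82%, which is consistent with the caution above that the prevalence estimate may be imprecise.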
Despite these limitations, this study provides the first systematic assessment of spin bias in literature comparing PEEK and Ti cages, an ongoing and highly debated topic in spine surgery.
Conclusion
We identified spin in 61.54% of the systematic reviews and meta-analyses comparing PEEK vs titanium cages in spinal fusion. Acknowledging and addressing spin bias is essential for improving the integrity of scientific research. Further efforts are needed to develop and implement strategies that reduce spin bias and enhance the quality of published literature.
Supplemental Material
Supplemental Material for Polyetheretherketone vs Titanium Cages in Spinal Fusion: Spin Bias in Abstracts of Systematic Reviews and Meta-Analyses by Henry Avetisian, Apurva Prasad, Kevin Mathew, David McCavitt, William J. Karakash, Dil Patel, Jeffrey C. Wang, Raymond J. Hah, Ram K. Alluri in Global Spine Journal
Footnotes
Declaration of Conflicting Interest
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Henry Avetisian, Apurva Prasad, Kevin Mathew, David McCavitt, and William Karakash have nothing to disclose. Jeffrey C. Wang has received intellectual property royalties from Zimmer Biomet, NovApproach, SeaSpine, and DePuy Synthes. Raymond J. Hah has received grant funding from SI bone, consulting fees from NuVasive, and support from the North American Spine Society to attend meetings. Ram K. Alluri has received grant funding from NIH, consulting fees from HIA Technologies, and payment from Eccential Robotics for lectures and presentations.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical Statement
Data Availability Statement
The data are not publicly available but can be made available upon request.
Supplemental Material
Supplemental material for this article is available online.
References
