Abstract
Study Design:
Systematic review.
Objectives:
To assess the methodological quality of systematic reviews and meta-analyses in spine surgery over the past 2 decades.
Materials and Methods:
We conducted a systematic review, independently and in duplicate, of systematic reviews and meta-analyses published between 2000 and 2019 in PubMed Central and the Cochrane Database pertaining to spine surgery involving a surgical intervention. We searched bibliographies to identify additional relevant studies. Methodological quality was evaluated with the AMSTAR score and graded with the AMSTAR 2 criteria.
Results:
A total of 96 reviews met the eligibility criteria, with a mean AMSTAR score of 7.51 (SD = 1.98). Based on AMSTAR 2 criteria, 13.5% (n = 13) and 18.7% (n = 18) of the studies had high and moderate levels of confidence in their results, respectively, without any critical flaws. A total of 29.1% (n = 28) of the studies had 1 critical flaw and 38.5% (n = 37) had more than 1 critical flaw, so that their results have low and critically low confidence, respectively. Failure to analyze the conflicts of interest of the authors of the primary studies included in the review and the lack of a list of excluded studies with justification were the most common critical flaws. Regression analysis demonstrated that funding and more recent year of publication were significantly associated with higher methodological quality.
Conclusion:
Despite improvement in the methodological quality of systematic reviews and meta-analyses in spine surgery in the current decade, a substantial proportion continue to show critical flaws. With the increasing number of review articles in spine surgery, stringent measures must be taken to maintain methodological quality by following the PRISMA and AMSTAR guidelines, in order to attain higher standards of evidence in the published literature.
Keywords
Introduction
In the paradigm of evidence-based medicine, systematic reviews and meta-analyses sit at the top of the hierarchy. 1,2 By pooling the results of independent studies, a quantitative meta-analysis increases the precision and statistical power with which the effect of a treatment on a given population can be detected. 3-5 A well-conducted quantitative review may resolve discrepancies between studies with conflicting results. Health care decision makers rely on systematic reviews as one of the key tools to achieve evidence-based health care. 6 Hence, maintaining the quality of such evidence remains a top priority.
With the increasing number of articles in the field of spine surgery, surgeons depend on systematic reviews and meta-analyses as their primary source of scientific evidence. Dijkman et al 7 showed a 5-fold increase in the number of meta-analyses in the orthopedic literature between 2005 and 2008. The quality of such meta-analyses depends on the quality of the primary studies included in the review and the methodological rigor with which the review was conducted.
The quality of reporting of a review reflects the authors' ability to write in a comprehensible manner more accurately than it reflects the way the review was conducted, which highlights the need for guidelines that evaluate the way in which reviews are planned and conducted. Although guiding principles for assessing methodological quality were given by AMSTAR 8 (A MeaSurement Tool to Assess systematic Reviews), and reporting quality guidelines were given by the PRISMA 9 (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) and MOOSE 10 (Meta-analyses Of Observational Studies in Epidemiology) statements, the reporting and methodological quality of systematic reviews and meta-analyses remains less than optimal. 11
The objective of this review was to assess the methodological quality of this evidence in spine surgery over the past 2 decades.
Materials and Methods
The methodology and reporting of our systematic review follow the PRISMA and AMSTAR 2 guidelines, which consist of a 27-item checklist and a 16-point assessment, respectively, intended to help authors improve the conduct and reporting of systematic reviews and meta-analyses.
Eligibility Criteria
To be included in our study, a study had to meet the following criteria: (1) it was a systematic review or meta-analysis; (2) it was related to spine surgery and involved at least 1 surgical intervention; and (3) it was published in English between January 2000 and September 2019.
Exclusion Criteria
We excluded (1) diagnostic studies involving spinal classification, cadaver studies, and studies of spinal rehabilitation; (2) studies without any comparison group for the surgical intervention being evaluated; and (3) studies of interventions applied to perioperative variables, such as deep vein thrombosis prophylaxis.
Study Identification
We performed a computerized search of PubMed Central and the Cochrane Database for Systematic Reviews for the period between 2000 and 2019 with the following terms and Boolean operators: (“spine” OR “spinal”) AND (“surgery” OR “methods” OR “procedure” OR “fracture” OR “infection” OR “deformity”). The results of the search were filtered by publication type to isolate meta-analyses and systematic reviews. The bibliography of each meta-analysis was reviewed by both authors to look for additional relevant studies. Both authors independently reviewed the title of each article retrieved from the search for its relevance and excluded studies for the reasons given in the flow diagram in Figure 1. After title screening, abstract and full-text screening were done by both authors independently. Any discrepancy was settled by consensus. Agreement between the 2 authors at each stage of screening was assessed with weighted kappa scores. 12 An intraclass correlation coefficient was used for quality appraisal.
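The agreement statistic named above can be illustrated with a minimal sketch of Cohen's weighted kappa for 2 raters. The function name `weighted_kappa`, the category labels, and the example ratings are assumptions made for illustration only; they are not the study's data, and the study's own analysis was run in SPSS. For a 2-category decision such as include/exclude, the weighted and unweighted forms give the same value.

```python
def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Cohen's weighted kappa for two raters over ordered categories.

    Illustrative helper (not from the source study). `weights` is
    "linear" or "quadratic" and sets how disagreement grows with the
    distance between categories.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions for every pair of categories.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    obs = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if exp == 0:
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1 - obs / exp


# Hypothetical title-screening decisions from two reviewers.
reviewer_1 = ["include", "include", "exclude", "exclude", "include",
              "exclude", "exclude", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "include", "exclude", "exclude", "exclude",
              "exclude", "exclude", "exclude", "exclude", "exclude"]
kappa = weighted_kappa(reviewer_1, reviewer_2, ["exclude", "include"])
```

In this hypothetical example the reviewers agree on 9 of 10 titles, but kappa is lower than 0.9 because part of that agreement is expected by chance given how often each reviewer excludes.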

Figure 1. PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flowchart showing the search results and the reasons for exclusion of articles.
Assessment of Methodological Quality
Each eligible study was independently reviewed by both authors for methodological quality with the AMSTAR tool, a scoring tool with 11 domains of assessment, and with its recent update, AMSTAR 2, a critical appraisal tool for systematic reviews that include randomized studies, nonrandomized studies, or both, of health care interventions. We used these tools rather than the Oxman and Guyatt index, 13 which was used previously by Bhandari et al 14 and Dijkman et al. 7
The AMSTAR tool was developed as an improvement on the Oxman and Guyatt index and the Sacks Assessment of Quality Checklist. 8 For each of the 11 questions in the AMSTAR tool, a score of 1 was given if the criterion was met and a score of 0 was given if it was not met or the information was not available. AMSTAR 2 15 retains 10 of the original domains of AMSTAR and has 16 domains in total, has simpler response categories than the original AMSTAR, and has an overall grading based on the critical domains shown in Table 1. AMSTAR 2 was not designed to create an overall score combining individual items; instead, it considers the potential impact of an inadequate item on the validity of the study results and grades studies into 4 categories, as shown in Table 2. Any discrepancy in scoring was resolved by consensus.
Table 1. Critical Domains of Methodological Quality Assessment Based on AMSTAR 2 Guidelines.
Table 2. Grading of the Systematic Reviews and Meta-Analyses Based on AMSTAR 2 Criteria and Its Inference on Confidence of the Results of the Review.
a Multiple noncritical weaknesses may diminish confidence in the review and it may be appropriate to move the overall appraisal down from moderate to low confidence.
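The 2 appraisal schemes described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' scoring procedure: the function names `amstar_score` and `amstar2_confidence` are hypothetical, and the grading thresholds follow the categories as they are described in this article (no or 1 noncritical weakness, more than 1 noncritical weakness, 1 critical flaw, more than 1 critical flaw).

```python
def amstar_score(items_met):
    """Original AMSTAR: 1 point per criterion met, across 11 items.

    Each entry is True (met), False (not met), or None (information
    not available); the last two both score 0.
    """
    if len(items_met) != 11:
        raise ValueError("AMSTAR has exactly 11 items")
    return sum(1 for met in items_met if met)


def amstar2_confidence(critical_flaws, noncritical_weaknesses):
    """AMSTAR 2 overall confidence from counts of flaws and weaknesses.

    The optional downgrade from moderate to low for multiple
    noncritical weaknesses is a reviewer judgment and is not modeled.
    """
    if critical_flaws < 0 or noncritical_weaknesses < 0:
        raise ValueError("counts must be non-negative")
    if critical_flaws > 1:
        return "critically low"
    if critical_flaws == 1:
        return "low"
    if noncritical_weaknesses > 1:
        return "moderate"
    return "high"


# Hypothetical review: 8 criteria met, 2 not met, 1 not reported.
example_score = amstar_score([True] * 8 + [False] * 2 + [None])
example_grade = amstar2_confidence(critical_flaws=1, noncritical_weaknesses=3)
```

The sketch makes the contrast between the tools explicit: a review can collect a fairly high AMSTAR score and still be graded low by AMSTAR 2, because a single critical flaw determines the grade regardless of how many other items are met.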
Data Extraction
For every eligible study, the relevant data were extracted in duplicate, with discrepancies resolved by consensus. The following data were gathered from every study: name of the journal; type of journal (surgical/nonsurgical); year of publication; category of spine surgery, for which we arbitrarily assigned each article to 1 of 5 categories (trauma, infection, degeneration, deformity, or miscellaneous); and financial support (yes/no).
Data Analysis
Prior to analysis of the data, we developed hypotheses regarding the quality and quantity of studies over time. We hypothesized that the number of studies would have increased since 2010 and that the quality of studies published from 2010 to 2019 would be higher than that of those published prior to 2010. We also hypothesized that meta-analyses from the Cochrane Collaboration 16 would have higher quality scores than non-Cochrane meta-analyses. The mean AMSTAR score was calculated for all studies, and the frequency distribution of the individual AMSTAR assessments was ascertained. An independent t test was used to compare the mean quality scores between studies published in Cochrane and in other journals. We examined the effect of independent variables such as publication year, type of journal (surgical/nonsurgical), study category, and financial support on the dependent variable (overall quality score). The variables that showed a significant association with the quality of meta-analyses in univariate analysis were entered into a multiple regression model. The results of all these analyses were reported with 95% confidence intervals (CIs). For all statistical analyses, a P value less than .05 was considered significant. Statistical analysis was done with IBM SPSS Statistics for Windows Version 25 (IBM Corp, Armonk, NY, USA).
Source of Funding
No external funding was used for this study.
Results
Study Identification
A total of 4855 potentially relevant articles were identified in the initial search: 3877 (79.8%) from PubMed Central and 978 (20.2%) from the Cochrane Database for Systematic Reviews. After 745 duplicates were removed, 4110 articles were screened by title, 391 relevant articles were reviewed by abstract, and 172 articles were found eligible for full-text review, of which 65 were included in the study. Together with 9 articles from the bibliographic search and 22 articles from the journal search, a total of 96 articles were included in the study, as shown in Table 3. Agreement between the authors for title, abstract, and full-text screening was substantially high (k = 0.92, 95% CI = 0.90-0.93; k = 0.84, 95% CI = 0.82-0.89; k = 0.89, 95% CI = 0.87-0.91, respectively).
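The screening counts in this paragraph can be tallied with a short arithmetic sketch. All numbers are taken from the text; the variable names are illustrative. Summing the 3 sources of included articles gives 96, which is the number of reviews reported in the Abstract and in the remainder of the Results.

```python
# Counts as reported in the Study Identification paragraph.
identified = {"PubMed Central": 3877, "Cochrane Database": 978}
duplicates_removed = 745

total_identified = sum(identified.values())
title_screened = total_identified - duplicates_removed

# Articles included from each source.
included_by_source = {
    "database search": 65,
    "bibliographic search": 9,
    "journal search": 22,
}
total_included = sum(included_by_source.values())

print(total_identified, title_screened, total_included)  # prints: 4855 4110 96
```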
Table 3. Summary of Characteristics of the Included Articles With Their Methodological Quality Scores.
Abbreviations: BMC, BioMed Central; BMJ, British Medical Journal; CORR, Clinical Orthopaedics and Related Research; ESJ, European Spine Journal; GSJ, Global Spine Journal; JBI-SRIR, Joanna Briggs Institute Database of Systematic Reviews and Implementation Reports; JBJS, Journal of Bone and Joint Surgery; PLOS, Public Library of Science.
Methodological Quality—AMSTAR Score
The 96 articles included in the study were published in 31 different journals, as shown in Figure 2. Of these, Global Spine Journal, European Spine Journal, Spine Journal, and Cochrane Reviews contributed around 55.2% of the publications. The mean AMSTAR score of all 96 articles was 7.51 (SD = 1.98). In the journal-wise analysis of quality scores, Cochrane had the reviews with the highest methodological quality, whereas the other journals that contributed more to the subject, such as Global Spine Journal, European Spine Journal, and Spine Journal, lacked such methodological standards, as shown in Table 4.

Figure 2. The distribution of included articles based on the journal of publication. GSJ, Global Spine Journal; ESJ, European Spine Journal; CORR, Clinical Orthopaedics and Related Research; JBJS, Journal of Bone and Joint Surgery.
Table 4. Arrangement of Journal-Wise Distribution of the Selected Articles Based on Their Methodological Quality by AMSTAR Score.
Abbreviations: PLOS, Public Library of Science; JBI-SRIR, Joanna Briggs Institute Database of Systematic Reviews and Implementation Reports; BMC, BioMed Central; CORR, Clinical Orthopaedics and Related Research; JBJS, Journal of Bone and Joint Surgery.
Methodological Quality—AMSTAR 2 Grades
When we graded these articles with the AMSTAR 2 tool to determine the impact of methodological flaws on the confidence in the study results, we found that only 13.5% (n = 13) of the articles had a high level of confidence, with no or 1 noncritical weakness in methodological quality, while 18.7% (n = 18) had a moderate level of confidence, with more than 1 noncritical weakness. Hence, only 32.2% of the articles were without any critical flaws.
The critical domains of methodological assessment given by AMSTAR 2 are shown in Table 1. A deficiency in any of them results in a critical flaw in methodological quality and reduces the confidence in the results derived from the review. We found that 29.1% (n = 28) of the studies had 1 critical flaw and 38.5% (n = 37) had more than 1 critical flaw, giving their results low and critically low confidence, respectively.
Out of the critical domains evaluated for methodological quality, 82.2% (n = 79) of articles did not evaluate the conflict of interest of authors of primary studies involved in the review; 71.85% (n = 69) did not provide the link or list of studies excluded from the review; 64.5% (n = 62) lacked additional gray literature search other than electronic published articles; 44.7% (n = 43) did not evaluate publication bias by any statistical methods like funnel plot.
Publication and Research Trend
The largest number of studies was published in 2016 (n = 17, 18.1%). There was a significant positive correlation between the year of publication and the AMSTAR score (r = 0.658, P = .008). This suggests that the mean AMSTAR score has increased significantly over the years, with the peak score reached in 2015. A nearly 5-fold increase in published studies was observed in the current decade (2010-2019) compared with the previous decade (2000-2009).
The 3 most commonly researched areas of interest in orthopedic spine surgery were spinal degeneration (n = 62), spinal trauma (n = 16), and spinal deformity (n = 6). Among the various pathologies involved in degenerative spinal conditions, the most commonly analyzed surgeries were for the treatment of disc disease (n = 39) and spondylolisthesis (n = 16).
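The correlation reported above is a Pearson coefficient, which can be sketched as follows. The function name `pearson_r` and the year and score values are hypothetical and chosen only to show the calculation; they are not the study's data and will not reproduce r = 0.658.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical publication years and mean AMSTAR scores per year.
years = [2004, 2008, 2012, 2016]
mean_scores = [5.0, 7.0, 6.0, 8.0]
r = pearson_r(years, mean_scores)
```

A value of r close to 1 indicates that scores rise steadily with publication year; the sign and size of r say nothing about whether any single year's reviews meet an acceptable standard.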
Regression Analysis
We performed linear regression analysis to determine the association between potential prognostic variables (publication year, type of journal, study category, and financial support) and the quality of the study. Univariate analysis, in which each variable is analyzed independently of the others, showed that funding and year of publication were significantly associated with the quality of the study. Similarly, on multivariate analysis, which considers all the factors together, both remained significantly associated with methodological quality, as shown in Table 5.
Table 5. Regression Analysis of Factors Associated With Study Quality.
a The values are given as beta coefficient with the 95 percent confidence interval in parentheses.
Cochrane Versus Rest of Journals
We compared the methodological quality of articles published by the Cochrane Collaboration with that of articles in the rest of the journals and found a significant difference in mean AMSTAR scores (t(36.926) = 11.705, P < .001). The mean AMSTAR score of Cochrane reviews was 3.34 points higher than the mean score of the rest of the journals.
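The fractional degrees of freedom in the reported t statistic suggest, although the text does not state it, that an unequal-variance (Welch) form of the independent t test was used. A minimal sketch of that calculation follows, assuming Welch's test; the function name `welch_t` and the 2 score lists are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("t is undefined when both samples are constant")
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical AMSTAR scores for two groups of reviews.
cochrane_scores = [10, 11, 10, 11]
other_scores = [6, 8, 7, 9, 5, 7]
t_stat, dof = welch_t(cochrane_scores, other_scores)
```

Because the degrees of freedom are estimated from the 2 sample variances, they are generally not whole numbers, which is consistent with a value such as 36.926.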
Discussion
We analyzed the methodological quality of all systematic reviews and meta-analyses related to a surgical intervention in spine surgery published from 2000 to 2019. Such reviews form the keystone on which a newer surgical intervention is accepted into or rejected from common clinical practice. 17 The main strengths of our review are its focused primary question, strict eligibility criteria, assessment of interreviewer agreement, and use of a validated measure to assess the methodological rigor of reviews.
The mean AMSTAR score of all included articles corresponded to only about two-thirds of the 11 domains of methodological quality. Although the methodological quality of the studies shows a rising trend over time, the overall score of the included studies was suboptimal. The top journals in spine surgery, such as Global Spine Journal, Spine, and European Spine Journal, which contributed 46.8% of the systematic reviews and meta-analyses of surgical interventions in spine, were of low methodological quality. The lack of methodological quality may be due to underreporting, lack of information, or inadequate methods.
True to our hypothesis, reviews from the Cochrane Collaboration had significantly higher methodological quality than those in the other journals. Their mean AMSTAR score was 3.34 points higher than the mean score of the rest of the journals. Our findings are consistent with other reviews showing that fewer Cochrane reviews have major to extensive flaws compared with reviews published in other journals. 18-21
In accordance with our hypothesis, the quality of systematic reviews and meta-analyses in spine surgery showed a significant positive correlation with the year of publication, and the number of publications showed a 5-fold increase in the last decade, as shown in Figure 3, which is in accordance with many reviews in orthopedic surgery. 11,14,15 We also found that funding and year of publication remained potential prognostic variables for the methodological quality of the study on multivariate analysis of publication year, type of journal (surgical/nonsurgical), study category (trauma/infection/degeneration/deformity/miscellaneous), and financial support. Hence, an increase in funding opportunities for future meta-analyses and systematic reviews might improve their methodological quality.

Figure 3. The publication trend based on the number of articles published per year and the quality trend based on their mean AMSTAR score per year.
Among the categories of surgical intervention in spine that were reviewed, degenerative disorders involving the intervertebral disc (n = 39) and spondylolisthesis (n = 16) accounted for the largest volume, owing to increased research aimed at better understanding their etiologies and to the evolution of newer modalities of surgical management.
The AMSTAR scale and other tools such as the Oxman and Guyatt index give a numerical score for the methodological quality of a meta-analysis or systematic review, but they do not take into account which domain an article fails to fulfill or the impact of that failure on the confidence in its conclusions. Hence, we chose to include AMSTAR 2 grading, which was developed as a critical appraisal tool to ascertain the impact of weaknesses and flaws in methodology on the overall confidence in the results. 14 Figure 4 shows the variability in the quality of the articles when the AMSTAR score and AMSTAR 2 grading were used for methodological assessment.

Figure 4. The variability in assessment of methodological quality based on numerical scores of AMSTAR and AMSTAR 2 grades.
Although the quality of the primary studies determines the value of a meta-analysis performed with them, the AMSTAR 2 guidelines provide a methodology for performing a quality meta-analysis with nonrandomized, observational studies. 14 However, the results obtained from such studies must always be approached with caution and with awareness of the potential limitations of the primary study designs.
Most of the reviews graded as having a low level of confidence in their results by AMSTAR 2 were deficient in the following domains: conflicts of interest of the authors of the primary studies included in the review, a link to or list of the studies excluded from the review, gray literature searches beyond electronically published articles, and evaluation of publication bias. Attention must be paid to these domains to improve the methodological quality of reviews published in the future.
The journals with high impact factors in the field of spine surgery contributed most of the articles included in our study; however, they did not show high methodological quality. Hence, impact factor is not a predictor of the quality of systematic reviews published in top cited journals. The impact factor is a measure of the frequency with which articles from a specific journal are cited in a particular year, 22 which can be artificially inflated by self-citation. Thus, there is still debate on how misleading the impact factor is as an indicator of the quality of systematic reviews.
Stringent methodological quality control for studies providing high-level evidence, such as systematic reviews and meta-analyses, is needed to allow the reader to assess the results of a review precisely. Bhandari et al 13 reported that only 15% of the 110 reviews they analyzed from 15 orthopedic journals during the year 2000 were rigorous. In our study, 32.2% of reviews were without any major methodological flaws, which is higher than in previous studies; however, this is still a small proportion of the reviews analyzed.
Our study also has some limitations. Despite a comprehensive search of the literature, potentially relevant high-quality studies may have been omitted for the following reasons: lack of a surgical intervention, language restriction, the limited period of analysis (2000-2019), unpublished articles, and publication bias against meta-analyses without significant findings. The use of AMSTAR scores and AMSTAR 2 grading to evaluate methodological quality might be a limitation, given that multiple tools are available for the same purpose. However, the systematic reviews and meta-analyses included in our review are likely to be a representative sample of the articles in spine surgery dealing with surgical intervention that would be readily accessible to anyone interested in the field.
The increased use of systematic reviews and meta-analyses for decision making in spine surgery has placed an increased demand on their methodological quality in order to obtain highly reliable results. 23,24 Although it appears easy in concept, the production of a high-quality systematic review or meta-analysis is highly demanding. It is essential for those planning a meta-analysis in the future to adhere to accepted methodologies and provide the best available evidence to address well-defined clinical questions.
Conclusion
Overall, the methodological quality of systematic reviews and meta-analyses in spine surgery involving a surgical intervention is less than optimal. Despite the improvement in the methodological quality of systematic reviews and meta-analyses in spine surgery in the current decade, a substantial proportion continue to show major to critical flaws as per AMSTAR 2, which affect the confidence in their results. With the increasing number of review articles being published in spine surgery, stringent measures must be taken to maintain methodological quality by following the PRISMA and AMSTAR guidelines, in order to attain higher standards of evidence in the literature. Funding of research projects might improve their methodological quality, provided they follow these guidelines.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
