Reducing Waste and Increasing the Usability of Psychiatry Research: The Family of EQUATOR Reporting Guidelines and One of Its Newest Members: The PRISMA-DTA Statement

Keywords

screening, diagnosis, epidemiology, evidence-based medicine, meta-analysis, methods, systematic reviews

In Canada, our provincial and territorial health systems are increasingly challenged to meet the needs of patients, including mental health needs. Research is needed to find ways to provide more effective and efficient care. Waste in research, however, presents a formidable barrier to achieving this.^1–6 One important source of waste in research is the incomplete and inaccurate reporting of published research results. Poor and inaccurate reporting poses a substantial barrier to effectively translating research into improved patient care, both in psychiatry and other areas of medicine.⁶ Assessing the utility and applicability of published findings requires an understanding of what actually occurred in a study and knowledge of the study’s complete set of results.⁷ This information, however, is often either not provided in published study reports or is not provided with enough clarity to be useful.^6,8 As a result, billions of dollars in research funding are wasted every year, and the risk of inadequate and misinformed care is needlessly heightened.^1–6

Biomedical research reporting guidelines have been developed with the goal of increasing the value and quality of research. These guidelines typically describe a minimum set of information that should be clearly reported, provide examples of guideline-consistent reporting, and include a checklist to facilitate compliance.⁹ The first Consolidated Standards of Reporting Trials (CONSORT) statement on the reporting of parallel group randomized controlled trials was published in 1996.¹⁰ The EQUATOR (Enhancing the QUAlity and Transparency Of health Research) initiative, which grew out of the work of CONSORT and other groups, was founded in 2008 with the mission of improving health research by promoting transparent, accurate, and complete reporting.¹¹ Other major reporting guidelines that have been developed by EQUATOR include the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) statement,^12,13 the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement,^14,15 and the Standards for Reporting of Diagnostic Accuracy (STARD) statement.^16,17 In recent years, EQUATOR reporting guidelines have expanded to include an extensive list of guidelines intended for specific trial designs (e.g., cluster randomized trials),¹⁸ for reporting specific types of outcomes (e.g., patient-reported outcomes),¹⁹ or for reporting specific types of systematic reviews and meta-analyses, including the Preferred Reporting Items for a Systematic Review and Meta-Analysis of Diagnostic Test Accuracy Studies (PRISMA-DTA) statement,²⁰ among other examples.

The publication of reporting guidelines has increased the completeness and quality of reporting. Pre-post studies of several reporting guidelines have shown that studies published after the publication of CONSORT, STARD, and PRISMA include more elements from the relevant guidelines than studies published before the guidelines.^21–23 Given the positive impact of guideline publication, it is not surprising that many medical journals encourage researchers to refer to guidelines when preparing their manuscripts for peer review, and some journals require authors to adhere to them and submit a completed checklist along with their manuscript. Journal endorsement of guidelines has improved the completeness of reporting as well. Across journals, studies published in journals endorsing the CONSORT statement report more CONSORT items than studies published in journals that do not endorse the CONSORT statement.^24,25 Within journals, similar results have been found, with more CONSORT items reported in studies published post- versus pre-CONSORT endorsement by the journals.²⁵ However, enforcement appears to be more effective than endorsement alone. One study found that journals that actively enforced the CONSORT for Abstracts guideline had a significant increase in number of items reported over time, whereas no such trends were found among journals that endorsed, but did not enforce, the guideline.²⁶

Despite evidence that reporting guidelines are effective in improving the completeness of study reports across different study designs, many journals do not require that authors use them. Indeed, we reviewed the author instructions of the top 10 psychiatry journals based on impact factor per Thomson Reuters InCites Journal Citation Reports (March 23, 2018) and found that only 3 (JAMA Psychiatry, Lancet Psychiatry, Biological Psychiatry) required the use of the appropriate EQUATOR reporting guideline in general; 2 others mentioned only a single reporting guideline, either CONSORT (American Journal of Psychiatry) or PRISMA (Acta Psychiatrica Scandinavica). Only 2 journals (JAMA Psychiatry, Biological Psychiatry) indicated in author instructions that authors must submit a checklist along with their manuscript to show that they reported their study consistent with the most appropriate EQUATOR reporting guideline. The Canadian Journal of Psychiatry, to its credit, requires that manuscripts submitted to the journal follow the appropriate EQUATOR reporting guideline and that authors submit the appropriate reporting guideline checklist with their manuscript.

In psychiatry, the use of screening and assessment tools is increasingly encouraged for the identification of patients with unrecognized conditions or as case-finding tools for patients suspected of having a disorder, including major depressive disorder.^27,28 Concerns have been raised, however, about the quality of primary studies in this area^29–31 and about the methodological quality and reporting of meta-analyses.^23,32 The recently published PRISMA-DTA statement, if followed, will help address these concerns.

The original PRISMA statement^12,13 is a reporting guideline for systematic reviews and meta-analyses that contains an evidence-based minimum set of items (27 items) that should be reported, along with a flow chart for illustrating the review process. The main focus of the original PRISMA guideline, however, was on systematic reviews and meta-analyses of randomized trials. Many elements of conduct and reporting are common to interventional studies and diagnostic test accuracy (DTA) studies. There are, however, important differences.

Thus, the PRISMA-DTA statement²⁰ was developed as an extension of the original PRISMA statement and aimed to reflect specific concepts, methods, language, and requirements for reporting of systematic reviews and meta-analyses of diagnostic test accuracy studies. It was designed to help ensure the applicability of reviews in practice by multiple potential users and to improve transparency and completeness of reporting of DTA systematic reviews and meta-analyses.

The PRISMA-DTA checklist consists of 27 items: 8 are identical to original PRISMA items (3, 5, 7, 9, 10, 16, 17, and 27); 17 were modified from PRISMA items (1, 2, 4, 6, 8, 11-14, 18-21, and 23-26), either because of ambiguous wording in the original items or to address reporting concerns specific to DTA systematic reviews and meta-analyses; and 2 are new. The first new item (D1) calls for a statement of the scientific and clinical background for using the test, including its intended use and clinical applicability and, if relevant, a rationale for minimally accepted accuracy. The same test can be used for different purposes. In psychiatry, for instance, a depression screening tool could be tested as a general population screening tool or as a case-finding tool among people suspected of having depression. The same tool could also be used for tracking progress of patients undergoing treatment. How the test is used can be an important source of variability in diagnostic accuracy. Furthermore, drawing conclusions about how a test may be used in practice depends on understanding minimal levels of acceptable test accuracy and reasons for setting such thresholds. The second new item (D2) requires the reporting of the statistical methods used for meta-analyses, if performed. Rigorously conducted meta-analyses of DTA studies require more sophisticated multivariate models (e.g., bivariate and hierarchical summary receiver-operating characteristic) than is often the case for meta-analyses of other types of studies in order to model the tradeoff between sensitivity and specificity and to account for correlation between sensitivity and specificity. Thus, greater attention to reporting of modeling procedures was considered essential.²⁰
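To illustrate why sensitivity and specificity are modeled jointly rather than pooled separately, the following minimal Python sketch computes the correlation between logit-transformed sensitivity and specificity across studies. It does not fit a bivariate or hierarchical model; it only shows the kind of negative between-study correlation that such models are designed to account for. The accuracy values and the helper names (`logit`, `pearson`, `study_accuracy`) are invented for illustration and are not taken from any study cited here.

```python
import math


def logit(p):
    """Log-odds transform commonly applied to proportions before pooling."""
    return math.log(p / (1 - p))


def pearson(xs, ys):
    """Pearson correlation coefficient, written out to avoid extra dependencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented (sensitivity, specificity) pairs for five hypothetical primary studies.
# Studies with higher sensitivity have lower specificity, as happens when
# studies differ in how leniently a positive result is defined.
study_accuracy = [(0.95, 0.60), (0.90, 0.70), (0.85, 0.80), (0.75, 0.88), (0.65, 0.93)]

logit_sens = [logit(se) for se, _ in study_accuracy]
logit_spec = [logit(sp) for _, sp in study_accuracy]

r = pearson(logit_sens, logit_spec)
print(f"Correlation between logit sensitivity and logit specificity: {r:.2f}")
```

With these invented values the correlation is strongly negative. Pooling sensitivity and specificity in two separate univariate analyses would ignore this dependence, which is one reason item D2 asks authors to state which model they used.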

The PRISMA-DTA statement also removed 2 items, items 15 and 22, from the original PRISMA statement. These items pertain to methods and results related to the risk of bias that may affect the cumulative evidence, including publication bias or selective reporting of results within studies. The rationale for this decision was that there is only limited evidence suggesting that publication bias and reporting bias are problematic for DTA studies.

However, in psychiatry research, selective cutoff reporting, whereby authors report only DTA results for cutoffs with reasonable sensitivity and specificity, was recently shown to result in biased estimates of pooled sensitivity and specificity when primary studies are meta-analyzed. A conventional meta-analysis of the Patient Health Questionnaire–9 (PHQ-9) depression screening tool found that pooled estimates of sensitivity actually improved as cutoff thresholds increased from 9 (less severe symptoms) to 11 (more severe symptoms), which would be mathematically impossible if complete data were analyzed.³³ This occurred due to selective reporting of results for cutoff thresholds only where the PHQ-9 performed well in individual primary studies but nonreporting for thresholds where it performed poorly. Members of our team compared results from conventional meta-analysis of published PHQ-9 diagnostic accuracy using aggregate study–level data to results using individual patient data meta-analysis (IPDMA).³⁴ Using IPDMA allowed us to incorporate accuracy results from both published and unpublished cutoff thresholds from the same set of studies. We found that due to selective cutoff reporting in primary studies, estimates of pooled sensitivity were distorted; when only published accuracy results were analyzed, sensitivity was underestimated for cutoff thresholds below the generally accepted “standard” cutoff of 10 and overestimated for cutoff thresholds greater than 10 compared with the complete IPDMA results.³⁴
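The mechanism can be shown with a small worked example. The Python sketch below uses two invented studies with invented PHQ-9 scores for participants who have depression according to the reference standard. It pools sensitivity naively (total true positives divided by total cases) rather than with the bivariate models used in the cited meta-analyses, and the names `studies`, `reported_cutoffs`, and `pooled_sensitivity` are hypothetical. Study A, where the tool performs poorly, reports only cutoffs 9 and 10; study B, where it performs well, reports only cutoffs 10 and 11.

```python
# Invented PHQ-9 total scores for participants with depression in two
# hypothetical studies, plus the cutoffs each study chose to publish.
studies = {
    "A": {"case_scores": [3, 5, 6, 8, 9, 10, 12, 13], "reported_cutoffs": {9, 10}},
    "B": {"case_scores": [10, 11, 12, 12, 14, 15, 17, 20], "reported_cutoffs": {10, 11}},
}


def pooled_sensitivity(studies, cutoff, published_only):
    """Naive pooled sensitivity: true positives over cases, summed across studies.

    When published_only is True, a study contributes at a cutoff only if it
    reported results for that cutoff.
    """
    true_positives = 0
    cases = 0
    for study in studies.values():
        if published_only and cutoff not in study["reported_cutoffs"]:
            continue
        true_positives += sum(score >= cutoff for score in study["case_scores"])
        cases += len(study["case_scores"])
    if cases == 0:
        raise ValueError(f"no study contributes data at cutoff {cutoff}")
    return true_positives / cases


cutoffs = (9, 10, 11)
complete = [pooled_sensitivity(studies, c, published_only=False) for c in cutoffs]
published = [pooled_sensitivity(studies, c, published_only=True) for c in cutoffs]

for c, full, pub in zip(cutoffs, complete, published):
    print(f"cutoff {c}: complete data {full:.4f}, published only {pub:.4f}")
```

With complete data, sensitivity can only stay the same or fall as the cutoff rises, because every participant who screens positive at 11 also screens positive at 9 and 10. With only the published cutoffs, the pooled estimate in this toy example rises from 9 to 11, is too low below the standard cutoff of 10, and is too high above it, mirroring the pattern described above.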

Concerns that selective cutoff reporting in primary studies may be biasing results in meta-analyses were also recently raised by the authors of a conventional meta-analysis of the PHQ-9.³⁵ They pointed out that no firm conclusions could be drawn about cutoff thresholds other than the standard threshold of 10 because authors of different studies often reported inconsistent cutoffs, and this appeared to be driven by selective reporting.³⁵ Another recent meta-analysis of brief versions of the Geriatric Depression Scale warned readers that results should be interpreted cautiously due to the likely influence of selective cutoff reporting on synthesized accuracy results.³⁶ More work is needed to understand the scope of this problem, and IPDMA is one approach to overcome it, although statistical approaches may also work. In the meantime, we recommend that authors of systematic reviews and meta-analyses of the accuracy of mental health–screening tools evaluate the possibility of selective cutoff reporting and describe their methods and results per the PRISMA-DTA items on conducting additional analyses (items 16 and 23).

In sum, reporting guidelines improve the transparency and completeness of published study reports. Authors who submit manuscripts to the Canadian Journal of Psychiatry are required to report their studies using the most appropriate EQUATOR reporting guideline. Authors who conduct systematic reviews and meta-analyses of the accuracy of mental health–screening tools should use the PRISMA-DTA reporting guideline. They should also address selective reporting practices that may lead to an exaggeration of test accuracy when only results from well-performing cutoffs are reported in primary studies, which appears to be a common problem in psychiatry DTA research.

Footnotes

Declaration of Conflicting Interests

The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: All authors have completed the Unified Competing Interest form (available on request from the corresponding author). Dr. Thombs declared that he was an author of the PRISMA-DTA reporting guideline and that he has no other competing interests. All other authors declared that they have no competing interests.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Drs. Thombs and Benedetti were supported by Fonds de recherche du Québec–Santé researcher salary awards. Ms. Rice was supported by a Canadian Institutes of Health Research (CIHR) Vanier Graduate Scholarship. Ms. Levis was supported by a CIHR Frederick Banting and Charles Best Canada Graduate Scholarship Doctoral Award.

References

1. Macleod, Michie, Roberts. Biomedical research: increasing value, reducing waste. Lancet. 2014;383(9912):101–104.

2. Chalmers, Bracken, Djulbegovic. How to increase value and reduce waste when research priorities are set. Lancet. 2014;383(9912):156–165.

3. Ioannidis JPA, Greenland, Hlatky. Increasing value and reducing waste in research design, conduct, and analysis. Lancet. 2014;383(9912):166–175.

4. Al-Shahi Salman, Beller, Kagan. Increasing value and reducing waste in biomedical research regulation and management. Lancet. 2014;383(9912):176–185.

5. Chan A-W, Song, Vickers. Increasing value and reducing waste: addressing inaccessible research. Lancet. 2014;383(9912):257–266.

6. Glasziou, Altman, Bossuyt. Reducing waste from incomplete or unusable reports of biomedical research. Lancet. 2014;383(9912):267–276.

7. Moher, Liberati, Tetzlaff. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLOS Med. 2009;6(7):e1000097.

8. Chan, Altman. Epidemiology and reporting of randomised trials published in PubMed journals. Lancet. 2005;365(9465):1159–1162.

9. Simera, Moher, Hoey. A catalogue of reporting guidelines for health research. Eur J Clin Invest. 2010;40(1):35–53.

10. Begg, Cho, Eastwood. Improving the quality of reporting of randomized controlled trials: the CONSORT statement. JAMA. 1996;276(8):637–639.

11. Altman, Simera, Hoey. EQUATOR: reporting guidelines for health research. Lancet. 2008;371(9619):1149–1150.

12. Moher, Liberati, Tetzlaff. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ. 2009;339:b2535.

13. Liberati, Altman, Tetzlaff. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ. 2009;339:b2700.

14. von Elm, Altman, Egger. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. PLOS Med. 2007;4(10):e296.

15. Vandenbroucke, von Elm, Altman. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): explanation and elaboration. PLOS Med. 2007;4(10):e297.

16. Bossuyt, Reitsma, Bruns. STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ. 2015;351:h5527.

17. Cohen, Korevaar, Altman. STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration. BMJ Open. 2016;6(11):e012799.

18. Campbell, Piaggio, Elbourne. CONSORT 2010 statement: extension to cluster randomised trials. BMJ. 2012;345:e5661.

19. Calvert, Blazeby, Altman. Reporting of patient-reported outcomes in randomized trials: the CONSORT PRO extension. JAMA. 2013;309(8):814–822.

20. McInnes MDF, Moher, Thombs. Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: the PRISMA-DTA statement. JAMA. 2018;319(4):388–396.

21. Prady, Richmond, Morton. A systematic evaluation of the impact of STRICTA and CONSORT recommendations on quality of reporting for acupuncture trials. PLOS One. 2008;3:e1577.

22. Smidt, Rutjes, van der Windt. The quality of diagnostic accuracy studies since the STARD statement: has it improved? Neurology. 2006;67:792–797.

23. Rice, Kloda, Shrier. Reporting completeness and transparency of meta-analyses of depression screening tool accuracy: a comparison of meta-analyses published before and after the PRISMA statement. J Psychosom Res. 2016;87:57–69.

24. Turner, Shamseer, Altman. Consolidated standards of reporting trials (CONSORT) and the completeness of reporting of randomised controlled trials (RCTs) published in medical journals. Cochrane Database Syst Rev. 2012;11:MR000030.

25. Plint, Moher, Morrison. Does the CONSORT checklist improve the quality of reports of randomized controlled trials: a systematic review. Med J Aust. 2006;185:263–267.

26. Hopewell, Ravaud, Baron. Effect of editors’ implementation of CONSORT guidelines on the reporting of abstracts in high impact medical journals: interrupted time series analysis. BMJ. 2012;344:e4178.

27. Thombs, Coyne, Cuijpers. Rethinking recommendations for screening for depression in primary care. CMAJ. 2012;184(4):413–418.

28. Thombs, Ziegelstein. Does depression screening improve outcomes in primary care? BMJ. 2014;348:g1253.

29. Thombs, Rice. Sample sizes and precision of estimates of sensitivity and specificity from primary studies on the diagnostic accuracy of depression screening tools: a survey of recently published studies. Int J Methods Psychiatr Res. 2016;25(2):145–152.

30. Rice, Thombs. Risk of bias from inclusion of currently diagnosed or treated patients in studies of depression screening tool accuracy: a cross-sectional analysis of recently published primary studies and meta-analyses. PLOS One. 2016;11(2):e0150067.

31. Thombs, Arthurs, El-Baalbaki. Risk of bias from inclusion of patients who already have diagnosis of or are undergoing treatment for depression in diagnostic accuracy studies of screening tools for depression: systematic review. BMJ. 2011;343:d4825.

32. Rice, Shrier, Kloda. Methodological quality of meta-analyses of the diagnostic accuracy of depression screening tools. J Psychosom Res. 2016;84:84–92.

33. Manea, Gilbody, McMillan. Optimal cut-off score for diagnosing depression with the Patient Health Questionnaire (PHQ-9): a meta-analysis. CMAJ. 2012;184(3):E191–E196.

34. Levis, Benedetti, Levis. Selective cutoff reporting in studies of diagnostic test accuracy: a comparison of conventional and individual-patient-data meta-analyses of the Patient Health Questionnaire-9 depression screening tool. Am J Epidemiol. 2017;185(10):954–964.

35. Moriarty, Gilbody, McMillan, Manea. Screening and case finding for major depressive disorder using the Patient Health Questionnaire (PHQ-9): a meta-analysis. Gen Hosp Psychiatry. 2015;37(6):567–576.

36. Pocklington, Gilbody, Manea. The diagnostic accuracy of brief versions of the Geriatric Depression Scale: a systematic review and meta-analysis. Int J Geriatr Psychiatry. 2016;31(8):837–857.