Abstract
Objective: Continuous outcome measures are essential in rehabilitation research. Incomplete reporting of their mean and standard deviation, required for meta-analysis, potentially introduces bias and imprecision if it prevents studies being included. We aimed to determine how often systematic reviewers encounter missing mean or standard deviation values and to recommend practical statistical solutions.
Design: 1. Cross-sectional survey of systematic review authors. 2. Reanalysis of Cochrane review data to evaluate how accurately statistical methods for recovering missing mean or standard deviation values estimate the true meta-analysis treatment effect.
Setting: Rehabilitation intervention systematic reviews.
Participants: Cochrane stroke rehabilitation review authors; stroke patients.
Interventions: Reanalysis of a Cochrane review of early supported discharge services.
Main measures: Hospital length of stay.
Results: Survey responses covered 53 of 70 Cochrane reviews. Almost all studied continuous outcome measures, 68% encountering missing summary statistics. Various solutions were attempted but 76% of meta-analyses omitted at least one study due to missing information. In the review reanalysis (N = 1055), a method based on the minimum and maximum performed best in recovering missing standard deviations; a method based on the median, lower and upper quartiles successfully estimated a missing mean.
Conclusion: Practical statistical methods help reduce risk of bias, maximise the evidence included in rehabilitation meta-analyses and offer a clear hierarchy of solutions to handling unreported mean and standard deviation values.
Introduction
Systematic reviews and meta-analyses of the effectiveness of stroke rehabilitation interventions form a valuable resource for stroke survivors, carers, researchers, health-care professionals, guideline developers, and policymakers. The Cochrane Library 1 contains over 200 stroke-related systematic reviews coordinated by the Cochrane Stroke Review Group. The influential role of systematic reviews as the primary route to readily accessible and rigorous summaries of stroke trial findings confers a responsibility on reviewers to implement valid and robust methodology. With the rapid and accelerating accumulation of new stroke trial data, systematic reviews must summarize the evidence with minimal bias and the greatest possible precision to inform stroke patients and health care professionals and to guide future stroke research. This depends critically on avoiding bias in identifying trials, selecting trials for meta-analysis, and extracting data; 2 and on including as much of the available data as possible.
Continuous outcome measures, such as the Stroke Impact Scale 3 and stroke-specific quality of life (SS-QOL), 4 are highly relevant to stroke survivors, while continuous resource use measures such as hospital length of stay are pivotal to evaluating the cost-effectiveness of a stroke intervention. Around one-third of stroke reviews in the Cochrane Database of Systematic Reviews 1 include a continuous primary outcome; three-quarters contain a continuous secondary outcome.
Although some continuous outcomes have a “bell-shaped” normal distribution, many do not; examples include hospital length of stay and measures of physical function and poststroke depression. 5 For such outcomes, analysis strategies and reporting vary 6 : the clinical trial publication often summarizes the outcome using the median and either the minimum and maximum values or the lower and upper quartiles. In contrast, standard meta-analysis requires information on the mean and either the standard deviation, variance or standard error 7 for each treatment group. These may not be reported for outcomes that are not normally distributed. Accessing original individual patient data or additional summary statistics not included in the original trial report is often difficult. 8
While the problem of incomplete reporting of trials 9 has lessened in recent years thanks in part to reporting guidance, such as the Consolidated Standards of Reporting Trials (CONSORT), 10 stroke rehabilitation systematic reviewers still need to deal with unreported standard deviation and mean values in order to summarize the available literature fully. One option is to exclude the trial from the meta-analysis. The alternative is to apply statistical methods to recover the unreported values, allowing the trial to be retained in the meta-analysis. We recently reviewed methods to handle missing standard deviation and missing mean values. 11 Here, we explore the extent of the issue in stroke rehabilitation systematic reviewing and illustrate potential solutions by reanalyzing individual patient data from a Cochrane stroke review. 12
We planned to establish, via a survey of the authors of Cochrane reviews, how often stroke rehabilitation systematic reviewers encounter missing mean or standard deviation values in their reviews and the methods they use to address this. Second, we aimed to illustrate the use of statistical methods for handling missing standard deviation and mean values in meta-analysis by reanalyzing individual patient data from a Cochrane stroke review. 12 We sought to identify methods that would be straightforward for systematic reviewers to apply, while still avoiding bias and giving the correct level of precision in the meta-analysis findings.
Methods
Survey Design
We performed an online survey of all authors who published a Cochrane review of a stroke rehabilitation intervention. A covering email (Figure S1) explained the background to the research and the questionnaire objectives. Completion of the questionnaire was voluntary, the results being held confidentially and used for research purposes only. The covering email contained a link to the online Google Forms questionnaire; for invitees based in China, where Google Forms was inaccessible at the time of the survey, the questionnaire was attached to the email. The questionnaire may be viewed at
Stroke rehabilitation review authors were invited to complete the questionnaire within 1 month; if no response was received a single reminder e-mail was sent. Completion of the questionnaire was considered an indicator of consent. The online questionnaire was piloted by 4 authors of Cochrane reviews before circulating it to the full list of potential participants.
Survey Sample Size and Data Analysis
At the time of the survey (November 2015-March 2016) there were 70 published Cochrane Stroke Reviews of rehabilitation interventions. For each rehabilitation review the lead author, the second author and the contact author were approached. Authors of more than one published review were asked to complete one questionnaire per review.
Questionnaire data were summarized using frequencies and percentages to indicate the extent of missing mean and standard deviation data in rehabilitation reviews and how reviewers dealt with this.
Statistical Methods for Handling Missing Standard Deviation or Mean
Missing Standard Deviation
In a methodology systematic review, 11 we identified 15 methods for handling missing standard deviation values. We selected for further evaluation 2 that are readily applicable and do not require complex statistical modeling. First, Walter and Yao 13 present an enhancement to the range method based on the minimum and maximum observed values of the outcome, providing a look-up table of conversion factors from range to standard deviation for various sample sizes. Second, Ma et al 14 present a study-level imputation method which uses the weighted average of the variances observed in other trials in the meta-analysis. In addition to these 2 methods, the Cochrane Handbook 7 notes that (for a normally distributed outcome measure) the interquartile range (IQR) is approximately equal to 1.35 standard deviations; we also evaluated this method as it is similar to another approach 15 identified in our review. Hereafter, we refer to these approaches as the enhanced range, weighted average, and IQR methods, respectively.
Missing Mean
Our systematic review of methods for handling missing mean values 11 identified 4 approaches. From these, we chose for detailed assessment 3 pragmatic methods 15-17 that estimate the mean algebraically based on other summary statistics that are likely to be reported. Ho et al 16 derive a missing mean value using the median, minimum, maximum, and sample size (MMM). Bland 17 takes account of the extended scenario where information on the lower and upper quartiles is also available (MQ1Q3MM); Wan et al 15 provide a method which applies where the median and lower and upper quartiles are available but the minimum and maximum are not (MQ1Q3).
The Appendix includes formulae and reference to an online calculator for the methods for replacing missing standard deviation or mean values.
Data Source
Data were from trials of early supported discharge services following acute stroke included in a Cochrane systematic review and individual patient data meta-analysis. 12 The data were from 8 trials (1055 participants). From this we generated example meta-analyses when the standard deviation or mean value was unavailable. We illustrated statistical methods for recovering the standard deviation when it was missing from both intervention arms for each of the 8 trials in the data set in turn. An equivalent approach demonstrated methods for handling missing mean values.
Evaluation of Statistical Methods
We evaluated each statistical method for estimating missing standard deviation or mean values compared to analysis of the complete data set using the following:
The bias of the overall meta-analysis result (the difference between the estimated and true values of the intervention effect).
The precision of the overall meta-analysis result (the ratio of the widths of the confidence intervals for the intervention effect [width when estimating missing standard deviation or mean: width when complete data set analyzed]); values greater than 1 indicate worse precision than under the analysis of the complete data set.
Method performance was judged overall by combining the bias and precision to calculate the mean squared error (MSE), which was then compared to the MSE from analysis of the complete data set.
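The evaluation criteria above can be sketched in a few lines of code. This is a minimal illustration only: it assumes that the standard error is recovered from a symmetric normal-based 95% confidence interval and that the mean squared error is the squared bias plus the variance of the pooled estimate. The class and function names are hypothetical and are not taken from the analysis code used in this study.

```python
from dataclasses import dataclass

Z_95 = 1.959964  # normal quantile for a two-sided 95% confidence interval


@dataclass
class MetaResult:
    """Pooled meta-analysis result (hypothetical container for this sketch)."""
    estimate: float  # pooled intervention effect, e.g. mean difference in days
    ci_low: float    # lower 95% confidence limit
    ci_high: float   # upper 95% confidence limit

    @property
    def width(self) -> float:
        return self.ci_high - self.ci_low

    @property
    def std_error(self) -> float:
        # Assumption: the interval is symmetric and normal-based
        return self.width / (2 * Z_95)


def evaluate(imputed: MetaResult, complete: MetaResult) -> dict:
    """Compare a result obtained after imputation with the complete-data result."""
    bias = imputed.estimate - complete.estimate       # complete-data result treated as truth
    precision_ratio = imputed.width / complete.width  # > 1 means worse precision
    mse = bias ** 2 + imputed.std_error ** 2          # assumed form: bias^2 + variance
    return {"bias": bias, "precision_ratio": precision_ratio, "mse": mse}
```

Applying `evaluate` to the result of each imputation method and to the result obtained when the trial is omitted gives values that can be compared directly across strategies.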
We also investigated the performance of the default option of omitting a trial with missing standard deviation or mean value(s) from the meta-analysis. Findings were illustrated graphically by presenting the overall meta-analysis results (intervention effect and 95% confidence interval) for each of the methods alongside the complete data set analysis result and the result where the trial with a missing standard deviation or mean value was omitted.
We analyzed hospital length of stay, an outcome with a skewed distribution for which the standard deviation and mean summary statistics may be more likely to be omitted from published trial reports. The intervention effect on hospital length of stay was estimated by the mean difference using random effects meta-analysis models fitted in the Cochrane RevMan software v5.3. 18
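For readers who wish to reproduce this type of analysis outside RevMan, a random effects meta-analysis of mean differences can be sketched as below. The sketch assumes inverse-variance weighting with the DerSimonian-Laird estimate of between-trial variance and a normal-based 95% confidence interval; the function name and input layout are hypothetical, and the code is not the software used in this study.

```python
import math
from typing import List, Tuple

# One trial: (mean, sd, n) for the intervention arm, then (mean, sd, n) for control
Trial = Tuple[float, float, int, float, float, int]


def random_effects_md(trials: List[Trial]) -> Tuple[float, float, float]:
    """Return (pooled mean difference, lower 95% limit, upper 95% limit)."""
    effects, variances = [], []
    for m1, sd1, n1, m2, sd2, n2 in trials:
        effects.append(m1 - m2)                          # mean difference per trial
        variances.append(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # its sampling variance

    # Fixed-effect (inverse-variance) step, needed for the heterogeneity statistic Q
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # DerSimonian-Laird between-trial variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(trials) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights and pooled estimate
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se
```

A missing standard deviation or mean for one trial can then be replaced by an estimated value before the trial is passed to `random_effects_md`, and the pooled result compared with that from the complete data.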
Results
Survey
Figure 1 summarizes the online questionnaire findings. The survey was sent to 100 authors, based in 11 countries, of 70 Cochrane reviews of a stroke rehabilitation intervention. In total, 63 responses (53 reviews, 76%) were received. Most respondents (58 of 63; 92%) knew the details of the analyses performed in their review; 56 reviewers had aimed to analyze continuous outcomes. Also, 89% (51 of 56) had intended to extract the mean, standard deviation, and sample size in the review; unreported standard deviations or means were encountered by 68% (38 of 56). Meta-analysis was often performed in situations where missing standard deviations or means had been encountered (34 of 38; 89%).
Of the reviewers who attempted meta-analysis, having encountered unreported standard deviation or mean values, the majority contacted source trial publication authors to request the missing items (29 of 34; 85%). While some received all of the information requested, including 3 who were sent the original trial data, many (50%) obtained less than half of the information they requested or did not receive a response. Many review authors (62%) also used statistical imputation to estimate the unreported values. Despite these various strategies, most authors (26 of 34; 76%) who encountered unreported standard deviation or mean values had to omit at least one trial from their meta-analysis.
Figure 1. Online survey.
Example Review Reanalysis
We tested various strategies for dealing with missing data using the Cochrane review of early supported discharge. 12
Missing Standard Deviation
Figure 2(a) summarizes the performance of the enhanced range, weighted average, and IQR methods for replacing missing standard deviation values in meta-analysis. Detailed assessment of the bias and precision of methods is presented in online supplementary Table S1. All 3 methods performed consistently better than the strategy of omitting the trial with missing standard deviation from the meta-analysis, which biased the treatment effect estimate (true value 9.36 days) by up to 2 days. While differences between statistical methods were small, the enhanced range approach showed the least bias. A similar pattern was found for imprecision; omitting the trial gave poorer precision and the enhanced range method nearly always performed best. Overall, the enhanced range gave a mean squared error (MSE) closest to that obtained from analysis of the complete data set in 7 of the 8 scenarios.

Missing Mean
Figure 2(b) and supplementary Table S2 provide the corresponding results when a missing mean value had to be replaced. The MQ1Q3MM and MQ1Q3 methods provided less biased results than the MMM or trial omission strategies. Overall, the MQ1Q3 approach had an MSE closest to that obtained from the analysis of the complete data set in most scenarios, and a substantially better MSE in 2 scenarios.

Recommendations
Figure 3 illustrates an overall stepwise strategy that enables stroke systematic reviewers performing meta-analysis to handle mean or standard deviation summaries that are missing from clinical trial reports.
Discussion
Our questionnaire findings demonstrate that missing summary statistics pose a problem for systematic reviewers in stroke rehabilitation studies, and that there is a lack of consensus over how best to address the issue. This paper has highlighted methods that reviewers may readily apply to estimate unreported mean or standard deviation values. In tests, using real data from the Cochrane review of early supported discharge interventions following stroke, the enhanced range method 13 proved particularly reliable in estimating a missing standard deviation and the MQ1Q3 method 15 performed strongly in estimating an unreported mean. As a supplement to the guidance on handling missing data already available in the Cochrane Handbook, 7 these methods will help review authors maximize the information included in meta-analyses.
Strengths and Limitations
Inviting multiple authors from each review to comment and asking detailed questions about the methods used ensured that we obtained comprehensive information on the approaches currently being used by systematic reviewers of stroke rehabilitation interventions to deal with missing summary statistics, at a level of detail not available in the text of review publications. The use of individual participant data from a real Cochrane review demonstrated that the analytical strategies are straightforward to apply and perform strongly in minimizing bias and optimizing the precision of the meta-analysis estimates.
Our survey of Cochrane review authors was restricted to stroke rehabilitation intervention reviews; further investigation is required to confirm whether the findings are also reflected in other systematic review topics. Our questionnaire was limited to the English language. Although some review authors did not have English as their first language, all had recently published academic articles in English, so we did not anticipate any language barriers.
While our early supported discharge review reanalysis showed consistent performance of methods in a review where a strong treatment effect is present on the length of stay outcome, the findings here relate to analysis of a single-exemplar data set. However, the early supported discharge services analysis contained a typical number of trials, of a representative size, for a stroke systematic review. A separately published analysis 11 in which meta-analysis data sets were generated using individual participant data from the general anesthesia versus local anesthesia (GALA) trial 19 showed similar findings, notably across meta-analyses ranging from 5 to 30 included trials and covering small (<25), large (>80) or mixed numbers of participants per trial.
Generalizability
Do these findings generalize to other stroke trial contexts and indeed to other therapeutic areas? We deliberately selected stroke rehabilitation as an area in which the continuous outcome measures used may well have a skewed distribution and therefore would be prone to unreported mean or standard deviation values. Nonetheless, work in the surgical context 19 supports the findings, and certain outcomes, such as length of hospital stay, will not be normally distributed whether the context is acute, subacute or rehabilitation. More generally, the practical information (Appendix) on implementing the statistical approaches outlined in this paper means that the methods can be readily applied in reviews in any therapeutic indication.
Conclusion
Incomplete reporting of trial findings is likely to reduce over time, thanks to initiatives to improve reporting, such as CONSORT. 20 In the meantime, our questionnaire findings will raise awareness among journal editors and systematic reviewers of the important issue of unreported mean and standard deviation values, while the methods highlighted in this paper will help reviewers to satisfy the ongoing ethical obligation to make effective use of data from trials that have already been completed.
Clinical Messages
Rehabilitation outcome measures which are important to patients and researchers often have a skewed (non-normal) distribution. As a consequence, authors of systematic reviews of rehabilitation interventions are likely to encounter missing mean or standard deviation values in clinical trial reports.
Practical statistical methods are available to help reviewers replace missing summary statistics, allowing them to include more of the eligible trials in their reviews and, hence, reducing the risk of bias in the results of meta-analysis.
Appendix: Resources for Recovering Missing Standard Deviation and Mean Values
Missing Standard Deviation: Enhanced Range
Walter and Yao 13 provide a lookup table based on the minimum, maximum and sample size.
Missing Standard Deviation: Weighted Average 14
Either use the smallest and largest standard errors reported for other studies in the meta-analysis to provide best- and worst-case scenarios; or use a weighted average of the observed standard errors in other studies in the meta-analysis. The weighted average formula is
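One common way to form such a weighted average is to pool the variances reported by the other trials, weighting each by its degrees of freedom. The sketch below assumes that weighting, which may differ from the weights defined by Ma et al 14; the original paper should be consulted for the exact formula, and the function name is hypothetical.

```python
import math
from typing import Sequence


def weighted_average_sd(sds: Sequence[float], ns: Sequence[int]) -> float:
    """Impute a missing SD from the SDs and sample sizes of the other trials.

    Assumption: variances are averaged with weights (n - 1), i.e. a pooled
    variance; this weighting is illustrative and not confirmed by the source.
    """
    if len(sds) != len(ns) or len(sds) == 0:
        raise ValueError("sds and ns must be non-empty and of equal length")
    if any(n < 2 for n in ns):
        raise ValueError("each sample size must be at least 2")
    numerator = sum((n - 1) * sd ** 2 for sd, n in zip(sds, ns))
    denominator = sum(n - 1 for n in ns)
    return math.sqrt(numerator / denominator)
```

The imputed value always lies between the smallest and largest of the observed standard deviations, which are the values used in the best- and worst-case scenarios.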
Missing Standard Deviation: IQR 7
If lower (Q1) and upper (Q3) quartiles are reported replace the missing standard deviation by: (Q3 − Q1)/1.35.
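As a brief illustration, the IQR conversion can be written as a one-line function. The divisor 1.35 reflects the width of the interquartile range of a normal distribution in standard deviation units, so the result is only approximate for skewed outcomes; the function name is hypothetical.

```python
def sd_from_iqr(q1: float, q3: float) -> float:
    """Approximate a missing SD from the lower (Q1) and upper (Q3) quartiles."""
    if q3 < q1:
        raise ValueError("Q3 must not be smaller than Q1")
    # For a normal distribution, IQR is approximately 1.35 standard deviations
    return (q3 - q1) / 1.35
```

For example, reported quartiles of 10 and 23.5 days give an estimated standard deviation of about 10 days.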
Missing Mean: MMM 16
If the median M, minimum A, maximum B and sample size n are known, the missing mean x is estimated by:
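A minimal sketch of this estimator follows. It assumes the commonly cited form x ≈ (A + 2M + B)/4 + (A − 2M + B)/(4n), which should be checked against reference 16 before use; the function name is hypothetical.

```python
def mean_mmm(median: float, minimum: float, maximum: float, n: int) -> float:
    """Estimate a missing mean from the median (M), minimum (A), maximum (B) and n.

    Assumed formula: (A + 2M + B)/4 + (A - 2M + B)/(4n).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    main_term = (minimum + 2 * median + maximum) / 4
    small_sample_term = (minimum - 2 * median + maximum) / (4 * n)
    return main_term + small_sample_term
```

The second term shrinks as the sample size grows, so for large trials the estimate is close to (A + 2M + B)/4.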
Missing Mean: MQ1Q3MM 17
If the median M, lower (Q1) and upper (Q3) quartiles, minimum A, maximum B and sample size n are reported, use:
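A minimal sketch of this estimator follows. It assumes the commonly cited form x ≈ [(n + 3)A + 2(n − 1)(Q1 + M + Q3) + (n + 3)B]/(8n), which should be checked against reference 17 before use; the function name is hypothetical.

```python
def mean_mq1q3mm(median: float, q1: float, q3: float,
                 minimum: float, maximum: float, n: int) -> float:
    """Estimate a missing mean from the median, quartiles, minimum, maximum and n.

    Assumed formula: [(n+3)A + 2(n-1)(Q1 + M + Q3) + (n+3)B] / (8n).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    extremes = (n + 3) * (minimum + maximum)
    quartiles = 2 * (n - 1) * (q1 + median + q3)
    return (extremes + quartiles) / (8 * n)
```

For a symmetric set of summary statistics the estimate equals the median, as would be expected.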
Missing Mean: MQ1Q3 15
Where the median M, lower (Q1) and upper (Q3) quartiles are known, use:
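A minimal sketch of this estimator follows. It assumes the commonly cited form x ≈ (Q1 + M + Q3)/3, which should be checked against reference 15 before use; the function name is hypothetical.

```python
def mean_mq1q3(median: float, q1: float, q3: float) -> float:
    """Estimate a missing mean from the median (M) and the quartiles Q1 and Q3.

    Assumed formula: (Q1 + M + Q3) / 3.
    """
    if not (q1 <= median <= q3):
        raise ValueError("expected Q1 <= median <= Q3")
    return (q1 + median + q3) / 3
```

For a right-skewed outcome such as length of stay, Q3 lies further from the median than Q1 does, so the estimated mean exceeds the median.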
Missing Mean: Online Resource
The online publication of Wan et al 15 links to an Excel spreadsheet which demonstrates the MQ1Q3 calculation to replace a missing mean, and does the same for simplified versions of the MMM and MQ1Q3MM methods.
Supplemental Material
Supplemental Material for Unreported Summary Statistics in Trial Publications and Risk of Bias in Stroke Rehabilitation Systematic Reviews: An International Survey of Review Authors and Examination of Practical Solutions by Christopher J. Weir, Valentina Assi, Lumine Na, Stephanie C. Lewis, Gordon D. Murray, Peter Langhorne, and Marian C. Brady in Journal of Stroke Medicine
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The Stroke Association (grant number 2012/05) and the Rosetrees Trust funded this project. Christopher Weir was also supported by National Health Service (NHS) Lothian via the Edinburgh Clinical Trials Unit. Marian Brady is funded by the Chief Scientist Office, Scottish Government’s Health and Social Care Directorate. None of the funders had any role in the study design, the collection, analysis and interpretation of the data, the writing of the report, or the decision to submit the article for publication. The views are those of the authors and not necessarily those of the funders.
References
