
Abstract

Background

In meta-analysis, researchers often pool the results from a set of similar studies. Many studies, however, report only the minimum and maximum values, the median, and/or the first and third quartiles. Recently, many methods have been discussed for estimating the mean and standard deviation from those sample summaries. However, these methods may provide a substantially biased estimate of the inverse variance that is needed for the meta-analysis.

Research Design

We use Basu’s theorem to derive unbiased estimators for σ⁻² from the most commonly used sample summaries from the normal distribution. While there are no closed formulas for these estimators, we use simulations to obtain simple approximations for the estimators.

Results

The proposed approximate estimators show little to no bias for normally distributed data and generally show smaller bias than the usual methods even for some non-normal distributions. The proposed estimators also have lower mean squared error.

Conclusions

The proposed estimators are recommended for the purpose of obtaining inverse-variance weights, particularly in the context of meta-analyses.

Keywords

Meta-analysis, five-figure summary, range, interquartile range, variance

Introduction

When one wishes to combine data from multiple independent studies, the sample mean and sample standard deviation are the essential quantities for the meta-analysis.¹ Some studies do not report either the mean or the standard deviation and report only the five-figure summary (including the sample median, the first and third quartiles, and the minimum and maximum values). Moreover, even the five-figure summary may not be fully reported; see, for example, Thatcher et al.,² where only the median and range were reported, or Monaco et al.,³ where only the median and the quartiles were reported. Rather than discarding the studies that do not report the sample mean and standard deviation directly, one can try to estimate these quantities from the reported summaries. While the work on this problem can be traced back all the way to Tippett,⁴ the interest in the systematic research of the estimators resurfaced in Hozo et al.⁵ and a significant body of literature soon followed;⁶⁻¹¹ see also Weir et al.¹² and Walter et al.,¹³ where various methods have been discussed and compared.

Typically, three scenarios are investigated depending on the summaries being reported in addition to a sample size:

1. {a, m, b; n},

2. {q₁, m, q₃; n}, or

3. {a, q₁, m, q₃, b; n},

where a is the minimum, q₁ is the first quartile, m is the median, q₃ is the third quartile, b is the maximum and n is the sample size.

Walter et al.¹³ showed that a naïve method for estimating σ⁻² in Scenario 1 by estimating σ first and then raising that estimate to power −2 can result in substantial bias, particularly for small samples. This is true even for the state-of-the-art methods presented in Shi et al.¹⁴ and Balakrishnan et al.;¹⁵ see Figure 1. Therefore, Walter et al.¹³ used a Taylor series approximation to provide a better estimate for σ⁻² when only the sample range is available. They showed that the approximation by the polynomial of fourth degree yields good results even for small sample sizes.
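The direction of this bias can be seen from Jensen's inequality. As a sketch, assume that \hat{σ} is any positive, non-degenerate estimator with E[\hat{σ}] = σ; this is an idealization introduced here for illustration, and the estimators cited above need not satisfy it exactly.

```latex
E\left[ \hat{σ}^{-2} \right] > \left( E[\hat{σ}] \right)^{-2} = σ^{-2},
\qquad \text{since } x \mapsto x^{-2} \text{ is strictly convex on } (0, \infty).
```

Under that assumption the plug-in estimate overestimates σ⁻² on average, and a second-order expansion suggests that the excess grows with the variability of \hat{σ}, which is largest for small samples.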

Figure 1.

Bias and mean square error (MSE) of estimating σ⁻² by first estimating σ using the best methods for the appropriate scenario from Shi et al.¹⁴ and Balakrishnan et al.¹⁵ and then raising the estimate to power −2. Results from 10⁵ simulations for data from N(0, 1) are shown; however, the bias and MSE are the same regardless of the mean and the variance of the normal distribution. (a) Bias. (b) MSE.

In this paper, we significantly simplify the method developed in Walter et al.¹³ and propose unbiased estimators of σ⁻² from normally distributed data. The estimators can incorporate the knowledge of the interquartile range as well. Unfortunately, there are no closed formulas for those estimators, and so we searched for and obtained simple-to-use approximations that still show little to no bias.

Methods

Scenario 1, {a, m, b; n}

Let us write

\frac{1}{R_{n}^{2}} = \left( \frac{S_{n}^{2}}{R_{n}^{2}} \right) \left( \frac{1}{S_{n}^{2}} \right),

(1)

where R_n = b − a is the sample range and S_n is the sample standard deviation. From Basu’s theorem,¹⁶ we know that $S_{n}^{2} / R_{n}^{2}$ and $1 / S_{n}^{2}$ are independent random variables since $S_{n}^{2} / R_{n}^{2}$ is free of σ² (and so is ancillary) and $S_{n}^{2}$ is a complete sufficient statistic for σ². Therefore, we have

E \left[ \frac{1}{R_{n}^{2}} \right] = E \left[ \frac{S_{n}^{2}}{R_{n}^{2}} \right] \cdot E \left[ \frac{1}{S_{n}^{2}} \right]

(2)

due to independence. Note that $(n - 1) S_{n}^{2} / σ^{2} \sim χ_{n - 1}^{2}$. Here, we have assumed the sample variance to have n − 1 as its denominator. If one were to use the sample variance with denominator n, then the only change would be to use $n S_{n}^{2} / σ^{2}$, which has a chi-square distribution with n − 1 degrees of freedom, in the formula derived subsequently. Thus,

E \left[ \frac{1}{n - 1} \frac{σ^{2}}{S_{n}^{2}} \right] = \frac{2^{\frac{n - 1}{2} - 1} Γ \left( \frac{n - 1}{2} - 1 \right)}{2^{\frac{n - 1}{2}} Γ \left( \frac{n - 1}{2} \right)} = \frac{1}{2 \left( \frac{n - 1}{2} - 1 \right)} = \frac{1}{n - 3},

(3)

and so

E \left[ \frac{1}{S_{n}^{2}} \right] = \left( \frac{n - 1}{n - 3} \right) \frac{1}{σ^{2}} .

(4)

Let us denote $E [S_{n}^{2} / R_{n}^{2}]$ by α_n. Then,

E \left[ \frac{1}{R_{n}^{2}} \right] = E \left[ \frac{S_{n}^{2}}{R_{n}^{2}} \right] \cdot E \left[ \frac{1}{S_{n}^{2}} \right] = α_{n} \cdot \left( \frac{n - 1}{n - 3} \right) \frac{1}{σ^{2}} .

(5)

Thus, when we set

β_{n} = \frac{n - 3}{(n - 1) α_{n}},

(6)

we get $E [β_{n} / R_{n}^{2}] = 1 / σ^{2}$, i.e., $β_{n} / R_{n}^{2}$ is an unbiased estimator of 1/σ².

We note that there is no closed-form formula for β_n. However, as $S_{n}^{2} / R_{n}^{2}$ is free of μ and σ, we can obtain the value of α_n and consequently the value of β_n, by simulations over a large number of runs as described below in Section Numerical simulations. Furthermore, we can then approximate the simulated values of α_n and β_n by simple functions and thus obtain a simple-to-use estimator of 1/σ².
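The simulation idea for Scenario 1 can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `simulate_alpha_beta`, the seeds, the number of runs, and the values μ = 50 and σ = 17 used for the check are assumptions made here. The block estimates α_n and β_n from N(0, 1) samples and then checks, on fresh data with a different mean and variance, that β_n/R_n² averages to about 1/σ², that 1/S_n² averages to about (n − 1)/((n − 3)σ²), and that S_n²/R_n² and 1/S_n² are essentially uncorrelated.

```python
import numpy as np

def simulate_alpha_beta(n, runs=100_000, seed=0):
    """Monte Carlo estimates of alpha_n = E[S_n^2 / R_n^2] and beta_n from N(0, 1) samples."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((runs, n))
    s2 = x.var(axis=1, ddof=1)                  # sample variance with denominator n - 1
    r2 = (x.max(axis=1) - x.min(axis=1)) ** 2   # squared sample range
    alpha = float(np.mean(s2 / r2))
    beta = (n - 3) / ((n - 1) * alpha)          # definition (6)
    return alpha, beta

n = 9
alpha_9, beta_9 = simulate_alpha_beta(n)

# Check on fresh data with an arbitrary mean and SD (illustrative values).
mu, sigma = 50.0, 17.0
rng = np.random.default_rng(1)
y = rng.normal(mu, sigma, size=(100_000, n))
s2 = y.var(axis=1, ddof=1)
r2 = (y.max(axis=1) - y.min(axis=1)) ** 2

ratio_range = float(np.mean(beta_9 / r2) * sigma**2)   # close to 1 if unbiased
ratio_s2 = float(np.mean(1.0 / s2) * sigma**2)         # close to (n - 1) / (n - 3)
corr = float(np.corrcoef(s2 / r2, 1.0 / s2)[0, 1])     # close to 0 under independence

print(alpha_9, beta_9, ratio_range, ratio_s2, corr)
```

A near-zero correlation is only a necessary consequence of the independence given by Basu's theorem, not a proof of it; the check is meant as a sanity test of the derivation.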

Scenario 2, {q₁, m, q₃; n}

As in Scenario 1, let I_n = q₃ − q₁ be the interquartile range and let $γ_{n} = E [S_{n}^{2} / I_{n}^{2}]$. Let δ_n = (n − 3)/((n − 1)γ_n). Then, $δ_{n} / I_{n}^{2}$ is an unbiased estimator of 1/σ². As above, we can use simulations to get values of γ_n and δ_n and obtain approximations for a simple-to-use estimator.
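For completeness, the step behind this claim mirrors the Scenario 1 argument. The sketch below assumes, as for the range, that S_n²/I_n² is free of μ and σ and is therefore independent of S_n² by Basu's theorem.

```latex
E\left[ \frac{1}{I_{n}^{2}} \right]
  = E\left[ \frac{S_{n}^{2}}{I_{n}^{2}} \right] \cdot E\left[ \frac{1}{S_{n}^{2}} \right]
  = γ_{n} \cdot \left( \frac{n - 1}{n - 3} \right) \frac{1}{σ^{2}}
\quad \Longrightarrow \quad
E\left[ \frac{δ_{n}}{I_{n}^{2}} \right]
  = \frac{n - 3}{(n - 1) γ_{n}} \cdot γ_{n} \cdot \left( \frac{n - 1}{n - 3} \right) \frac{1}{σ^{2}}
  = \frac{1}{σ^{2}} .
```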

Scenario 3 {a, q₁, m, q₃, b; n}

As in Shi et al.,¹⁴ we will get an unbiased estimator of σ⁻² if we take any weighted combination, with weights summing to 1, of the two estimators presented above for Scenarios 1 and 2. We need to find the weights that yield the smallest variance.

For any w ∈ [0, 1], consider the estimator $w β_{n} / R_{n}^{2} + (1 - w) δ_{n} / I_{n}^{2}$. Let $R_{n}^{*} = R_{n} / σ$ and $I_{n}^{*} = I_{n} / σ$ be the range and interquartile range, respectively, for the standard normal distribution N(0, 1). Note that $E [β_{n} / R_{n}^{* 2}] = 1$ and $E [δ_{n} / I_{n}^{* 2}] = 1$. Thus,

Var (w \frac{β_{n}}{R_{n}^{2}} + (1 - w) \frac{δ_{n}}{I_{n}^{2}}) = \frac{1}{σ^{4}} (w^{2} V_{1, n} + {(1 - w)}^{2} V_{2, n} + 2 w (1 - w) V_{3, n}),

(7)

where

V_{1, n} = Var \left( \frac{β_{n}}{R_{n}^{* 2}} \right) = E \left[ {\left( \frac{β_{n}}{R_{n}^{* 2}} - 1 \right)}^{2} \right],

(8)

V_{2, n} = Var \left( \frac{δ_{n}}{I_{n}^{* 2}} \right) = E \left[ {\left( \frac{δ_{n}}{I_{n}^{* 2}} - 1 \right)}^{2} \right],

(9)

V_{3, n} = Cov \left( \frac{β_{n}}{R_{n}^{* 2}}, \frac{δ_{n}}{I_{n}^{* 2}} \right) = E \left[ \left( \frac{β_{n}}{R_{n}^{* 2}} - 1 \right) \left( \frac{δ_{n}}{I_{n}^{* 2}} - 1 \right) \right] .

(10)

Minimizing (7) with respect to w ∈ (−∞, ∞), we get the equation

2 w V_{1, n} - 2 (1 - w) V_{2, n} + 2 (1 - 2 w) V_{3, n} = 0 .

(11)

Thus, the optimal weight is given by

w_{opt, n} = \frac{V_{2, n} - V_{3, n}}{V_{1, n} + V_{2, n} - 2 V_{3, n}} .

(12)

As before, while we could not obtain closed formulas for $V_{1, n}$, $V_{2, n}$ and $V_{3, n}$, we can obtain their approximate values from the simulations. This in turn gives an approximate value of the optimal weight $w_{opt, n}$ in (12). Moreover, the simulations yield $w_{opt, n}$ ∈ (0, 1); if we had $w_{opt, n}$ < 0, then we would have to use 0 as the optimal weight and, similarly, if $w_{opt, n}$ > 1, we would have to use 1.
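The weight formula (12), together with the clipping rule just described, can be written as a small helper. This is a sketch: the function names and the numerical values of the variances and covariance are hypothetical and chosen only to exercise the formula. The closed-form weight is compared against a brute-force grid search over [0, 1].

```python
import numpy as np

def combined_variance(w, v1, v2, v3):
    """Bracketed term of (7): w^2 V1 + (1 - w)^2 V2 + 2 w (1 - w) V3."""
    return w**2 * v1 + (1 - w) ** 2 * v2 + 2 * w * (1 - w) * v3

def optimal_weight(v1, v2, v3):
    """Weight from (12), clipped to [0, 1] as described in the text."""
    denom = v1 + v2 - 2.0 * v3          # variance of the difference of the two estimators
    if denom <= 0:
        raise ValueError("V1 + V2 - 2*V3 must be positive")
    w = (v2 - v3) / denom
    return float(min(1.0, max(0.0, w)))

# Hypothetical values, for illustration only.
v1, v2, v3 = 0.15, 0.36, 0.05
w_opt = optimal_weight(v1, v2, v3)

# Brute-force check: minimize the variance over a fine grid of weights.
grid = np.linspace(0.0, 1.0, 1001)
w_grid = float(grid[np.argmin(combined_variance(grid, v1, v2, v3))])

print(w_opt, w_grid)
```

Because the objective is a convex quadratic in w whenever the denominator is positive, clipping the unconstrained minimizer to [0, 1] gives the constrained minimizer.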

Numerical simulations

For every n ∈ {9, 13, 17, …, 97} = {4k + 1; k = 2, 3, …, 24}, we generated 10⁵ random samples of size n from N(0, 1). We restricted ourselves to n < 100 since, as seen in Figure 1, the indirect approximation has the highest bias and MSE for n < 50.

For each sample, we calculated the sample standard deviation S_n, sample range $R_{n}^{*}$, and interquartile range $I_{n}^{*}$. We then obtained values of α_n and γ_n as the average value of $S_{n}^{2} / R_{n}^{* 2}$ and $S_{n}^{2} / I_{n}^{* 2}$, respectively, with which we obtained β_n = (n − 3)/((n − 1)α_n) and δ_n = (n − 3)/((n − 1)γ_n). Similarly, we obtained values of $V_{1, n}$, $V_{2, n}$ and $V_{3, n}$ from formulas (8)–(10) and used those values to obtain a simulated value of $w_{opt, n}$ by (12).

We calculated bias of the estimates as the average of $β_{n} / R_{n}^{2} - 1 / σ^{2}$ and $δ_{n} / I_{n}^{2} - 1 / σ^{2}$ , respectively. The mean square error was calculated as the average of ${(estimate - σ^{- 2})}^{2}$ .
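The procedure described in this section can be sketched for a single sample size as follows. The function name `simulate_constants`, the seed, and in particular the quartile convention are assumptions made here: for n = 4k + 1 the quartiles are taken to be the order statistics X₍k+1₎ and X₍3k+1₎, which may differ from the convention used for the published table.

```python
import numpy as np

def simulate_constants(n, runs=100_000, seed=0):
    """Simulate alpha, beta, gamma, delta, V1, V2, V3 and w_opt for one n = 4k + 1."""
    if n % 4 != 1:
        raise ValueError("this sketch assumes n = 4k + 1")
    k = (n - 1) // 4
    rng = np.random.default_rng(seed)
    x = np.sort(rng.standard_normal((runs, n)), axis=1)   # each row sorted ascending

    s2 = x.var(axis=1, ddof=1)
    r2 = (x[:, -1] - x[:, 0]) ** 2          # squared range R*^2
    # Assumption: q1 = X_(k+1), q3 = X_(3k+1) (0-based columns k and 3k).
    i2 = (x[:, 3 * k] - x[:, k]) ** 2       # squared interquartile range I*^2

    alpha = float(np.mean(s2 / r2))
    gamma = float(np.mean(s2 / i2))
    beta = (n - 3) / ((n - 1) * alpha)
    delta = (n - 3) / ((n - 1) * gamma)

    d1 = beta / r2 - 1.0                    # deviations from the target 1/sigma^2 = 1
    d2 = delta / i2 - 1.0
    v1 = float(np.mean(d1**2))              # (8)
    v2 = float(np.mean(d2**2))              # (9)
    v3 = float(np.mean(d1 * d2))            # (10)
    w_opt = (v2 - v3) / (v1 + v2 - 2.0 * v3)   # (12)

    return {"alpha": alpha, "beta": beta, "gamma": gamma, "delta": delta,
            "v1": v1, "v2": v2, "v3": v3, "w_opt": w_opt}

c = simulate_constants(25)
print(c)
```

In this sketch the same simulated samples are used both to estimate the constants and to evaluate (8)–(10); using an independent set of samples for the second step would be an equally reasonable design.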

Finally, as in Wan et al.,⁸ we study the performance of the proposed estimators relative to the naïve methods using the state-of-the-art estimators of σ from Shi et al.¹⁴ for various distributions, specifically (a) normal distribution with μ = 50 and σ = 17, (b) log-normal distribution with μ = 4 and σ = 0.3, (c) beta distribution with α = 9 and β = 4, (d) exponential distribution with λ = 10, and (e) Weibull distribution with a = 2 and b = 35.

Results

Table 1 shows values of the constants α_n, β_n, γ_n, δ_n, w_n obtained by numerical simulations. We note that the coefficient β_n agrees with the Taylor series approximation method (with polynomial of order 4) developed in Walter et al.¹³

Table 1.

The results of numerical simulations (rounded to four decimal places): n is the sample size; α_n is the simulated average of $S_{n}^{2} / R_{n}^{2}$; β_n = (n − 3)/((n − 1)α_n); γ_n is the simulated average of $S_{n}^{2} / I_{n}^{2}$; δ_n = (n − 3)/((n − 1)γ_n); w_n is the optimal weight of estimate E₁ from (20) in the case of Scenario 3. For every scenario, the Bias and MSE columns show the bias and mean square error of (independent) 10⁵ simulations of estimating σ⁻² by (18), (19), and (20), respectively.

	Scenario 1				Scenario 2				Scenario 3
n	α_n	β_n	Bias	MSE	γ_n	δ_n	Bias	MSE	w_n	Bias	MSE
9	0.1095	6.8492	0.0022	0.5400	1.0580	0.7089	−0.0380	5.4088	0.9681	0.0044	0.5769
13	0.0890	9.3671	0.0014	0.3029	0.8238	1.0116	0.0230	2.3341	0.9395	0.0031	0.2964
17	0.0777	11.2607	−0.0004	0.2154	0.7412	1.1804	0.0223	0.8175	0.8561	0.0031	0.2002
21	0.0706	12.7412	0.0005	0.1738	0.6999	1.2859	0.0178	0.5075	0.7990	0.0041	0.1536
25	0.0655	14.0039	−0.0001	0.1486	0.6722	1.3636	0.0128	0.3603	0.7553	0.0031	0.1257
29	0.0617	15.0541	0.0002	0.1294	0.6512	1.4260	0.0053	0.2817	0.7283	0.0016	0.1056
33	0.0586	15.9885	0.0000	0.1172	0.6366	1.4727	0.0013	0.2316	0.7025	0.0004	0.0925
37	0.0561	16.8221	0.0001	0.1078	0.6271	1.5060	0.0016	0.1979	0.6812	0.0006	0.0828
41	0.0540	17.5840	−0.0016	0.0994	0.6206	1.5307	0.0025	0.1742	0.6640	−0.0001	0.0743
45	0.0523	18.2388	−0.0016	0.0924	0.6133	1.5565	0.0002	0.1553	0.6537	−0.0010	0.0681
49	0.0507	18.8846	−0.0028	0.0877	0.6082	1.5758	−0.0005	0.1364	0.6293	−0.0019	0.0624
53	0.0495	19.4271	−0.0005	0.0827	0.6027	1.5955	−0.0008	0.1250	0.6224	−0.0006	0.0579
57	0.0483	19.9654	0.0003	0.0801	0.5990	1.6098	0.0001	0.1128	0.6006	0.0002	0.0542
61	0.0472	20.4650	−0.0009	0.0761	0.5958	1.6226	0.0001	0.1055	0.5953	−0.0005	0.0510
65	0.0463	20.9308	0.0004	0.0732	0.5925	1.6350	0.0011	0.0975	0.5841	0.0007	0.0482
69	0.0454	21.3664	−0.0002	0.0703	0.5906	1.6435	0.0018	0.0925	0.5785	0.0006	0.0457
73	0.0446	21.7866	−0.0001	0.0679	0.5873	1.6554	0.0012	0.0866	0.5695	0.0004	0.0434
77	0.0439	22.1900	−0.0002	0.0662	0.5861	1.6613	0.0034	0.0827	0.5625	0.0014	0.0419
81	0.0433	22.5399	0.0006	0.0648	0.5833	1.6714	0.0016	0.0761	0.5457	0.0010	0.0398
85	0.0426	22.9036	0.0009	0.0630	0.5819	1.6777	0.0031	0.0725	0.5389	0.0019	0.0380
89	0.0421	23.2327	0.0006	0.0613	0.5803	1.6841	0.0028	0.0688	0.5320	0.0016	0.0366
93	0.0415	23.5493	0.0022	0.0602	0.5786	1.6906	0.0044	0.0662	0.5260	0.0032	0.0356
97	0.0411	23.8434	0.0022	0.0587	0.5777	1.6949	0.0043	0.0628	0.5180	0.0032	0.0339

By trial and error, we discovered that $α_{n}^{- 0.75}$, $w_{opt, n}^{- 0.75}$ and 1/(δ_n − 2.25) are almost linear in ln n; see Figure 2. Applying linear regression then yields the following approximations.

α_{n} \approx {(2.4 \ln (n) - 0.02)}^{- 4 / 3},

(13)

β_{n} \approx \frac{n - 3}{n - 1} {(2.4 \ln (n) - 0.02)}^{4 / 3},

(14)

γ_{n} \approx \frac{1}{3.25 \ln (n) - 5.45} + 0.47,

(15)

δ_{n} \approx 2.25 - \frac{2}{\ln (n) - 0.92},

(16)

w_{opt, n} \approx {(0.2727 \ln (n) + 0.3617)}^{- 4 / 3} .

(17)
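The regression behind (13) can be reproduced approximately from the rounded α_n values in Table 1. The sketch below is not the authors' fitting code: it uses `numpy.polyfit` for an ordinary least-squares line and works from four-decimal table values, so the fitted slope and intercept should only be expected to land near the published 2.4 and −0.02.

```python
import numpy as np

# Sample sizes and simulated alpha_n values copied from Table 1.
n = np.arange(9, 98, 4)
alpha = np.array([
    0.1095, 0.0890, 0.0777, 0.0706, 0.0655, 0.0617, 0.0586, 0.0561,
    0.0540, 0.0523, 0.0507, 0.0495, 0.0483, 0.0472, 0.0463, 0.0454,
    0.0446, 0.0439, 0.0433, 0.0426, 0.0421, 0.0415, 0.0411,
])

# Fit alpha_n^(-0.75) as a linear function of ln n.
slope, intercept = np.polyfit(np.log(n), alpha ** -0.75, deg=1)

# Back-transform the fitted line and measure the worst relative error.
alpha_fit = (slope * np.log(n) + intercept) ** (-4.0 / 3.0)
max_rel_err = float(np.max(np.abs(alpha_fit / alpha - 1.0)))

print(slope, intercept, max_rel_err)
```

The same pattern would apply to the other transformed constants, with the transformation swapped accordingly.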

Figure 2.

Top row: approximating (a) $α_{n}^{- 0.75}$, (b) $- {(δ_{n} - 2.25)}^{- 1}$, and (c) $w_{opt, n}^{- 0.75}$, by functions linear in ln n. Bottom row: relative errors of the approximations in (d) equation (13), (e) equation (16), and (f) equation (17).

This yields the following approximately unbiased estimators of 1/σ²:

E_{1} = \frac{(n - 3) {(2.4 \ln (n) - 0.02)}^{4 / 3}}{(n - 1) {(b - a)}^{2}},

(18)

E_{2} = \frac{2.25 - \frac{2}{\ln (n) - 0.92}}{{(q_{3} - q_{1})}^{2}},

(19)

E_{3} = w_{opt, n} E_{1} + (1 - w_{opt, n}) E_{2} .

(20)

The estimator E_i is used in Scenario i. Figure 3 illustrates that the estimators are approximately unbiased and shows their mean square errors.
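A minimal sketch of estimators (18)–(20) as a single helper is given below. The function names, the guard on n, the clipping of the weight to [0, 1], and the example summaries are assumptions made for this illustration. The clipping follows the remark in the Methods section about weights outside [0, 1]; it matters here because, for the smallest sample size n = 9, approximation (17) evaluates to slightly more than 1.

```python
import math

def beta_approx(n):
    """Approximation (14) to beta_n."""
    return (n - 3) / (n - 1) * (2.4 * math.log(n) - 0.02) ** (4 / 3)

def delta_approx(n):
    """Approximation (16) to delta_n."""
    return 2.25 - 2.0 / (math.log(n) - 0.92)

def weight_approx(n):
    """Approximation (17) to w_opt,n, clipped to [0, 1] (a choice of this sketch)."""
    w = (0.2727 * math.log(n) + 0.3617) ** (-4 / 3)
    return min(1.0, max(0.0, w))

def inverse_variance(n, a=None, b=None, q1=None, q3=None):
    """Estimate 1/sigma^2 from the range (a, b), the quartiles (q1, q3), or both."""
    if n < 9:
        # Guard added here: the approximations were fitted on 9 <= n <= 97.
        raise ValueError("approximations were fitted for n >= 9")
    has_range = a is not None and b is not None
    has_iqr = q1 is not None and q3 is not None
    if not (has_range or has_iqr):
        raise ValueError("need (a, b) and/or (q1, q3)")
    e1 = beta_approx(n) / (b - a) ** 2 if has_range else None       # (18)
    e2 = delta_approx(n) / (q3 - q1) ** 2 if has_iqr else None      # (19)
    if has_range and has_iqr:
        w = weight_approx(n)
        return w * e1 + (1.0 - w) * e2                              # (20)
    return e1 if has_range else e2

# Hypothetical five-figure summary, for illustration only.
n, a, q1, q3, b = 25, 20.0, 40.0, 62.0, 85.0
e1 = inverse_variance(n, a=a, b=b)
e2 = inverse_variance(n, q1=q1, q3=q3)
e3 = inverse_variance(n, a=a, b=b, q1=q1, q3=q3)
print(e1, e2, e3)
```

The median m does not enter any of the three estimators, so it is not a parameter of the helper.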

Figure 3.

Bias (a) and MSE (b) of estimating σ⁻² under various scenarios by estimators (18)–(20).

The performance of the proposed estimators for non-normal distributions is illustrated in Figures 4 and 5. We see that, in terms of MSE and generally also in terms of bias, the proposed estimators outperform the naïve method of estimating the standard deviation first and then raising it to power −2. The notable exception, where the bias is slightly smaller for the naïve method, is Scenario 2 with n > 20 for the beta distribution. The proposed methods also slightly underperform, in terms of bias, in Scenario 1 for n > 80 with the log-normal or Weibull distributions.
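A robustness check of this kind can be sketched for the range-based estimator (18). Several choices below are assumptions rather than details confirmed by the text: the parameterizations (log-normal parameters on the log scale, λ as the exponential rate, Weibull a as shape and b as scale), the sample size n = 25, the number of runs, and the use of a scale-free relative bias, E[E₁]·σ² − 1, instead of the raw bias shown in the figures.

```python
import math
import numpy as np

def e1(n, a, b):
    """Range-based estimator (18) of 1/sigma^2; a and b may be NumPy arrays."""
    return (n - 3) * (2.4 * math.log(n) - 0.02) ** (4 / 3) / ((n - 1) * (b - a) ** 2)

def relative_bias(sampler, true_var, n, runs=50_000, seed=0):
    """Monte Carlo estimate of E[E1] * sigma^2 - 1 for samples drawn by `sampler`."""
    rng = np.random.default_rng(seed)
    x = sampler(rng, (runs, n))
    est = e1(n, x.min(axis=1), x.max(axis=1))
    return float(np.mean(est) * true_var - 1.0)

# Each entry: (sampler, true variance). Parameterizations are assumptions, see above.
cases = {
    "normal": (lambda rng, size: rng.normal(50.0, 17.0, size),
               17.0**2),
    "log-normal": (lambda rng, size: rng.lognormal(4.0, 0.3, size),
                   (math.exp(0.3**2) - 1.0) * math.exp(2 * 4.0 + 0.3**2)),
    "beta": (lambda rng, size: rng.beta(9.0, 4.0, size),
             9.0 * 4.0 / ((9.0 + 4.0) ** 2 * (9.0 + 4.0 + 1.0))),
    "exponential": (lambda rng, size: rng.exponential(1.0 / 10.0, size),
                    1.0 / 10.0**2),
    "weibull": (lambda rng, size: 35.0 * rng.weibull(2.0, size),
                35.0**2 * (math.gamma(1 + 2 / 2.0) - math.gamma(1 + 1 / 2.0) ** 2)),
}

n = 25
results = {name: relative_bias(sampler, var, n) for name, (sampler, var) in cases.items()}
for name, rb in results.items():
    print(f"{name:12s} relative bias of E1: {rb:+.3f}")
```

Only the normal case is expected to give a relative bias near zero; the other rows indicate how far the normal-theory constant is off for each distribution at this sample size.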

Figure 4.

Bias of estimating σ⁻² by the newly proposed estimators (full markers) against the bias of the naïve method using the state of the art estimators from Shi et al.¹⁴ (empty markers) for Normal distribution (first row), Log-normal distribution (second row), Beta distribution (third row), Exponential distribution (fourth row) and Weibull distribution (fifth row) and scenario 1 (left column), scenario 2 (middle column) and scenario 3 (right column).

Figure 5.

MSE (on a logarithmic scale) of estimating σ⁻² by the newly proposed estimators (full markers) against the MSE of the state of the art estimators from Shi et al.¹⁴ (empty markers) for Normal distribution (first row), Log-normal distribution (second row), Beta distribution (third row), Exponential distribution (fourth row) and Weibull distribution (fifth row) and scenario 1 (left column), scenario 2 (middle column) and scenario 3 (right column).

Real data example

To illustrate the use of the proposed method in a single study, which might then be incorporated into a meta-analysis, we took a dataset from Monaco et al.,³ which describes a randomised clinical trial on using tranexamic acid in open aortic aneurysm surgery. The dataset consists of two groups (experimental and control). Each group has n = 50 patients and for each patient, there are 7 continuous variables as described in Table 2. From the dataset, we estimated the inverse variance in each group separately, based on the individual data points, by taking the reciprocal of the standard estimate of the variance, i.e., s⁻² where s is the sample SD. We also estimated the appropriate quantiles and applied the proposed direct methods (18)–(20) as well as the indirect method of estimating σ first as in Shi et al.¹⁴ and then raising it to the power −2.

Table 2.

Summary of the analysis performed on seven continuous variables from data from Monaco et al.³ D = direct method as in (18)–(20). I = indirect method by estimating σ first as in Shi et al.¹⁴

Variable	1 (10⁴)	2 (10³)	3 (10⁶)	4 (10⁴)	5 (10⁵)	6 (10⁶)	7 (10¹)
Experimental group (n = 50)
True s⁻²	3.71	5.894	1.19	4.864	5.071	1.087	4.546
D, Sc 1	2.921	4.638	0.5366	4.109	4.786	0.5047	2.968
I, Sc 1	3.11	4.937	0.5712	4.375	5.095	0.5373	3.16
D, Sc 2	5.849	8.069	4.393	9.885	6.583	2.471	15.82
I, Sc 2	6.072	8.377	4.561	10.26	6.834	2.565	16.42
D, Sc 3	4.029	5.936	1.996	6.295	5.466	1.249	7.83
I, Sc 3	3.984	6.03	1.056	5.944	5.716	0.8879	5.328
Control group (n = 50)
True s⁻²	4.198	4.999	2.528	2.512	5.438	2.281	7.449
D, Sc 1	3.925	3.664	1.569	1.596	5.022	1.348	7.598
I, Sc 1	4.178	3.901	1.671	1.699	5.346	1.435	8.089
D, Sc 2	5.849	4.881	3.743	3.228	3.763	3.985	15.82
I, Sc 2	6.072	5.067	3.886	3.351	3.907	4.137	16.42
D, Sc 3	4.653	4.125	2.392	2.213	4.546	2.346	10.71
I, Sc 3	4.829	4.324	2.264	2.184	4.674	2.075	10.49

The bold font signifies which of the two estimates is closer to the true value of s⁻². To keep the inverse variances on the same scale, the values shown are multiplied by appropriate power of 10 shown in the first row. Variable description: 1 = Surgical time, 2 = Clamping time, 3 = Intraoperative blood loss, 4 = Blood loss 0 to 4 h after surgery, 5 = Blood loss 0 to 24 h after surgery, 6 = Blood loss from the beginning of surgery to 24 h after surgery, 7 = Postoperative hospital stay.

The summary is presented in Table 2. The data published in Monaco et al.³ contained only the quartile values, so this corresponds to our Scenario 2. The proposed direct method in Scenario 2 always outperforms the indirect method in the experimental group and mostly (5 out of 7) outperforms the indirect method in the control group. However, in this scenario, both methods also yield the worst errors among the methods we considered. The estimated values are closer to s⁻² in Scenario 1 (where the indirect method outperforms the proposed method in all but one case) and in Scenario 3 (where the estimates are closest to the true value and the results are mixed). The goodness of the approximation is consistent with Figures 1 and 3, which show that, for n < 100, the MSEs in Scenario 3 are smaller than in Scenario 1, which in turn are smaller than in Scenario 2.
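The Scenario 2 comparison can be recomputed directly from the rows of Table 2. The sketch below copies the printed values (in the table's scaled units) and counts, for each group, how often the direct estimate is closer to the true s⁻² than the indirect one; the variable and function names are introduced here for illustration.

```python
# Scenario 2 rows of Table 2, values as printed (scaled units, variables 1 to 7).
true_inv_var = {
    "experimental": [3.71, 5.894, 1.19, 4.864, 5.071, 1.087, 4.546],
    "control": [4.198, 4.999, 2.528, 2.512, 5.438, 2.281, 7.449],
}
direct = {
    "experimental": [5.849, 8.069, 4.393, 9.885, 6.583, 2.471, 15.82],
    "control": [5.849, 4.881, 3.743, 3.228, 3.763, 3.985, 15.82],
}
indirect = {
    "experimental": [6.072, 8.377, 4.561, 10.26, 6.834, 2.565, 16.42],
    "control": [6.072, 5.067, 3.886, 3.351, 3.907, 4.137, 16.42],
}

def direct_wins(group):
    """Number of variables for which the direct estimate is closer to the true value."""
    rows = zip(true_inv_var[group], direct[group], indirect[group])
    return sum(abs(d - t) < abs(i - t) for t, d, i in rows)

wins = {group: direct_wins(group) for group in true_inv_var}
print(wins)
```

The counts come out as 7 of 7 for the experimental group and 5 of 7 for the control group, matching the statement in the text.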

Conclusions and discussion

The problem of estimating an unreported standard deviation from reported data summaries is an important problem in meta-analysis. Recently, unbiased methods with least MSE have been developed in Shi et al.,¹⁴ and Balakrishnan et al.¹⁵ developed a unified approach that works for any reported summaries. However, even these optimal methods possess bias when used to estimate the inverse variance.

In this paper, we have developed a simple and efficient method for estimating the inverse variance directly. We have proved analytically that the proposed estimators are unbiased for normally distributed data. We obtained numerical values of the estimators through Monte Carlo simulations, from which we derived simple-to-use approximations.

The proposed estimators were developed under the assumption that the data are normally distributed. Under this assumption, the proposed estimators outperform the naïve methods and even the best estimators from Shi et al.¹⁴ and Balakrishnan et al.¹⁵ both in terms of bias and MSE. However, the data distribution is typically not known and it is thus important to assess the robustness of the proposed method. We saw that the proposed estimators also improve MSE, and almost always (except for the exponential distribution and large n) possess lower bias, over the naïve methods even for data that are not normally distributed.

Therefore, when the inverse variance is the true quantity of interest, for pooling purposes, for example, one should estimate it directly using the estimators presented in this paper.

Footnotes

Acknowledgements

We thank Dr. Landoni (IRCCS San Raffaele Scientific Institute and Vita-Salute San Raffaele University, Milan, Italy) for providing the original study data used in the example.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research was funded by the Natural Sciences and Engineering Research Council of Canada RGPIN-2020-06733 and RGPIN-2016-03670. The funding agency had no input in study design, analysis and interpretation of data, in the writing of the report, nor in the decision to submit the article for publication.

ORCID iD

Jan Rychtář

References

1. Higgins, Thomas, Chandler, et al. Cochrane handbook for systematic reviews of interventions. Hoboken, New Jersey: John Wiley & Sons, 2019.

2. Thatcher, De Campos, Bell, et al. Epoetin alpha prevents anaemia and reduces transfusion requirements in patients undergoing primarily platinum-based chemotherapy for small cell lung cancer. Br J Cancer 1999; 80(3): 396–402.

3. Monaco, Nardelli, Pasin, et al. Tranexamic acid in open aortic aneurysm surgery: a randomised clinical trial. Br J Anaesth 2020; 124(1): 35–43.

4. Tippett LHC. On the extreme individuals and the range of samples taken from a normal population. Biometrika 1925; 17(3/4): 364–387.

5. Hozo, Djulbegovic, Hozo. Estimating the mean and variance from the median, range, and the size of a sample. BMC Med Res Methodol 2005; 5(1): 13.

6. Walter, Yao. Effect sizes can be calculated for studies reporting ranges for outcome variables in systematic reviews. J Clin Epidemiol 2007; 60(8): 849–852.

7. Ramírez, Cox. Improving on the range rule of thumb. Rose-Hulman Undergrad Math J 2012; 13(2): 1.

8. Wan, Wang, Liu, et al. Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Med Res Methodol 2014; 14(1): 135.

9. Bland. Estimating mean and standard deviation from the sample size, three quartiles, minimum, and maximum. Int J Stat Med Res 2015; 4(1): 57–64.

10. Luo, Wan, Liu, et al. Optimally estimating the sample mean from the sample size, median, mid-range, and/or mid-quartile range. Stat Methods Med Res 2018; 27(6): 1785–1805.

11. Rychtář, Taylor. Estimating the sample variance from the sample size and range. Stat Med 2020; 39(30): 4667–4686.

12. Weir, Butcher, Assi, et al. Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review. BMC Med Res Methodol 2018; 18(1): 1–14.

13. Walter, Rychtář, Taylor, et al. Estimation of standard deviations and inverse-variance weights from an observed range. Stat Med 2022; 41: 242–257.

14. Shi, Luo, Weng, et al. Optimally estimating the sample standard deviation from the five-number summary. Res Synth Methods 2020; 11(5): 641–654.

15. Balakrishnan, Rychtář, Taylor, et al. Unified approach to optimal estimation of mean and standard deviation from sample summaries. Stat Methods Med Res 2022; online first: DOI: 10.1177/09622802221111546

16. Casella, Berger. Statistical inference. Belmont, CA: Brooks/Cole Cengage Learning, 2021.

Approximately unbiased estimators of the inverse variance from sample summary statistics
