
Abstract

Under a repeated screening program, Njor et al.¹ provided a method to test independence of tests, which leads to a simple calculation of the cumulative risk of receiving a false-positive test. To describe their method, we consider, without loss of generality, the simplest case, in which the same screening modality is applied to each individual twice and every participant takes both tests. Let X_i = 1 or 0 denote a false-positive result or not a false-positive result at test i for i = 1, 2, and let n_ij denote the frequency of the screening outcome (X₁, X₂) = (i, j) for i, j = 0, 1. We use p to denote the probability of receiving at least one false-positive test, i.e., p = 1 – P(X₁ = 0, X₂ = 0). Then the maximum likelihood estimate of the parameter p is

{\hat{p}}_{O B S} = 1 - n_{00} / n,

where n = n₁₁ + n₁₀ + n₀₁ + n₀₀ is the sample size. If X₁ and X₂ are independent, then the maximum likelihood estimate of p is given by

p^{*} = 1 - (1 - {\hat{p}}_{1}) (1 - {\hat{p}}_{2}),

where $\hat{p}_1 = (n_{11} + n_{10})/n$ and $\hat{p}_2 = (n_{11} + n_{01})/n$ are maximum likelihood estimates of P(X₁ = 1) and P(X₂ = 1), respectively.

The method by Njor et al.¹ states that the hypothesis of independence between X₁ and X₂ cannot be rejected at the 5% level of significance if the estimate p^* lies within the following 95% confidence interval (CI) of p:

\left( {\hat{p}}_{O B S} - 1.96 \sqrt{{\hat{p}}_{O B S} (1 - {\hat{p}}_{O B S}) / n}, \; {\hat{p}}_{O B S} + 1.96 \sqrt{{\hat{p}}_{O B S} (1 - {\hat{p}}_{O B S}) / n} \right)

(see the bottom left of page 95 in Njor et al.¹). There appears to be a gap between this method and a method for testing independence between X₁ and X₂. In fact, their method is a test of the following hypotheses:

H_{0} : p = p^{*} \quad \text{vs.} \quad H_{A} : p \neq p^{*} .

That is, their method tests whether p is equal to a given number p^* or not, which is not equivalent to testing independence of X₁ and X₂, even though p^* is the maximum likelihood estimate of p under the independence assumption. To see this, consider a screening program that offers two tests to detect a certain disease. Assume that 100 people participate in this program and that the data are n₁₁ = 5, n₀₁ = 6, n₁₀ = 7, n₀₀ = 82. Thus, $\hat{p}_1$ = 0.12, $\hat{p}_2$ = 0.11, p^* = 1 – (1 – 0.11) (1 – 0.12) = 0.2168, and $\hat{p}_{OBS}$ = 0.18. Furthermore, the 95% CI of p is (0.1047, 0.2553). As p^* = 0.2168 lies within this CI, the method of Njor et al.¹ leads one to conclude that the two screening tests are independent. However, the two tests are clearly not independent, because the odds ratio between them is 9.3691. The dependence between X₁ and X₂ can also be seen by performing either the Pearson chi-square test or the Fisher exact test. The P values for the Pearson chi-square test and the Fisher exact test are < 0.0018 and 0.0033, respectively.
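The arithmetic of this example can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the variable names are ours, and we assume that the Pearson chi-square test uses the usual continuity (Yates) correction for a 2 × 2 table, which the text does not specify. The odds ratio and the Fisher exact test are not recomputed here.

```python
import math

# Counts from the worked example: n[(i, j)] is the frequency of (X1, X2) = (i, j).
n = {(1, 1): 5, (0, 1): 6, (1, 0): 7, (0, 0): 82}
total = sum(n.values())

# Marginal false-positive rates and the estimate of p under independence.
p1_hat = (n[(1, 1)] + n[(1, 0)]) / total
p2_hat = (n[(1, 1)] + n[(0, 1)]) / total
p_star = 1 - (1 - p1_hat) * (1 - p2_hat)

# Observed estimate of p and its Wald 95% confidence interval.
p_obs = 1 - n[(0, 0)] / total
half_width = 1.96 * math.sqrt(p_obs * (1 - p_obs) / total)
ci = (p_obs - half_width, p_obs + half_width)
inside_ci = ci[0] <= p_star <= ci[1]  # the check described for Njor et al.

# Pearson chi-square test of independence for the 2 x 2 table.
# ASSUMPTION: the continuity (Yates) correction is applied.
a, b, c, d = n[(1, 1)], n[(1, 0)], n[(0, 1)], n[(0, 0)]
margins = (a + b) * (c + d) * (a + c) * (b + d)
chi2 = total * max(abs(a * d - b * c) - total / 2, 0) ** 2 / margins
# Upper-tail probability of a chi-square variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"p* = {p_star:.4f}, CI = ({ci[0]:.4f}, {ci[1]:.4f}), inside: {inside_ci}")
print(f"chi-square = {chi2:.4f}, P value = {p_value:.4f}")
```

Under these assumptions, p^* falls inside the interval while the chi-square test rejects independence, which is the discrepancy the example is meant to show.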

Although the above example is artificial, it suffices to show that the method of Njor et al.¹ cannot be used to test independence of repeated screening tests. There are different methods in the literature to calculate the cumulative risk of a false-positive test under a repeated screening program.²⁻⁸ All of these methods rest on certain assumptions. For example, Hofvind et al.² assumed independence of tests and calculated the cumulative risk of a false-positive recall in the Norwegian Breast Cancer Screening Program. Under the independence assumption, calculating the cumulative risk of a false-positive test is much easier than under other assumptions. It is well known that the number of tests under a repeated screening program is usually greater than two and that the pattern of missing tests is usually very complicated. Because of these difficulties, we believe that testing independence of tests is a statistical challenge that needs further investigation.

References

1. Njor, Olsen, Schwartz. Predicting the risk of a false-positive test for women following a mammography screening programme. J Med Screen 2007; 14: 94-7

2. Hofvind, Thoresen, Tretli. The cumulative risk of a false-positive recall in the Norwegian Breast Cancer Screening Program. Cancer 2004; 101: 1501-7

3. Baker, Kramer. Estimating the cumulative risk of a false-positive under a regimen involving various types of cancer screening tests. J Med Screen 2008; 15: 18-22

4. Christiansen, Wang, Barton. Predicting the cumulative risk of false-positive mammograms. J Natl Cancer Inst 2000; 92: 1657-66

5. Croswell, Kramer, Kreimer. Cumulative incidence of false-positive results in repeated, multi-modality cancer screening. Annals of Family Medicine 2009; In press

6. Elmore, Barton, Moceri. Ten-year risk of FP screening mammograms and clinical breast examinations. N Engl J Med 1998; 338: 1089-96

7. Gelfand, Wang. Modelling the cumulative risk for a false-positive under repeated screening events. Stat Med 2000; 90: 1865-79

8. […], Fagerstrom, Prorok, Kramer. Estimating the cumulative risk of a false-positive test in a repeated screening program. Biometrics 2004; 60: 651-60

DOI: 10.1258/jms.2009.008099

On testing independence of repeated screening tests
