Abstract

Under a repeated screening program, Njor et al.1 provided a method to test independence of tests, which leads to a simple calculation of the cumulative risk of receiving a false-positive test. To describe their method, we consider, without loss of generality, the simplest case, in which the same screening modality is applied twice to each individual and everyone takes both tests. Let Xi = 1 denote a false-positive result and Xi = 0 no false-positive result at test i, for i = 1, 2, and let nij denote the frequency of the screening outcome (X1, X2) = (i, j) for i, j = 0, 1. We use p to denote the probability of receiving at least one false-positive test, i.e., p = 1 – P(X1 = 0, X2 = 0). Then the maximum likelihood estimate of the parameter p is p̂ = 1 – n00/n, where n = n00 + n01 + n10 + n11, whereas under the assumption that X1 and X2 are independent it is p* = 1 – (n00 + n01)(n00 + n10)/n².
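Both estimates of p follow from the multinomial likelihood of the 2 × 2 table of outcomes. The sketch below spells out the steps; the symbols n (total count), \pi_{ij} (cell probabilities) and \theta_i (marginal false-positive probabilities) are introduced here for illustration and are not taken from Njor et al.

```latex
\begin{align*}
L(\pi) &\propto \prod_{i,j \in \{0,1\}} \pi_{ij}^{\,n_{ij}},
  \qquad n = n_{00} + n_{01} + n_{10} + n_{11}, \\
\hat{\pi}_{00} &= \frac{n_{00}}{n}
  \;\Longrightarrow\; \hat{p} = 1 - \hat{\pi}_{00} = 1 - \frac{n_{00}}{n}, \\
\text{under independence:}\quad
\pi_{00} &= (1 - \theta_1)(1 - \theta_2),
  \qquad \hat{\theta}_1 = \frac{n_{10} + n_{11}}{n},
  \qquad \hat{\theta}_2 = \frac{n_{01} + n_{11}}{n}, \\
p^{*} &= 1 - (1 - \hat{\theta}_1)(1 - \hat{\theta}_2)
       = 1 - \frac{(n_{00} + n_{01})(n_{00} + n_{10})}{n^{2}}.
\end{align*}
```

The unrestricted estimate uses only the count n00, whereas p* uses only the two marginal totals, so the two can agree closely even when the joint counts depart from independence.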
The method by Njor et al.1 states that the hypothesis of independence between X1 and X2 cannot be rejected at the 5% level of significance if the estimate p* lies within the 95% confidence interval (CI) of p (see Njor et al.,1 bottom left of page 95). There appears to be a gap between this method and a test of independence between X1 and X2. In fact, their method is a test of the following null hypothesis: H0: p = p*.
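That is, their method tests whether p equals a given number p*, which is not equivalent to testing independence of X1 and X2, even though p* is the maximum likelihood estimate of p under the independence assumption. To see this, consider a screening program that offers two tests to detect a certain disease. Assume that 100 people participate in this program and that the data are n11 = 5, n01 = 6, n10 = 7, n00 = 82. Thus, the unrestricted estimate of p is 1 – 82/100 = 0.18 and p* = 1 – (88/100)(89/100) = 0.2168. These two values are close enough that p* falls inside a standard 95% CI for p, so the method would not reject independence; yet 5 people had two false-positive results, against only 12 × 11/100 = 1.32 expected if X1 and X2 were independent.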
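The numbers in this example can be checked with a short script. It is a sketch under stated assumptions: the 95% CI is taken to be the usual Wald interval around the unrestricted estimate, and independence is assessed with Pearson's chi-square test and a one-sided exact (hypergeometric) tail probability. Neither the interval form nor the choice of independence test is confirmed by the source, and the function names are hypothetical.

```python
import math


def ci_check(n11, n10, n01, n00, z=1.96):
    """Unrestricted MLE of p, MLE under independence, and a Wald CI for p.

    The Wald form of the interval is an assumption made for illustration.
    """
    n = n11 + n10 + n01 + n00
    p_hat = 1 - n00 / n                 # unrestricted MLE of p
    q1 = (n00 + n01) / n                # estimated P(X1 = 0)
    q2 = (n00 + n10) / n                # estimated P(X2 = 0)
    p_star = 1 - q1 * q2                # MLE of p under independence
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_star, (p_hat - half, p_hat + half)


def pearson_chi2(n11, n10, n01, n00):
    """Pearson chi-square statistic (1 df, no continuity correction) and P value."""
    n = n11 + n10 + n01 + n00
    r1, r0 = n11 + n10, n01 + n00       # totals for X1 = 1 and X1 = 0
    c1, c0 = n11 + n01, n10 + n00       # totals for X2 = 1 and X2 = 0
    stat = n * (n11 * n00 - n10 * n01) ** 2 / (r1 * r0 * c1 * c0)
    # For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def exact_upper_tail(n11, n10, n01, n00):
    """P(N11 >= n11) under independence with both margins fixed (hypergeometric)."""
    n = n11 + n10 + n01 + n00
    r1 = n11 + n10                      # people with X1 = 1
    c1 = n11 + n01                      # people with X2 = 1
    total = math.comb(n, c1)
    ways = sum(math.comb(r1, k) * math.comb(n - r1, c1 - k)
               for k in range(n11, min(r1, c1) + 1))
    return ways / total


counts = dict(n11=5, n10=7, n01=6, n00=82)
p_hat, p_star, (lo, hi) = ci_check(**counts)
stat, p_value = pearson_chi2(**counts)
print(f"p_hat = {p_hat:.4f}, p_star = {p_star:.4f}, Wald CI = ({lo:.4f}, {hi:.4f})")
print(f"p_star inside CI: {lo < p_star < hi}")
print(f"chi-square = {stat:.2f}, P = {p_value:.5f}")
print(f"exact upper-tail P = {exact_upper_tail(**counts):.5f}")
```

With the counts above, the interval is roughly 0.105 to 0.255 and contains p* = 0.2168, while the chi-square statistic is about 13.1 on one degree of freedom and the exact tail probability is about 0.003. Because the expected count in the (1, 1) cell is only 1.32, the exact tail probability is the safer of the two independence checks.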
Although the above example is artificial, it suffices to show that the method of Njor et al.1 cannot be used to test independence of repeated screening tests. The literature offers several methods for calculating the cumulative risk of a false-positive test under a repeated screening program,2–8 all of which rest on certain assumptions. For example, Hofvind et al.2 assumed independence of tests when calculating the cumulative risk of a false-positive recall in the Norwegian Breast Cancer Screening Program. The independence assumption makes this calculation much easier than it is under other assumptions. In practice, however, a repeated screening program usually involves more than two tests, and the pattern of missing tests is usually very complicated. Because of these difficulties, we believe that testing independence of tests remains a statistical challenge that needs further investigation.
