Abstract

In our paper,1 we concluded that the area under the curve (AUC) of a receiver operating characteristic (ROC) plot should be abandoned as a summary of the performance of a screening test. The AUC has little direct meaning to most people; few would know the performance of a screening test with an AUC of 0.70, because the AUC does not directly indicate the detection rate (sensitivity) for a specified false-positive rate (the complement of specificity), or vice versa. Also, the standard deviations of screening markers in affected and unaffected individuals often differ, and in such situations the AUC will be unhelpful or even misleading.
Pinsky found that for 29 ovarian cancer markers the ratio of standard deviations (affected to unaffected) ranged more than tenfold, from 0.28 to 3.2. Consequently, some of the markers will have an AUC that does not reflect screening performance, for the reasons given in our paper. Pinsky reports high correlations between the AUC and detection rates for two specified false-positive rates, and interprets this as validation of the AUC. This interpretation is not warranted, as is illustrated in Table 3 of the paper Pinsky cites.2 The markers CA15.3 and Spondin-2 had the same AUC (0.74), which would imply identical screening performance, but for a false-positive rate of 5%, CA15.3 had a detection rate of 46% whereas Spondin-2 had a detection rate of 28%.
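The point that equal AUCs need not mean equal screening performance can be checked numerically. The sketch below is illustrative only: it assumes that marker values are Gaussian in both groups and are higher in affected individuals, and it uses two hypothetical markers (A and B) whose means and standard deviations are invented so that both have an AUC of 0.74. They are not the CA15.3 or Spondin-2 data, and the function names are our own.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian: mean 0, SD 1


def auc(mean_unaff, sd_unaff, mean_aff, sd_aff):
    """AUC for a marker that is Gaussian in both groups (higher values in affected)."""
    return STD_NORMAL.cdf((mean_aff - mean_unaff) / sqrt(sd_unaff**2 + sd_aff**2))


def detection_rate(mean_unaff, sd_unaff, mean_aff, sd_aff, fpr):
    """Detection rate at the cutoff that yields the requested false-positive rate."""
    cutoff = NormalDist(mean_unaff, sd_unaff).inv_cdf(1 - fpr)
    return 1 - NormalDist(mean_aff, sd_aff).cdf(cutoff)


def marker_with_auc(target_auc, sd_ratio):
    """Hypothetical marker: unaffected ~ N(0, 1), affected SD = sd_ratio,
    with the affected mean chosen so that the AUC equals target_auc."""
    mean_aff = STD_NORMAL.inv_cdf(target_auc) * sqrt(1 + sd_ratio**2)
    return (0.0, 1.0, mean_aff, sd_ratio)


# Assumed SD ratios (affected to unaffected), both within the 0.28 to 3.2 range.
marker_a = marker_with_auc(0.74, sd_ratio=0.5)
marker_b = marker_with_auc(0.74, sd_ratio=2.0)

for name, marker in (("A", marker_a), ("B", marker_b)):
    print(f"Marker {name}: AUC = {auc(*marker):.2f}, "
          f"detection rate at 5% FPR = {detection_rate(*marker, fpr=0.05):.0%}")
```

Under these assumptions both hypothetical markers have an AUC of 0.74, yet the detection rate for a 5% false-positive rate is roughly 3% for marker A and roughly 46% for marker B.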
Dr Pinsky’s view is that the AUC avoids the somewhat arbitrary choice of a level at which to fix the false-positive rate. In practice, it is better either to show the ROC curve or to present a table with various estimates of screening performance, each specified according to a given detection rate or a given false-positive rate. We do not think the use of a single AUC avoids the need to do this.
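A table of this kind is simple to produce. The sketch below is one possible approach, not a prescribed method: it uses small, invented sets of marker values (higher values are assumed to indicate disease), sets cutoffs empirically from the observed values, and ignores sampling error. The function names are our own.

```python
from math import ceil, floor

# Hypothetical marker values, invented for illustration only.
unaffected = [0.6, 0.7, 0.8, 0.8, 0.9, 0.9, 1.0, 1.0, 1.0, 1.1,
              1.1, 1.2, 1.2, 1.3, 1.4, 1.5, 1.6, 1.8, 2.0, 2.5]
affected = [0.9, 1.2, 1.5, 1.7, 1.9, 2.1, 2.4, 2.8, 3.3, 4.0]


def detection_rate_at_fpr(affected, unaffected, fpr):
    """Detection rate when at most a fraction `fpr` of unaffected are positive."""
    ranked = sorted(unaffected, reverse=True)
    allowed = min(floor(fpr * len(ranked) + 1e-9), len(ranked) - 1)
    cutoff = ranked[allowed]  # values strictly above the cutoff are positive
    return sum(x > cutoff for x in affected) / len(affected)


def fpr_at_detection_rate(affected, unaffected, dr):
    """False-positive rate when at least a fraction `dr` of affected are positive."""
    ranked = sorted(affected, reverse=True)
    needed = max(ceil(dr * len(ranked) - 1e-9), 1)
    cutoff = ranked[needed - 1]  # values at or above the cutoff are positive
    return sum(x >= cutoff for x in unaffected) / len(unaffected)


print("False-positive rate -> detection rate")
for fpr in (0.01, 0.05, 0.10):
    print(f"  {fpr:.0%} -> {detection_rate_at_fpr(affected, unaffected, fpr):.0%}")

print("Detection rate -> false-positive rate")
for dr in (0.50, 0.80):
    print(f"  {dr:.0%} -> {fpr_at_detection_rate(affected, unaffected, dr):.0%}")
```

Each row states screening performance directly, as a detection rate for a given false-positive rate or the reverse, which a single AUC cannot convey.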
