To the Editor:
We welcome the interest and enthusiasm of Halber et al. (1997) in the nonparametric randomization test that we described (Holmes et al., 1996), not least because they called their program ‘Sherlock’. However, as the great detective would surely have pointed out, there are a number of flaws in the article that led the authors to an incorrect primary conclusion, that our test “is a less-sensitive procedure [than] a standard parametric test”.
First, “we must look for consistency” (Doyle, The Problem of Thor Bridge). The principal flaw is that the ‘standard’ parametric test was used by Halber et al. in a nonstandard way. The parametric statistic images were thresholded at Z = 1.64, that is, at P ≤ 0.05 uncorrected for multiple comparisons. Thus, the false-positive or Type I error rate is 0.05 per voxel, and we expect 5% of voxels to appear above the threshold by chance alone, exploiting inappropriately Holmes' adage that “it is a question of cubic capacity. A man with so large a brain must have something in it” (Doyle, The Adventure of the Blue Carbuncle). In contrast, the nonparametric randomization test is constructed to control strongly for Type I error at the image level, giving a Type I error on average once for every 20 applications of the test. For an equitable comparison of test sensitivities, we must “balance probabilities” (Doyle, The Hound of the Baskervilles). The only valid comparison is with a parametric test that uses the same statistical model and also controls Type I error strongly for the family of voxel tests, which invariably would lead to a much higher threshold than 1.64. Our experience is that with such a comparison, the two approaches have remarkably similar sensitivities, and by using variance smoothing, the nonparametric test is more powerful at low degrees of freedom.
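The contrast above between a per-voxel threshold and image-level control can be illustrated with a little arithmetic. The sketch below is only indicative: it assumes independent voxels and a hypothetical voxel count, and it uses a simple Šidák-type correction as a stand-in for a parametric test with strong familywise control. It is not the randomization test of Holmes et al. (1996), nor the procedure of Halber et al., and the function name is invented for illustration.

```python
from statistics import NormalDist


def uncorrected_vs_familywise(n_voxels, z_threshold=1.64, alpha=0.05):
    """Compare an uncorrected voxel threshold with a familywise one.

    Assumes independent, standard-normal voxel statistics under the null
    (an idealisation; real images are spatially correlated).
    """
    std_normal = NormalDist()

    # One-sided tail probability at the uncorrected threshold (about 0.05).
    p_voxel = 1.0 - std_normal.cdf(z_threshold)

    # Number of voxels expected above threshold by chance alone.
    expected_false_positives = n_voxels * p_voxel

    # Chance of at least one false positive anywhere in the image.
    familywise_error = 1.0 - (1.0 - p_voxel) ** n_voxels

    # Sidak-type per-voxel level giving a familywise error rate of alpha,
    # and the Z threshold that corresponds to it.
    p_corrected = 1.0 - (1.0 - alpha) ** (1.0 / n_voxels)
    z_corrected = std_normal.inv_cdf(1.0 - p_corrected)

    return {
        "p_voxel": p_voxel,
        "expected_false_positives": expected_false_positives,
        "familywise_error": familywise_error,
        "z_corrected": z_corrected,
    }


if __name__ == "__main__":
    # 10,000 voxels is a hypothetical image size chosen for illustration.
    for name, value in uncorrected_vs_familywise(10_000).items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions, an image of 10,000 voxels thresholded at Z = 1.64 is expected to show roughly 500 voxels above threshold by chance, and a false positive somewhere in the image is all but certain, whereas holding the image-level error at 0.05 requires a threshold above 4.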
There is a similar lack of consistency in the visual comparison that the reader is asked to make between Figs. 2 and 3 in the article by Halber et al. The maximum intensity projection method for the whole brain used in Fig. 3 almost invariably makes the extent of activation appear larger than when it is shown in a set of orthogonal views through a single focus, as used in Fig. 2. “It is, of course, a trifle but there is nothing so important as trifles” (Doyle, The Man with the Twisted Lip).
In conclusion, we are concerned that Halber et al. have presented our method poorly through inappropriate comparisons, but are nonetheless pleased to see the interest in the method, for we “cannot live without brainwork. What else is there to live for?” (Doyle, The Sign of Four).
