Abstract

Karl Pearson’s 1904 report on ‘Certain enteric fever inoculation statistics’ is seen as a key paper in the history of meta-analysis [1–3]. In it, Pearson raised several important methodological issues arising from the correlations he calculated between inoculation status and both typhoid incidence and mortality among soldiers serving in various parts of the British Empire [4].
First, he noted the ‘significance’ of the individual correlations. For this he used the magnitude of the correlations in relation to their ‘probable errors’. Second, he pointed out the ‘extreme irregularity’ of the correlation values – what we would now call heterogeneity – and sought to explain why they differed. Third, he commented on the ‘lowness’ of the values, arguing that they were too low to convince him that the inoculation had been proven worthwhile. He felt that a better vaccine was needed.
Pearson also commented on how the data had been obtained. He was concerned that self-selection into the inoculated group by volunteers who were ‘more cautious and careful’ could have produced spurious estimates of effectiveness. This and his concerns about the weakness of the correlations led him to recommend that an ‘experiment’ be done. He did not propose a randomised controlled trial – he was writing before Fisher developed the theoretical reasons for random allocation – but Pearson clearly understood the need for comparability of groups. His solution was to call for volunteers, register them all, and only inoculate every second one.
The data available to Pearson were presented in 2 × 2 tables. To create a measure of effect, he computed for each table the tetrachoric correlation, which he had described a few years earlier [5]. The approach assumes that the two dichotomies arise from underlying continuous variables with a bivariate normal distribution, and it estimates the correlation of that distribution.
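To make the idea concrete, the following is a minimal sketch of how a tetrachoric correlation can be obtained numerically from a 2 × 2 table. It is not Pearson's own computational procedure: it assumes a modern shortcut, namely solving for the correlation at which the bivariate normal probability of the first cell matches the observed proportion. The function names `bvn_cdf` and `tetrachoric` and the cell labels `a`, `b`, `c`, `d` are illustrative assumptions, and the counts in the usage line are hypothetical, not Pearson's data.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def bvn_cdf(h, k, rho, steps=400):
    """P(X < h, Y < k) for a standard bivariate normal with correlation rho.

    Uses the identity Phi2(h, k; rho) = Phi(h) * Phi(k) + integral from 0 to rho
    of the bivariate normal density at (h, k), evaluated with Simpson's rule.
    """
    std = NormalDist()

    def density(r):
        q = 1.0 - r * r
        return exp(-(h * h - 2 * r * h * k + k * k) / (2 * q)) / (2 * pi * sqrt(q))

    width = rho / steps  # negative when rho < 0, which integrates "backwards"
    total = density(0.0) + density(rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * width)
    return std.cdf(h) * std.cdf(k) + total * width / 3


def tetrachoric(a, b, c, d, tol=1e-8):
    """Tetrachoric correlation for the table [[a, b], [c, d]] (hypothetical helper).

    Assumes every margin is non-zero; the search is limited to (-0.999, 0.999).
    """
    n = a + b + c + d
    std = NormalDist()
    h = std.inv_cdf((a + b) / n)  # threshold for the row variable
    k = std.inv_cdf((a + c) / n)  # threshold for the column variable
    target = a / n                # observed proportion in the first cell
    lo, hi = -0.999, 0.999
    while hi - lo > tol:          # bisection: bvn_cdf increases with rho
        mid = (lo + hi) / 2
        if bvn_cdf(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical counts, chosen only to exercise the function.
print(round(tetrachoric(40, 10, 10, 40), 3))
```

When both margins are split evenly, as in the hypothetical table above, the result can be checked against the closed form sin(2π(a/n − 1/4)), which gives one simple sanity test for the numerical routine.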
Tetrachoric correlations calculated by Pearson and relative odds for typhoid fever data.
For correlations, the overall estimate is the arithmetic mean of the correlations, as given by Pearson. For the relative odds, it is the Mantel-Haenszel pooled estimate. N/A indicates that the separate estimates were heterogeneous and hence were not pooled.
A formal test of heterogeneity for the odds ratios in the first set of tables confirms Pearson’s observation (Breslow-Day χ² = 90.6 on 4 df, p < 0.001). However, this is not so for the second set, for which the test is not conventionally statistically significant: χ² = 6.9 on 5 df, p = 0.23. Given this, it is legitimate to compute a pooled odds ratio: the Mantel-Haenszel estimate is 1.77 (95% CI 1.5–2.1).
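For readers unfamiliar with the pooling step, here is a minimal sketch of the Mantel-Haenszel odds ratio across several 2 × 2 tables. The confidence interval uses the Robins-Breslow-Greenland variance of the log odds ratio, which is an assumption about method: the text does not say how the quoted interval was obtained. The function name `mantel_haenszel` and the counts in `hypothetical_tables` are illustrative only and are not Pearson's data.

```python
from math import exp, sqrt


def mantel_haenszel(tables, z=1.96):
    """Pooled odds ratio with an approximate 95% CI.

    Each table is a tuple (a, b, c, d) for the layout [[a, b], [c, d]],
    so the within-table odds ratio is (a * d) / (b * c).
    """
    sum_r = sum_s = 0.0
    sum_pr = sum_ps_qr = sum_qs = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        r, s = a * d / n, b * c / n
        p, q = (a + d) / n, (b + c) / n
        sum_r += r
        sum_s += s
        sum_pr += p * r
        sum_ps_qr += p * s + q * r
        sum_qs += q * s
    pooled = sum_r / sum_s
    # Robins-Breslow-Greenland variance of log(pooled odds ratio)
    var_log = (sum_pr / (2 * sum_r ** 2)
               + sum_ps_qr / (2 * sum_r * sum_s)
               + sum_qs / (2 * sum_s ** 2))
    half_width = z * sqrt(var_log)
    return pooled, pooled * exp(-half_width), pooled * exp(half_width)


# Hypothetical counts for two strata, used only to show the call.
hypothetical_tables = [(10, 20, 5, 40), (6, 14, 4, 36)]
pooled, lower, upper = mantel_haenszel(hypothetical_tables)
print(f"{pooled:.2f} ({lower:.2f} to {upper:.2f})")
```

With a single table the pooled estimate reduces to the ordinary odds ratio ad/bc, and the variance reduces to 1/a + 1/b + 1/c + 1/d, which is a convenient check on the implementation.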
A final point: Pearson considered the effectiveness of inoculation in two steps – whether it prevented soldiers from acquiring typhoid fever and whether it reduced mortality in those who had developed the disease. For four of the groups, it is possible to explore directly the relationship between inoculation and mortality from the disease. The odds ratios are in the range of 2.2–6.8. They are not significantly different from each other – χ² = 5.2 on 3 df, p = 0.16. The pooled estimate is 4.5 (95% CI 3.1–6.6). At face value, it is a strong effect (the inverse is 0.22, 95% CI 0.15–0.32) by current criteria. Even so, I suspect that Pearson would still not have been convinced of the value of vaccination but would have continued to insist that further work was needed, including a proper controlled trial.
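The inverse quoted above for the pooled mortality estimate follows from taking reciprocals of the odds ratio and its interval, with the two bounds changing places. A small sketch of that arithmetic, using only the figures given in the text (the helper name `invert_odds_ratio` is an illustrative assumption):

```python
def invert_odds_ratio(estimate, lower, upper):
    """Return the reciprocal estimate and interval; the bounds swap places."""
    return 1 / estimate, 1 / upper, 1 / lower


inv, inv_lower, inv_upper = invert_odds_ratio(4.5, 3.1, 6.6)
print(f"{inv:.2f} ({inv_lower:.2f} to {inv_upper:.2f})")  # prints 0.22 (0.15 to 0.32)
```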
