
Abstract

There is a view among some medical historians that the randomized clinical trial originated from statistical thinking, and that the modern era of controlled trials was essentially ushered in with the iconic randomized trial of streptomycin for pulmonary tuberculosis reported by the British Medical Research Council (MRC) in 1948. For example:
The professional emergence of statistics as a codified body of knowledge and the concomitant rise of individuals trained in its methods provided the necessary conditions for the Laplacian vision of the probabilistically based clinical trial to come into being. ¹

The randomized clinical trial is ‘an extension of the statistician RA Fisher's ideas about experimental design’ (p. 132). ‘The statisticians’ randomized controlled trial came to represent the symbol and substance of the statistical method in medicine’. ²

The history of randomized clinical trials may be traced back to the biometricians’ work and it seems to be a good example of ‘applied statistics’. On the one hand there was a direct lineage from Pearson to Bradford Hill via Fisher and Major Greenwood… On the other hand, it is not too difficult to argue for conceptual legacy, since the basic concepts grounding the choice of randomisation can be traced back to RA Fisher's work. ³

[Karl] Pearson's statistical methods provided the framework for Austin Bradford Hill's work on the randomised clinical trial (pp. viii–ix) and constituted a seminal statistical idea. ⁴

The conceptualization of clinical trials as ‘a seminal statistical idea’ which ‘can be traced back to RA Fisher's work’ has not been demonstrated by these writers or by others. The early history of clinical trials has little to do with statistical theory and much more to do with the more fundamental and less technical concept of a fair – that is, unbiased – test.^5–11

The need to ‘compare like with like’ in fair tests of treatments has been recognized by some people for a long time, and not only by physicians. In a letter to Boccaccio written in 1364, Petrarch wrote:
I solemnly affirm and believe, if a hundred or a thousand men of the same age, same temperament and habits, together with the same surroundings, were attacked at the same time by the same disease, that if one half followed the prescriptions of the doctors of the variety of those practising at the present day, and that the other half took no medicine but relied on Nature's instincts, I have no doubt as to which half would escape. ¹²

When quantitative methods began to be used at the beginning of the 18th century to assess the effects of variolation, authors of the comparisons were sometimes reminded of the need to ensure that like was being compared with like. Thus Massey, challenging the interpretation of comparisons of mortality following variolation and after natural smallpox, wrote:
…to form a just Comparison, and calculate right in this Case, the Circumstances of the Patients, must and ought to be as near as may be on a Par. ¹³

Several reports of prospective experiments were published during the 18th century. In the most celebrated of these James Lind notes that, apart from the treatments, the 12 patients he studied were otherwise similar: ‘They all in general had putrid gums, the spots and lassitude, with weakness of their knees. They lay together in one place, being a proper apartment for the sick in the fore-hold; and had one diet common to all.’¹⁴ Lind does not tell us how he allocated his 12 patients to each of the six treatments he compared, but had he cast lots or used alternation or rotation it would not have been inconsistent with the use of these devices to make fair decisions in other contexts.¹⁵

At the beginning of the 19th century, Alexander Hamilton reported having used alternation to generate parallel comparison groups in a clinical trial of bloodletting done by him and two surgeon colleagues.¹⁶ He described how sick soldiers had been ‘admitted, alternately [my emphasis], in such a manner that each of us had one third of the whole’, and that ‘the sick were indiscriminately received’, and ‘attended as nearly as possible with the same care and accommodated with the same comforts’.¹⁶ Although his report leaves several uncertainties,¹⁷ it seems reasonable to speculate that he described the use of alternation to show that an effort had been made to generate comparable treatment groups.

By the middle of the 19th century, the rationale for alternation was sometimes being made explicit. In 1854, Thomas Graham Balfour described his assessment of whether belladonna could prevent scarlet fever. He divided 151 boys into two comparison groups, ‘taking them alternately from the list, to avoid the imputation of selection [my emphasis]’.¹⁸ It is clear from these words that Balfour used alternation to control selection bias. This is not a statistical concept, and although Balfour was a distinguished statistician as well as a doctor, he cannot be regarded as a theoretical statistician in the ‘Pearsonian/Fisherian’ sense.¹⁹

There are further isolated examples of alternation being used to generate treatment comparison groups during the last half of the 19th century, and such examples became increasingly common during the first half of the 20th century. Indeed, alternation as a feature of research design came to be referred to formally in English not only as ‘alternation’,²⁰ but also as ‘the alternate method’, ‘rational alternation’,²¹ and ‘the alternate case method’.^21,22 In French it was referred to as ‘la méthode alternante’;^23,24 and in German as ‘Simultanmethode’.²⁵ It is worth noting that designation of this methodological principle occurred before the theoretical statistical qualities of random allocation had been promoted in Ronald Fisher's The Design of Experiments.²⁶ Indeed, even though the word ‘random’ sometimes appeared in reports of controlled trials before the late 1940s, it was often actually alternation that was being used for allocation.²⁷

Unsurprisingly therefore, the use of alternation was reflected in articles and a book published by the Lancet in 1937, written by the father of medical statistics in Britain, Austin Bradford Hill:
By the allocation of the patients to the two groups we want to ensure that these two groups are alike except in treatment… this might be done, with reasonably large numbers, by a random division of the patients; the first being given treatment A, the second being orthodoxly treated and serving as a control, the third being given treatment A, the fourth serving as a control, and so on, no departure from this rule being allowed [my emphasis]. ²⁸

Of the two essential components of unbiased allocation – genesis of an unbiased sequence, and unbiased implementation of the sequence – the former remains a trivially easy task, while the latter will continue to pose challenges.¹¹ Hill was aware of this. In an internal report for the MRC dated 22 December 1933, Hill expressed concern about the allocation of patients to comparison groups in an MRC study of serum treatment for pneumonia in which alternation should have been used.²⁹ Imbalance in the sizes of the comparison groups made clear that alternation had not been strictly observed, prompting Hill to stress in his memorandum that greater effort should be taken ‘that the division of cases really did ensure a random selection’. In other words, to control allocation bias successfully, Hill realized that it is crucially important to conceal the allocation schedule from those involved in entering participants, thus preventing foreknowledge of allocations.

This principle was reflected in the first properly controlled multicentre trial conducted under the aegis of the British MRC. This was designed by Philip D'Arcy Hart to assess the effect of patulin on common cold symptoms.^30–32 When I interviewed him 60 years later, he told me:
Everyone had thought we would use alternation, and we thought we were very clever in setting up a scheme with two patulin groups and two placebo groups using letters to designate each of the four groups, then using rotation to allocate people to the different groups. We thought we were doing something completely new. We wanted to muddle people up. In fact we succeeded in muddling ourselves up. We didn't always remember what the letters stood for. None of us was a statistician, but we felt that the patulin trial was the first decently controlled trial the MRC had done. (IC interview with Philip D'Arcy Hart, 2 May 2003)

D'Arcy Hart was one of the team – with Marc Daniels and Austin Bradford Hill – that designed the MRC streptomycin trial. The report of the study is a model of clarity. A crucially important element is the statement that ‘the details of the (allocation) series were unknown to any of the investigators or to the coordinator and were contained in a set of sealed envelopes, each bearing on the outside only the name of the hospital and a number’.³³ The reason that the MRC streptomycin trial deserves its place in the history of clinical trials is this and other exceptionally clear statements assuring readers that adequate precautions had been taken to minimize the possibility of allocation bias, and thus assure readers that ‘like would be compared with like’.^34,35

In spite of a few examples of random allocation during the 1920s and 1930s, alternation remained the principal method for unbiased prospective allocation to treatment comparison groups^36,37 until well after the end of World War II, even in studies done by investigators such as Richard Doll, who were very familiar with Fisher's writings.³⁸ The ‘clinical’ and ‘statistical’ reasons for random allocation came together only during the second half of the 20th century. But even today, as has been noted by the distinguished statistician David Cox, the primary reason for using random allocation is not statistical, but to help prevent foreknowledge of treatment assignments, and thus the conscious or unconscious temptation to allow biased allocation to occur.³⁹

DECLARATIONS

Competing interests

None declared

Funding

None

Ethical approval

Not applicable

Guarantor

IC

Contributorship

IC is the sole contributor

Acknowledgements

A more detailed account of this issue is available in: Chalmers I. Statistical theory was not the reason that randomisation was used in the British Medical Research Council's clinical trial of streptomycin for pulmonary tuberculosis. In: Jorland G, Opinel A, Weisz G, eds. Body Counts: Medical Quantification in Historical and Sociological Perspectives. Montreal: McGill-Queen's University Press, 2005:309–34. I am grateful to Doug Altman, Peter Armitage, Luc Berlivet, David Cox, Philip D'Arcy Hart, Richard Doll, David Hill, Michael Kramer, Stephen Lock, Irvine Loudon, Harry Marks, Iain Milne, Keith O'Rourke, William Silverman, Stephen Stigler, Ben Toth, Ulrich Tröhler, and Jan Vandenbroucke for commenting on earlier drafts of that paper. Additional material for this article is available from The James Lind Library website (www.jameslindlibrary.org), where it was originally published.

References

1. Rosser Matthews. Quantification and the Quest for Medical Certainty. Princeton, NJ: Princeton University Press, 1995
2. Marks. The Progress of Experiment. Cambridge: Cambridge University Press, 2000
3. Gaudillière J-P. Beyond one-case statistics: mathematics, medicine, and the management of health and disease in the postwar era. In: Bottazzini, Dalmedico, eds. Changing images in mathematics: from the French Revolution to the New Millennium. London: Routledge, 2001:283
4. Magnello. The introduction of mathematical statistics into medical research: the roles of Karl Pearson, Major Greenwood and Austin Bradford Hill. In: Magnello, Hardy, eds. The road to medical statistics. Amsterdam: Rodopi, 2002
5. Chalmers I. Assembling comparison groups to assess the effects of health care. J R Soc Med 1997;90:379–86
6. Chalmers I. Why transition from alternation to randomisation in clinical trials was made. BMJ 1999;319:1372
7. Chalmers I. Comparing like with like: some historical milestones in the evolution of methods to create unbiased comparison groups in therapeutic experiments. Int J Epidemiol 2001;30:1156–64
8. Edwards. Control and the therapeutic trial, 1918–1948. MD thesis. London: University of London, 2004
9. Chalmers I. Statistical theory was not the reason that randomisation was used in the British Medical Research Council's clinical trial of streptomycin for pulmonary tuberculosis. In: Jorland G, Opinel A, Weisz G, eds. Body counts: medical quantification in historical and sociological perspectives. Montreal: McGill-Queen's University Press, 2005:309–34
10. Edwards. Control and the therapeutic trial. Amsterdam: Rodopi, 2006
11. Chalmers I. Explaining the unbiased creation of treatment comparison groups. Lancet 2009;374:1670–1
12. Petrarca. Rerum Senilium Libri. Liber XIV, Epistola 1. Letter to Boccaccio (V.3). 1364
13. Massey. A short and plain account of inoculation. With some remarks on the main argument made use of to recommend that practice, by Mr. Maitland and others. To which is added, a letter to the learned James Jurin, M.D.R.S. Secr. Col. Reg. Med. Lond. Soc. In answer to his letter to the learned Dr. Cotesworth, and his comparison between the mortality of natural and inoculated small pox. The second edition. London: W Meadows, 1723
14. Lind. A treatise of the scurvy. In three parts. Containing an inquiry into the nature, causes and cure, of that disease. Together with a critical and chronological view of what has been published on the subject. Edinburgh: A Kincaid and A Donaldson, 1753
15. Silverman, Chalmers I. Casting and drawing lots: a time-honoured way of dealing with uncertainty and for ensuring fairness. The James Lind Library 2002. See www.jameslindlibrary.org
16. Hamilton. Dissertatio medica inauguralis de synocho castrensi (Inaugural medical dissertation on camp fever). Edinburgh: J Ballantyne, 1816
17. Milne, Chalmers I. Hamilton's report of a controlled trial of bloodletting, 1816. The James Lind Library 2002. See www.jameslindlibrary.org
18. Balfour. Quoted in West C. Lectures on the diseases of infancy and childhood. London: Longman, Brown, Green and Longmans, 1854
19. Chalmers I, Toth. 19th century controlled trials to test whether belladonna prevents scarlet fever. The James Lind Library 2009. See www.jameslindlibrary.org
20. Bullowa JGM. The use of antipneumococcic refined serum in lobar pneumonia: data necessary for a comparison between cases treated with serum and cases not so treated, and the importance of a significant control series of cases. JAMA 1928;90:1354–8
21. Choksy KBNH. On recent progress in serum-therapy of plague. BMJ 1908;1:1282–4
22. Cecil, Plummer. Pneumococcus Type I pneumonia - a study of eleven hundred and sixty-one cases, with especial reference to specific therapy. JAMA 1930;95:1547–53
23. Cousin. Des éruptions consecutives aux injections de sérum antidiphthérique et de leur traitement prophylactique par l'ingestion de clorure de calcium. Thèse pour le Doctorat en médicine. Paris: Jules Rousset, 1905
24. Netter. Efficacité de l'ingestion de chlorure de calcium comme moyen préventif des éruptions consecutives aux injections de sérum. Séances et Mémoires de la Société de Biologie 1906;58:279–80
25. Wagner-Jauregg. Ueber die Infektionsbehandlung der progressiven Paralyse [On infection treatment of progressive paralysis]. Münchener Medizinische Wochenschrift 1931;78:4–7
26. Fisher. The design of experiments. Edinburgh: Oliver and Boyd, 1935
27. Armitage. Randomisation and alternation: a note on Diehl et al. The James Lind Library 2002. See www.jameslindlibrary.org
28. Hill. Principles of medical statistics. London: Lancet, 1937
29. Medical Research Council Therapeutic Trials Committee. The serum treatment of lobar pneumonia. BMJ 1934;1:241–5
30. Medical Research Council. Clinical trial of patulin in the common cold. Lancet 1944;2:373–5
31. Clarke. The 1944 patulin trial of the British Medical Research Council: an example of how concerted common purpose can get reliable answers to important questions very quickly. The James Lind Library 2004. See www.jameslindlibrary.org
32. Chalmers I, Clarke. The 1944 Patulin Trial: the first properly controlled multicentre trial conducted under the aegis of the British Medical Research Council. Int J Epidemiol 2004;32:253–60
33. Medical Research Council. Streptomycin treatment of pulmonary tuberculosis: a Medical Research Council investigation. BMJ 1948;2:769–82
34. Doll. The controlled trial. Postgrad Med J 1984;60:719–24
35. Doll. Darwin Lecture. Development of therapeutic trials in preventive and therapeutic medicine. J Biosoc Sci 1991;23:365–78
36. Podolsky. Pneumonia before antibiotics: therapeutic evolution and evaluation in Twentieth-Century America. Baltimore, MD: Johns Hopkins University Press, 2006
37. Podolsky. Jesse Bullowa, specific treatment for pneumonia, and the development of the controlled clinical trial. The James Lind Library 2008. See www.jameslindlibrary.org
38. Doll. The role of data monitoring committees. In: Duley, Farrell, eds. Clinical Trials. London: BMJ Books, 2002:97–104
39. Cox. Randomization for concealment. The James Lind Library 2009. See www.jameslindlibrary.org

Why the 1948 MRC trial of streptomycin used treatment allocation based on random numbers
