The advent of fair treatment allocation schedules in clinical trials during the 19th and early 20th centuries

Abstract

The detailed and exceptionally clear 1948 report of the British Medical Research Council's randomized trial of streptomycin for pulmonary tuberculosis is rightly regarded as a landmark in the history of clinical trials.¹ Of crucial importance, it describes how a treatment allocation schedule (based on random number tables) was concealed, thus preventing foreknowledge of allocations among those making decisions about patient participation.^2,3

Although the report of the streptomycin trial is rightly iconic, the attention it has attracted has led many historians to overlook earlier evidence relevant to the evolution of unbiased prospective allocation of patients to treatment comparison groups. This has led some of them to assume that random allocation to treatment comparison groups reflected the development of statistical theory by RA Fisher.^3,4 In fact, for half a century before the MRC trial and Fisher's writings, some medical practitioners wishing to evaluate the effects of treatments had used alternate allocation to assemble similar groups of patients, and so ensure that like would be compared with like. And these developments reflected an even earlier history during which some clinicians and others began to conceptualize what was needed for tests of treatments to be fair.^5–7

Appreciation of the need to compare like with like

More than a millennium ago, some clinicians appreciated that comparisons are needed to arrive at causal inferences about the effects of medical treatments. In the 9th century CE, the Persian physician Al-Razi (Rhazes) explained why he recommended that bloodletting be used to treat the symptoms of meningitis:

‘…I once saved one group [of patients] by it, while I intentionally neglected [to bleed] another group. By doing that, I wished to reach a conclusion.’^8,9

Other people recognized centuries ago that, if treatment comparisons were going to be fair, like must be compared with like. Francesco Petrarch, in a letter to a fellow poet, wrote in 1364:
‘I solemnly affirm and believe, if a hundred or a thousand men of the same age, same temperament and habits, together with the same surroundings, were attacked at the same time by the same disease, that if one half followed the prescriptions of the doctors of the variety of those practising at the present day, and that the other half took no medicine but relied on Nature's instincts, I have no doubt as to which half would escape.’¹⁰ [emphasis added] (One assumes that the poet predicted that the reputation of the medical profession would not be enhanced by the fair comparison he was proposing!)
The writings of several medical researchers in the 18th century make clear that some of them appreciated the importance of comparing like with like in treatment comparisons.⁵ Isaac Massey, for example, challenging claims that inoculation was associated with much lower mortality than natural smallpox, observed that:
‘…to form a just comparison and calculate right in this case, the circumstances of the patients, must and ought to be as near as may be on a par.’¹¹ [emphasis added]
And James Lind, in his account of a comparison of six different treatments for scurvy, was careful to note that factors other than the treatments were similar in the patients in his comparison groups:
‘…I took twelve patients in the scurvy… Their cases were as similar as I could have them. They all in general had putrid gums, the spots and lassitude, with weakness of their knees. They lay together in one place, being a proper apartment for the sick in the fore-hold; and had one diet common to all.’¹² [emphasis added]

Introduction of methods to ensure that like will be compared with like

Methods to ensure that like will be compared with like in fair treatment comparisons were proposed at least as early as the 17th century. Reflecting a time-honoured device for ensuring fairness,¹³ Van Helmont¹⁴ and Starkey¹⁵ proposed casting lots to decide which patients should be assigned to orthodox physicians (to be bled and purged), and which to their own, alternative treatments. A century later, Anton Mesmer challenged his orthodox physician detractors to cast lots to decide which patients should be treated by them, and which by him, using ‘animal magnetism’:
In order to avoid any later argument and all the questions that could be raised about differences in age, in temperament, in diseases, in their symptoms etc. the assignment of the patients shall be made by the method of lots. ¹⁶ [emphasis added]
Casting lots is just one of several potentially unbiased methods that can be used to ensure that like will be compared with like in treatment comparisons. Alternation (or rotation) of successive patients to different treatments is an easily understood way of generating patient groups for fair treatment comparisons. As long as the underlying order of the patients' presentation has not been predetermined in some way that introduces bias, strict alternation ensures that no conscious or unconscious bias results in patients with better or worse prognoses being allocated to one of the treatment comparison groups. Other methods that have been used for the same purpose include allocation by patients' dates of birth, or by the terminal digits of their case record numbers.
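
The allocation rules described above can be sketched in a few lines of Python. This is an illustration only, not a reconstruction of any historical procedure: the function names, the arm labels and the even/odd convention for terminal digits are assumptions made for the example, and a pseudo-random draw stands in for casting lots.

```python
import random
from collections import Counter


def allocate_by_rotation(n_patients, arms):
    """Strict alternation (two arms) or rotation (more arms), in order of arrival."""
    return [arms[i % len(arms)] for i in range(n_patients)]


def allocate_by_terminal_digit(case_numbers, arms):
    """Two-arm allocation by the last digit of the case record number.

    Assumed convention: even digit -> arms[0], odd digit -> arms[1].
    """
    if len(arms) != 2:
        raise ValueError("this sketch handles two arms only")
    return [arms[(n % 10) % 2] for n in case_numbers]


def allocate_by_lots(n_patients, arms, seed=None):
    """Stand-in for casting lots: one independent random draw per patient."""
    rng = random.Random(seed)
    return [rng.choice(arms) for _ in range(n_patients)]


# Rotating 366 admissions among three (hypothetically labelled) surgeons
# gives each of them exactly one third of the whole.
surgeons = ["surgeon_1", "surgeon_2", "surgeon_3"]
rotation_counts = Counter(allocate_by_rotation(366, surgeons))
print(rotation_counts)
```

Rotation and terminal-digit rules are deterministic, so group sizes stay balanced, but anyone who knows the rule can work out the next allocation; the random draw is not predictable in that way.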

Some accounts of the use of unbiased treatment allocation appear early in the 19th century. In his 1816 Edinburgh doctoral thesis, Alexander Lesassier Hamilton reports having used rotation to allocate sick soldiers to different treatments at a base hospital in Elvas during the Peninsular War.^17,18 Patients were allocated either to his care; or to the care of a surgeon colleague who, like him, did not use bloodletting; or to a surgeon colleague who did use bleeding.
It had been so arranged, that this number [366] was admitted, alternately, in such a manner that each of us had one third of the whole. The sick were indiscriminately received, and were attended as nearly as possible with the same care and accommodated with the same comforts. One third of the whole were soldiers of the 61st Regiment, the remainder of my own (the 42nd) Regiment. Neither Mr Anderson nor I ever once employed the lancet. He lost two, I four cases; whilst out of the other third [treated with bloodletting by the third surgeon] thirty five patients died. ¹⁷ [emphasis added]
In 1835, a Society of Truth-loving Men in Nürnberg reported its remarkable blinded comparison of homeopathic provings with ‘snow water’. Vials containing one or other of the two substances were shuffled prior to distribution for assessment.^19,20 A few years later, Thomas Graham Balfour, an army surgeon in charge of an orphanage, was explicit about his rationale for using alternate allocation in his assessment of claims that belladonna was protective against scarlet fever. He reported having used alternation to allocate children either to receive belladonna or to a comparison group ‘to avoid the imputation of selection’. ^21,22

It seems reasonable to speculate that concern to compare like with like, and so to ‘avoid the imputation of selection’, explains the increasing use of alternate allocation to treatment comparison groups during the late 19th and early 20th centuries (in animals²³ as well as in humans). Writers in several countries emphasized the need to compare like with like. These included, for example, Jules Gavarret in France,^24,25 Elisha Bartlett in the USA,^25,26 William Guy in Britain,²⁷ and Alfred Ephraim in Germany (1890–1894).²⁸ A quotation from an 1877 Danish doctoral thesis on tracheotomy for diphtheria gives a flavour of the developing thinking about the grounds for causal inferences about the effects of treatments:
‘If any surgeon with material as large as chief physician Holmer could really take the decision, as a test, to let every second croup patient (with an indication for tracheotomy) remain without the operation and every second undergo the operation, and it turned out that the proportion of unoperated [patients who] recovered was equal to or higher than those operated [on], then one could begin to doubt the value of tracheotomy…’ ²⁹ [emphasis added]
The James Lind Library currently contains well over 200 reports of the use of such potentially unbiased methods of prospective allocation in treatment comparisons published before 1948, when the Medical Research Council's trial of streptomycin was published.¹ The earlier reports we have identified are listed at http://www.jameslindlibrary.org/context/allocation-bias.

During the early decades of the 20th century, alternate allocation became increasingly common as a feature of research design, and was designated formally using specific terms in several languages. In 1902, in an article published in Muenchener Medizinische Wochenschrift referring to alternate allocation trials on treatments for plague in India, Dr G Polverini of the Institute of Experimental Pathology in Florence deemed ‘die alternative Methode’ the most appropriate ‘for assessing the healing power of a serum in humans’.³⁰ Six years later, one of the physicians responsible for the trials in India – Nasserwanji Hormusji Choksy – referred to the method they had been using as ‘the alternate case method’ and ‘rational alternation’.³¹ In France at about the same time, Maurice Cousin³² and his thesis supervisor Arnold Netter³³ referred to their use of ‘la méthode alternante’ in studies to assess ways of reducing serum sickness. In the USA, Jesse Bullowa³⁴ and Russell Cecil and Norman Plummer³⁵ referred to ‘alternation’ and to ‘the alternate case method’, respectively, in connection with their trials to assess the effects of serum treatment in pneumonia. And in Austria, Julius Wagner-Jauregg decided to ‘baptize’ the method ‘Simultanmethode’ in German after applying it in studies using fever to treat syphilis.³⁶

It is worth noting that this designation of alternation as a methodological principle by clinician researchers antedated Ronald Fisher's promotion of the theoretical statistical qualities of random allocation in The Design of Experiments.³⁷ Indeed, although there are examples of random allocation being used during the 1930s and early 1940s (see, for example, Doull;³⁸ Theobald;³⁹ Bell⁴⁰), use of the word ‘random’ to describe treatment allocation sometimes actually referred to alternation,⁴¹ even in the writings of Austin Bradford Hill, the statistician most closely associated with the adoption of randomization in Britain.^42,2,3

Where was alternate allocation used, in whom, and to test which interventions?

Pre-1948, alternate allocation trials were done across the world. To date, we have found examples in Algeria, Austria, Australia, Britain, Denmark, Egypt, Finland, France, Germany, India, Italy, Malaya, Netherlands, Sudan, the USA, and Vietnam. Among these, a few programmes of alternate allocation trials stand out. Those done in India by Waldemar Haffkine and Nasserwanji Hormusji Choksy at the turn of the century on vaccines and treatments for plague and cholera are early examples of separate studies done within a series of planned controlled trials.^43–46 In the USA (and in New York and Boston in particular), Jesse Bullowa, William Park, Russell Cecil, Max Finland and others were responsible for a remarkable series of trials testing serum treatment for pneumonia during the third and fourth decades of the 20th century.⁴⁷ The only example of anything comparable in Britain appears to have been a cluster of trials done by Thomas Anderson and his colleagues at Ruchill Hospital in Glasgow in the late 1930s, to assess the effects of sulphonamides in a variety of infections.⁴⁸

Unsurprisingly, given the overwhelming importance of infectious diseases at the time, many alternate allocation trials were done to assess the effects of interventions to prevent or treat infections. The target infections included bacillary dysentery, cerebrospinal fever, cholera, the common cold, diphtheria, erysipelas, gonorrhoea, impetigo, infant diarrhoea, infectious hepatitis, influenza, malaria, mastitis, measles, meningococcal meningitis, plague, pneumonia, poliomyelitis, puerperal fever, scarlet fever, syphilis, tonsillitis, trichomoniasis, Tsutsugamushi disease, tuberculosis, typhoid fever, typhus, and whooping-cough. The interventions tested included antibiotics, antiseptics, diet, Eucalyptus oil, gamma globulin, physical therapies, proteins and amino acids, specific sera, sulphonamides and other drugs, ‘therapeutic malaria’, vaccines, and vitamins.

Alternate allocation trials were also used to assess the effects of nutritional and other interventions to promote health and growth: unpolished and polished rice for beri-beri; germinated beans compared with lemon juice for scurvy; vitamin B1 for polyneuritis in alcohol addicts; and vitamins, minerals, milk and ultraviolet light to promote child growth and development. In pregnancy and childbirth, alternate allocation was used in studies to assess the effects of micronutrients to prevent anaemia and toxaemia; salt for leg cramps; analgesics for pain in labour; perineal shaving and postpartum care of the perineum; ergot alkaloids to reduce postpartum haemorrhage; treatments for acute mastitis and deficient lactation and for preventing sore nipples; and the effects of knee-chest position and postural exercises on postpartum uterine retroversion.

‘The alternate case method’ was also used to challenge claims that surgery was an effective treatment for psychosis, and to put some ‘old wives' treatments’ to the test: a Dr Middleton in Edinburgh reported that he had alternated tannic acid with ‘strong tea of the lumberjack variety’⁴⁹ for treating scalds in children, with results suggesting that the preferences of ‘old wives’ were as likely to be valid as those of medical experts.

More research is needed to increase understanding of the reasons for the explosion of alternate allocation studies from the 1890s onwards. One explanation may have been the gradual adoption of probabilistic, statistical thinking by some physicians.^24,25,28,50 However, even Almroth Wright, who made a career out of dismissing the application of statistics to medicine in the early part of the 20th century, had started doing alternate allocation studies by the early 1910s.⁵¹

What is evident is that, at least as early as the second decade of the 20th century, there were some very clear accounts of the principles that need to be observed when testing treatments. For example, in a paper entitled ‘The crucial test of therapeutic evidence’, which was based on an address given at the 1917 annual meeting of the American Medical Association, Torald Sollmann alluded to the unacceptability of biased under-reporting of commercial tests of drugs, and called for independent evaluations, using alternation to control allocation bias and blinding to reduce observer bias.⁵² A study published by Adolf Bingel the following year provides a nice example of these two principles being applied in practice.^53–55

The gradual move from alternation to random allocation

It is clear that, contrary to a common assumption,³ randomized trials did not suddenly fill a methodological vacuum beginning in 1948. Long before the concept of random allocation was introduced by statisticians, some doctors who wanted to compare preventive and therapeutic strategies recognized that comparison groups generated by alternate allocation would yield more credible evidence than comparison groups based on clinical decisions. There is some evidence of statistical expertise being brought to bear in a few of these early trials. For example, in 1912, a formal statistical test was applied to data from one of Choksy's many plague studies.⁵⁶ And during the 1920s, Louis Dublin, an actuary at the Metropolitan Life Insurance Company, seems likely to have been influential in the design and analysis of a series of methodologically sophisticated alternate allocation studies done to evaluate the effects of serum therapy for pneumonia.^47,57

So what led to the gradual move away from alternation to random allocation? The principal disadvantage of alternate allocation is that it usually means that those making decisions about who will participate in treatment comparisons have foreknowledge of upcoming allocations, and this sometimes leads them to undermine an allocation schedule that, in principle, should be unbiased.

In 1933, when assessing the reasons for baseline imbalances in a Medical Research Council trial of serum treatment for pneumonia,⁵⁸ Austin Bradford Hill learned how alternation could be subverted by those recruiting patients.⁵⁹ A dozen years later, Bradford Hill was one of the three-man team designing the MRC's randomized trial of streptomycin. One of the others was Philip D'Arcy Hart. In a trial that D'Arcy Hart had designed for the Medical Research Council in 1943, allocation had been by rotation to one of four groups – two antibiotic, and two placebo – with the specific purpose of preventing foreknowledge of treatment allocations.^60,61 Although one of the reasons that the streptomycin trial has become iconic is that the treatment allocation schedule was based on random number tables,¹ this was not for any esoteric statistical reason.⁶² It was because successful concealment of allocation schedules and prevention of foreknowledge of upcoming allocations among clinicians entering patients in trials is more likely to be achieved with allocation schedules based on random numbers than with schedules using alternation.^2,3
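
The difference in foreknowledge between the two kinds of schedule can be shown with a small Python sketch. It models predictability only, under assumptions made for the example: a recruiter who knows the previous allocation and guesses ‘the opposite next time’, and a pseudo-random generator standing in for random number tables. It does not model the concealment procedures used in the streptomycin trial, and all names are hypothetical.

```python
import random

ARMS = ("treatment", "control")


def alternation_schedule(n_patients):
    """Strict alternation: treatment, control, treatment, ..."""
    return [ARMS[i % 2] for i in range(n_patients)]


def random_schedule(n_patients, seed=None):
    """Each allocation drawn independently with equal probability."""
    rng = random.Random(seed)
    return [rng.choice(ARMS) for _ in range(n_patients)]


def prediction_accuracy(schedule):
    """Share of allocations (from the second onward) that a recruiter
    predicts correctly by guessing 'the opposite of the previous one'."""
    pairs = list(zip(schedule, schedule[1:]))
    hits = sum(1 for previous, upcoming in pairs if upcoming != previous)
    return hits / len(pairs)


print(prediction_accuracy(alternation_schedule(100)))        # 1.0
print(prediction_accuracy(random_schedule(1000, seed=1)))    # close to one half
```

Under alternation the guess is always right, so a recruiter can decide whether to enter a patient knowing what that patient will receive. With a random schedule the same guess is right only about half the time, which is no better than chance.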

The need to fill gaps in the history of controlled trials

Over most of the past two decades, our identification of pre-1948 reports of controlled trials using potentially unbiased treatment allocation schedules has been ‘opportunistic’. More recently, we have been able to use full text digital searches of the British Medical Journal, the Lancet, the Journal of the American Medical Association, the New England Journal of Medicine and the Proceedings of the Royal Society of Medicine, from the inceptions of the journals to 1947. In addition, a hand search of the Indian Medical Gazette from 1890 to 1910 was prompted by some of the important information about trials done in India at the turn of the 20th century. Table 1 (below) provides a summary of our findings as they stand currently.

Table 1
Pre-1948 reports of controlled trials using potentially unbiased treatment allocation schedules

Journal      Pre-1900   1900–1929   1930–1939   1940–1947   Total
BMJ                 5           8          23          21      57
JAMA                2          18          13          16      49
Lancet              2           8          11          21      42
NEJM                1           0           6           0       7
Proc RSM            1           3           3           1       8
Elsewhere          15          18          26          18      77
Total              26          55          82          77     240

The methods we have used to identify pre-1948 reports of controlled trials using potentially unbiased treatment allocation schedules are adequate to illustrate the use of this important element of trial design before the widespread adoption of randomization from the late 1940s onwards. However, the numbers in the Table are certainly minimum estimates of numerators, and they lack denominators to allow some estimate of the proportion of all articles on treatment evaluation which have had this feature of trial design. We invite readers to draw our attention to any other pre-1948 reports of trials using potentially unbiased treatment allocation schedules which are not currently included at http://www.jameslindlibrary.org/context/allocation-bias.
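
As a simple arithmetic check, the row and column totals in Table 1 can be recomputed from the individual cells. The sketch below copies the counts from the table; the variable names are chosen only for this illustration.

```python
# Counts per journal for the periods Pre-1900, 1900-1929, 1930-1939, 1940-1947,
# copied from Table 1.
counts = {
    "BMJ": [5, 8, 23, 21],
    "JAMA": [2, 18, 13, 16],
    "Lancet": [2, 8, 11, 21],
    "NEJM": [1, 0, 6, 0],
    "Proc RSM": [1, 3, 3, 1],
    "Elsewhere": [15, 18, 26, 18],
}

# Total per journal (table rows) and per period (table columns).
row_totals = {journal: sum(cells) for journal, cells in counts.items()}
column_totals = [sum(column) for column in zip(*counts.values())]
grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)   # [26, 55, 82, 77]
print(grand_total)     # 240
```

The recomputed totals agree with those printed in the table. They remain counts of identified reports only, without denominators.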

Medical historians have not given adequate attention to the use of unbiased treatment allocation before random allocation began to be adopted more widely from the middle of the 20th century onwards. Some relevant material exists in doctoral theses of which we are aware, but most of this relates to developments in Britain (http://www.jameslindlibrary.org/trial_records/authored/theses.html). As is clear from the illustrative material we have assembled, developments were occurring concurrently in a number of countries, and being reported in a number of different languages. To avoid being parochial, research into this important era in the evolution of clinical trials requires knowledge in several languages, and international collaboration.⁵⁵

We have provided some tantalizing examples of relevant material published in Danish, French and German. Research funders and researchers in the countries where these languages are used need to recognize how important it is that they contribute to the investigation of an era of fundamental importance in the international development of fair tests of treatments. We hope that our findings will prompt interest in and support for research to document and understand the efforts made to develop reliable tests of treatments in a number of countries during the first half of the 20th century.

DECLARATIONS

Competing interests

None declared

Funding

None

Ethical approval

Not applicable

Guarantor

Iain Chalmers

Contributorship

All authors contributed to searches of the literature for eligible reports and preparation of the manuscript

Acknowledgements

We dedicate this article to the memory of Harry Marks, a generous adviser to the James Lind Library, and a leading and inspiring historian of the development of the randomized clinical trial, who died in 2011. We thank Ulrich Tröhler and Christian Gluud for translating material published in French, German and Danish; Rosie Wild and Jane Ferrie for independent hand searches of the Indian Medical Gazette; Patricia Atkinson, Rebecca Brice, and Olivia Clarke for clerical help; and Doug Altman, Mike Clarke, Christian Gluud, Iain Milne and Ulrich Tröhler for helpful comments on earlier drafts. Additional material for this article is available from the James Lind Library website (http://www.jameslindlibrary.org), where it was originally published.

References

1. Medical Research Council. Streptomycin treatment of pulmonary tuberculosis. BMJ 1948;2:769–82
2. Chalmers. Statistical theory was not the reason that randomisation was used in the British Medical Research Council's clinical trial of streptomycin for pulmonary tuberculosis. In: Jorland, Opinel, Weisz, eds. Body counts: medical quantification in historical and sociological perspectives. Montreal: McGill-Queens University Press, 2005:309–34
3. Chalmers. Why the 1948 MRC trial of streptomycin used treatment allocation based on random numbers. JLL Bulletin: Commentaries on the history of treatment evaluation, 2010 (www.jameslindlibrary.org)
4. Cox. Randomization for concealment. JLL Bulletin: Commentaries on the history of treatment evaluation, 2009 (www.jameslindlibrary.org)
5. Tröhler. The introduction of numerical methods to assess the effects of medical interventions during the 18th century: a brief history. JLL Bulletin: Commentaries on the history of treatment evaluation, 2010a (www.jameslindlibrary.org)
6. Kaptchuk. A brief history of the evolution of methods to control of observer biases in tests of treatments. JLL Bulletin: Commentaries on the history of treatment evaluation, 2011 (www.jameslindlibrary.org)
7. Huth. Jules Gavarret's Principes Généraux de Statistique Médicale: a pioneering text on the statistical analysis of the results of treatments. JLL Bulletin: Commentaries on the history of treatment evaluation, 2006a (www.jameslindlibrary.org)
8. Tibi. Al-Razi and Islamic medicine in the 9th Century. JLL Bulletin: Commentaries on the history of treatment evaluation, 2005 (www.jameslindlibrary.org)
10. Petrarch. Rerum Senilium Libri. Liber XIV, Epistola 1. Letter to Boccaccio (V.3), 1364. [Letters of old age.]
11. Massey. A short and plain account of inoculation. With some remarks on the main argument made use of to recommend that practice, by Mr. Maitland and others. To which is added, a letter to the learned James Jurin, M.D.R.S. Secr. Col. Reg. Med. Lond. Soc. London: W. Meadows, 1723
12. Lind. A treatise of the scurvy. In three parts. Containing an inquiry into the nature, causes and cure, of that disease. Together with a critical and chronological view of what has been published on the subject. Edinburgh: Printed by Sands, Murray and Cochran for A Kincaid and A Donaldson, 1753
13. Silverman, Chalmers. Casting and drawing lots: a time-honoured way of dealing with uncertainty and for ensuring fairness. JLL Bulletin: Commentaries on the history of treatment evaluation, 2002 (www.jameslindlibrary.org)
14. Van Helmont. Ortus medicinæ: Id est Initia physicæ inaudita. Progressus medicinae novus, in morborum ultionem, ad vitam longam. [The dawn of medicine: That is, the beginning of a new Physic. A new advance in medicine, a victory over disease, to [promote] a long life]. Amsterodami: Apud Ludovicum Elzevirium, 1648, 526–7
15. Starkey. Natures explication and Helmont's vindication…Or A short and sure way to a long and sound life. London: printed by E. Cotes for Thomas Alsop at the two Sugar-loaves over against St. Antholins Church at the lower end of Watling Street, 1658
16. Mesmer. Précis historique des faits relatifs au magnétisme animal jusques en avril 1781. Par M. Mesmer, Docteur en Médecine de la Faculté de Vienne. Ouvrage traduit de l'Allemand. [Historical account of facts relating to animal magnetism up to April 1781. By M. Mesmer, Doctor in Medicine of the Vienna Faculty. Work translated from German] A Londres [false imprint, probably Paris.], 1781;111–4; 182
17. Hamilton. Dissertatio Medica Inauguralis De Synocho Castrensi (Inaugural medical dissertation on camp fever). Edinburgh: J Ballantyne, 1816
18. Milne, Chalmers. Hamilton's report of a controlled trial of bloodletting, 1816. JLL Bulletin: Commentaries on the history of treatment evaluation, 2002 (www.jameslindlibrary.org)
19. Löhner, on behalf of a society of truth-loving men. Die homöopathischen Kochsalzversuche zu Nürnberg [The homeopathic salt trials in Nuremberg]. Nuremberg, 1835
20. Stolberg. Inventing the randomized double-blind trial: The Nuremberg salt test of 1835. JLL Bulletin: Commentaries on the history of treatment evaluation, 2006 (www.jameslindlibrary.org)
21. Balfour. Quoted in West C. Lectures on the Diseases of Infancy and Childhood. London, Longman, Brown, Green and Longmans, 1854, 600
22. Chalmers, Toth. 19th century controlled trials to test whether belladonna prevents scarlet fever. JLL Bulletin: Commentaries on the history of treatment evaluation, 2009 (www.jameslindlibrary.org)
23. Pasteur. Compte rendu sommaire des expériences faites à Pouilly-le-Fort, près Melun, sur la vaccination charbonneuse. Comptes rendus de l'Académie des Sciences 1881;92:1378–83
24. Gavarret LDJ. Principes généraux de statistique médicale: ou développement des règles qui doivent présider à son emploi. Paris: Bechet jeune & Labé, 1840
25. Bartlett. An essay on the philosophy of medical science. Philadelphia: Lea and Blanchard, 1844
26. Huth. Transatlantic ideas on the philosophy of therapeutics in the middle of the 19th century. JLL Bulletin: Commentaries on the history of treatment evaluation, 2006b (www.jameslindlibrary.org)
27. Guy. Croonian Lectures on the numerical method, and its application to the science and art of medicine. BMJ 1860;2:553–5
28. Ephraim. Über die Bedeutung der statistischen Methode für die Medicin. [On the relevance of the statistical method for medicine] Volkmann's Sammlung Klinische Vortraege N.F. Innere Medicin (1890–1894);24:706–16. Leipzig: Breitkopf and Härtel
29. Wanscher. Om Diphteritis og Croup - særligt med hensyn til Tracheostomien ved samme. [On diphtheria and croup – especially regarding tracheostomy in this condition]. Disputats [Thesis]. Jacob Lund: Kjøbenhavn, 1877:67–8
30. Polverini. Serumtherapie gegen Beulenpest [Serum treatment of bubonic plague]. Muenchener Med. Wochenschrift 1903;50:649–51
31. Choksy KBNH. On recent progress in serum-therapy of plague. BMJ 1908;1:1282–4
32. Cousin. Des éruptions consécutives aux injections de sérum antidiphthérique et de leur traitement prophylactique par l'ingestion de chlorure de calcium. Thèse pour le Doctorat en médecine. Paris: Jules Rousset, 1905:36–44
33. Netter. Efficacité de l'ingestion de chlorure de calcium comme moyen préventif des éruptions consécutives aux injections de sérum. Séances et Mémoires de la Société de Biologie 1906;58:279–80
34. Bullowa JGM. The use of antipneumococcic refined serum in lobar pneumonia: data necessary for a comparison between cases treated with serum and cases not so treated, and the importance of a significant control series of cases. JAMA 1928;90:1354–58
35. Cecil, Plummer. Pneumococcus Type I pneumonia – a study of eleven hundred and sixty-one cases, with especial reference to specific therapy. JAMA 1930;95:1547–53
36. Wagner-Jauregg. Ueber die Infektionsbehandlung der progressiven Paralyse [On infection treatment of progressive paralysis]. Münchener Medizinische Wochenschrift 1931;78:4–7
37. Fisher. The design of experiments. Edinburgh: Oliver and Boyd, 1935
38. Doull, Hardy, Clark, Herman. The effect of irradiation with ultra-violet light on the frequency of attacks of upper respiratory disease (common colds). American Journal of Hygiene 1931;13:460–77
39. Theobald. Effect of calcium and vitamin A and D on incidence of pregnancy toxaemia. Lancet 1937;2:1397–99
40. Bell. Pertussis prophylaxis with two doses of alum-precipitated vaccine. Public Health Reports 1941;56:1535–46
41. Armitage. Randomisation and alternation: a note on Diehl et al. JLL Bulletin: Commentaries on the history of treatment evaluation, 2002 (www.jameslindlibrary.org)
42. Hill. Principles of medical statistics. London: Lancet, 1937
43. Ramanna. Commentary: NH Choksy and serum therapy. Int J Epidemiol, 2011
44. Syed, Swaminathan. Commentary: Dr. Choksy's dilemma. Int J Epidemiol, 2011 doi:10.1093/ije/dyr168
45. Chakrabarti. Commentary: An experimental theatre for vaccines: Bombay in the time of plague. Int J Epidemiol, 2011
46. Davey-Smith. Int J Epidemiol, 2011
47. Podolsky. Jesse Bullowa, specific treatment for pneumonia, and the development of the controlled clinical trial. JLL Bulletin: Commentaries on the history of treatment evaluation, 2008 (www.jameslindlibrary.org)
48. Bryder. The Medical Research Council and clinical trial methodologies before the 1940s: the failure to develop a ‘scientific’ approach. JLL Bulletin: Commentaries on the history of treatment evaluation, 2010 (www.jameslindlibrary.org)
49. Middleton. Tea for burns or scalds in the home. BMJ 1936;1:555
50. Heiberg. Studier over den statistiske undersøgelsesmetode som hjælpemiddel ved terapeutiske undersøgelser [Studies on the statistical study design as an aid in therapeutic trials]. Bibliotek for Læger 1897;89:1–40
51. Wright, Morgan, Colebrook, Dodgson. Observations on prophylactic inoculation against pneumococcus infections, and on the results which have been achieved by it. Lancet 1914;1:87–95
52. Sollmann. The crucial test of therapeutic evidence. JAMA 1917;69:198–9
53. Bingel. Über Behandlung der Diphtherie mit gewöhnlichem Pferdeserum. Deutsches Archiv für Klinische Medizin 1918;125:284–332
54. Tröhler. Adolf Bingel's blinded, controlled comparison of different anti-diphtheritic sera in 1918. JLL Bulletin: Commentaries on the history of treatment evaluation, 2010b (www.jameslindlibrary.org)
55. Opinel, Tröhler, Gluud, Gachelin, Davey Smith, Podolsky, Chalmers. The evolution of methods to assess the effects of treatments, illustrated by the development of treatments for diphtheria, 1825–1918. Int J Epidemiol, 2011 doi:10.1093/ije/dyr162
56. Advisory Committee on Plague Investigations in India. The serum treatment of human plague. Journal of Hygiene. Plague Supplement II, 1912;LVI:326–39
57. Podolsky. Pneumonia before antibiotics: therapeutic evolution and evaluation in twentieth-century America. Baltimore: Johns Hopkins University Press, 2006
58. Medical Research Council Therapeutic Trials Committee. The serum treatment of lobar pneumonia. BMJ 1934;1:241–5
59. Hill. Serum treatment of pneumonia, 22 December 1933, cited in Austoker, Bryder, eds. Historical perspectives on the role of the MRC. Oxford: Oxford University Press 1989, 1933:46–7
60. Medical Research Council. Clinical trial of patulin in the common cold. Lancet 1944;2:373–5
61. Chalmers, Clarke. The 1944 Patulin Trial: the first properly controlled multicentre trial conducted under the aegis of the British Medical Research Council. International Journal of Epidemiology 2004;32:253–60
62. Doll. The role of data monitoring committees. In: Duley, Farrell, eds. Clinical Trials. London: BMJ Books, 2002:97–104