Lorraine Daston and Michael Otte, “Style in science”, Science in context, iv (1991), 231–3.
2.
See e.g. Mario Biagioli, “Etiquette, interdependence, and sociability in seventeenth-century science”, Critical inquiry, xxii (1996), 193–238; Marcos Cueto, “Andean biology in Peru: Scientific styles on the periphery”, Isis, lxxx (1989), 1989–58; idem, “Laboratory styles in Argentine physiology”, Isis, lxxxv (1994), 1994–46; Michael R. Dietrich, “On the mutability of genes and geneticists: The ‘Americanization’ of Richard Goldschmidt and Victor Jollos”, Perspectives on science, iv (1996), 1996–45; Jonathan Harwood, “National styles in science: Genetics in Germany and the United States between the world wars”, Isis, lxxviii (1987), 1987–414; idem, Styles of scientific thought: The German genetics community, 1900–1933 (Chicago, 1993); idem, “Are there national styles of scientific thought? Genetics in Germany, 1900–1933”, in Peter Weingart (ed.), Grenzüberschreitungen in der Wissenschaft (Baden-Baden, 1995), 31–53; Malcolm Nicolson, “National styles, divergent classifications: A comparative case study from the history of French and American plant ecology”, Knowledge and society: Studies in the sociology of science past and present, viii (1989), 139–86; Sylvan S. Schweber, “The historical temper regnant: Theoretical physics in the United States, 1920–1950”, Historical studies in the physical sciences, xvii (1986), 1986–98; and Reinhard Siegmund-Schultze, “National styles in mathematics between the world wars”, in Elena Ausejo and Mariano Hormigón (eds), Paradigms and mathematics (Madrid, 1996), 243–53.
3.
See e.g. John Henry, “National styles in science: A possible factor in scientific revolution?”, in David N. Livingstone and Charles W. J. Withers (eds), Geography and revolution (Chicago, 2005), 43–74; Marjorie Malley, “The discovery of atomic transmutation: Scientific styles and philosophies in France and Britain”, Isis, lxx (1979), 1979–23; and Nicholas Russell, “Independent discovery in biology: Investigating styles of scientific research”, Medical history, xxxvii (1993), 1993–41.
4.
Exceptions include Robert A. Nye, “The history of sexuality in context: National sexological traditions”, Science in context, iv (1991), 387–406; and Joan L. Richards, “Rigor and clarity: Foundations of mathematics in France and England, 1800–1840”, Science in context, iv (1991), 1991–319.
5.
Following social scientists' usage of ANOVA, I have emphasized the centrality of the F test. But it should be noted that some writers view ANOVA as a more individual, if systematic, approach to data analysis. See e.g. S. C. Pearce, “Analysis of variance”, in N. Balakrishnan et al. (eds), Encyclopedia of statistical sciences, 2nd edn (16 vols, New York, 2006), i, 133–41.
6.
Regardless of sample size, use of ANOVA presupposes that certain distribution assumptions are satisfied. On Fisher's development of small-sample theory, see e.g. Joan F. Box, R. A. Fisher: The life of a scientist (New York, 1978), 113–29; and Nancy S. Hall, “R. A. Fisher and his advocacy of randomization”, Journal of the history of biology, xl (2007), 2007–325.
7.
Carl J. Huberty and Chandler J. Pike, “On some history regarding statistical testing”, in Bruce Thompson (ed.), Advances in social science methodology (5 vols, Stamford, CT, 1999), v, 1–22, p. 5.
8.
Anthony J. Rucci and Ryan D. Tweney, “Analysis of variance and the ‘second discipline’ of scientific psychology: A historical account”, Psychological bulletin, lxxxvii (1980), 166–84. Some educators knew about ANOVA by the late 1920s, however. See e.g. Truman L. Kelley and Eugene Shen, “General statistical principles”, in Carl Murchison (ed.), The foundations of experimental psychology (New York, 1929), 832–54, pp. 850–2.
9.
Gerd Gigerenzer et al., The empire of chance: How probability changed science and everyday life (Cambridge, 1989), 106–8; and Carl J. Huberty, “Historical origins of statistical testing practices: The treatment of Fisher versus Neyman-Pearson views in textbooks”, Journal of experimental education, lxi (1993), 317–33.
10.
Gerd Gigerenzer and David J. Murray, Cognition as intuitive statistics (Hillsdale, NJ, 1987), 19–22; see also Raymond Hubbard, Rahul A. Parsa, and Michael R. Luthy, “The spread of statistical significance testing in psychology: The case of the Journal of Applied Psychology, 1917–1994”, Theory and psychology, vii (1997), 1997–54; and Raymond Hubbard and Patricia A. Ryan, “The historical growth of statistical significance testing in psychology — and its future prospects”, Educational and psychological measurement, lx (2000), 2000–81.
11.
See e.g. David Howie, Interpreting probability: Controversies and developments in the early twentieth century (Cambridge, 2002), chap. 6; and Stephen T. Ziliak and Deirdre N. McCloskey, The cult of statistical significance: How the standard error costs us jobs, justice, and lives (Ann Arbor, MI, 2008).
12.
Kurt Danziger, Constructing the subject: Historical origins of psychological research (Cambridge, 1990), 80–3, 121–6; idem, “Statistical method and the historical development of research practice in American psychology”, in Lorenz Krüger, Gerd Gigerenzer, and Mary S. Morgan (eds), The probabilistic revolution (2 vols, Cambridge, MA, 1987), ii, 35–47; and A. D. Lovie, “The analysis of variance in experimental psychology: 1934–1945”, British journal of mathematical and statistical psychology, xxxii (1979), 1979–78. See also Trudy Dehue, “Deception, efficiency, and random groups: Psychology and the gradual origination of random group design”, Isis, lxxxviii (1997), 1997–73.
13.
Danziger, Constructing the subject (ref. 12), 147–55; and idem, “Statistical method” (ref. 12). See also Lovie, op. cit. (ref. 12); and Rucci and Tweney, op. cit. (ref. 8).
14.
Psychologists made explicit analogies between agricultural and psychological terms in showcasing ANOVA. See Brent Baxter, “Problems in the planning of psychological experiments”, American journal of psychology, liv (1941), 270–80.
15.
Rucci and Tweney, op. cit. (ref. 8), have shown that, of prominent contributors to the development of inferential statistics in psychology before the end of the Second World War, 4 of 12 psychologists received their statistical training from an educator, but none of the three educators received his or her statistical training from a psychologist. The other six notable contributors were statisticians.
16.
Lee J. Cronbach, “The two disciplines of scientific psychology”, American psychologist, xii (1957), 671–84, p. 674. I shall refer to Cronbach's discipline of correlation as measurement in order to avoid identifying it with a particular statistical method (contra Cronbach's intention).
17.
Ibid., 677. See also Michael Cowles, Statistics in psychology: An historical perspective (Hillsdale, NJ, 1989), 31; and Rucci and Tweney, op. cit. (ref. 8), 172–3.
18.
James E. Allen, Jr, “E. F. Lindquist: Educational development and aptitude testing”, Education, xci (1970), 2–3.
19.
Julia J. Peterson, The Iowa Testing Programs: The first fifty years (Iowa City, 1983), 51–2; and Oscar K. Buros, “Fifty years in testing: Some reminiscences, criticisms, and suggestions”, Educational researcher, vi (1977), 1977–15, p. 11.
20.
See papers 5, 11, 15, and 21 in E. S. Pearson and John Wishart (eds), “Student's” collected papers (Cambridge, 1942).
21.
On the history of the control group in educational psychology, see e.g. Danziger, Constructing the subject (ref. 12), 113–15; Dehue, op. cit. (ref. 12); and Ian Hacking, “Telepathy: Origins of randomization in experimental design”, Isis, lxxix (1988), 427–51.
22.
For a detailed discussion of the procedure of pairing individual scores, see William A. McCall, How to experiment in education (New York, 1923), 45–62.
23.
The standard error of a difference between two uncorrelated means is given by the formula σM₁−M₂ = √(σ²M₁ + σ²M₂).
24.
When the probable error (0.6745 times the standard error) was used to evaluate the reliability of a difference between two uncorrelated means, a statistically significant difference was defined by a critical ratio of four or more.
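The two conventions above (the standard error of a difference between uncorrelated means, and the four-probable-errors criterion) can be sketched as follows. This is an illustrative reconstruction with invented function names and values, not any historical worker's computation:

```python
import math

def se_diff(se1, se2):
    """Standard error of a difference between two uncorrelated means."""
    return math.sqrt(se1 ** 2 + se2 ** 2)

def critical_ratio(mean_diff, se1, se2, use_probable_error=False):
    """Ratio of a mean difference to its standard (or probable) error."""
    denom = se_diff(se1, se2)
    if use_probable_error:
        denom *= 0.6745  # probable error = 0.6745 x standard error
    return mean_diff / denom

# A difference of 10 points with component standard errors of 3 and 4:
cr = critical_ratio(10, 3, 4, use_probable_error=True)
```

Since four probable errors equal roughly 2.7 standard errors, the interwar criterion of a critical ratio of four in probable-error units was stricter than the later two-standard-errors convention.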
25.
E. F. Lindquist, “The laboratory method in freshman English”, Ph.D. dissertation, State University of Iowa, 1927, 81–2; see also E. F. Lindquist and R. R. Foster, “On the determination of reliability in comparing the final mean scores of matched groups”, Journal of educational psychology, xx (1929), 1929–6.
26.
The derivation of the formula is presented in Samuel S. Wilks, “The standard error of the means of ‘matched’ samples”, Journal of educational psychology, xxii (1931), 205–8. The use of the formula is justified and extended in E. F. Lindquist, “The significance of a difference between the means of ‘matched’ groups”, Journal of educational psychology, xxii (1931), 1931–204; and idem, “A further note on the significance of a difference between the means of matched groups”, Journal of educational psychology, xxiv (1933), 1933–9.
27.
Of the 15 studies completed between 1930 and 1938 that used equivalent groups, 12 either applied Lindquist's or Wilks's formula or asserted that its application was unnecessary (because the results were significant under the more conservative test).
28.
Lindquist's first published reference to small-sample theory is his dismissal of the t test as inappropriate for use with matched pairs. See Lindquist, “A further note” (ref. 26).
29.
E. F. Lindquist, A first course in statistics: Their use and interpretation in education and psychology (Boston, 1938), 113 n.
30.
E. F. Lindquist, “The general import of recent developments in statistical inference”, in American Educational Research Association, Research on the foundations of American education (Washington, DC, 1939), 192–5, pp. 194–5.
31.
George D. Stoddard, Iowa Placement Examinations (University of Iowa, Studies in education, iii/2; Iowa City, 1925); and idem, “Iowa Placement Examinations”, School and society, xxiv (1926), 1926–16.
32.
E. F. Lindquist, “The Iowa Testing Programs: A retrospective view”, Education, xci (1970), 7–23, p. 8; and Peterson, op. cit. (ref. 19), 1–6.
33.
E. F. Lindquist, “Changing values in educational measurement”, Educational record, xvii (1936), 64–81, p. 71; and Peterson, op. cit. (ref. 19), 32.
34.
Peterson, op. cit. (ref. 19), 24, 50–1.
35.
In order to dispel misconceptions that fuelled criticism of testing, Lindquist explicated the functions and limitations of standardized tests, together with the advantages of the ITP. See Lindquist, op. cit. (ref. 33); idem, “Factors determining reliability of test norms”, Journal of educational psychology, xxi (1930), 512–26; idem, “Basic considerations”, Review of educational research, iii (1933), 1933–20; idem, “The technique of constructing tests”, Educational record, xv (1934), 1934–86; idem, “Cooperative achievement testing”, Journal of educational research, xxviii (1935), 1935–20; and idem, “Standardized achievement tests and their relation to curriculum content”, National elementary principal, xvi (1937), 1937–84. See also E. F. Lindquist and H. R. Anderson, “Achievement tests in the social studies”, Educational record, xiv (1933), 1933–256.
36.
A widespread criticism of educational tests was that they were constructed and validated on the basis of — and used to maintain — the status quo. See e.g. Buros, op. cit. (ref. 19), 12; John Carson, The measure of merit: Talents, intelligence, and inequality in the French and American republics, 1750–1940 (Princeton, 2007), 248–50; James P. Echols, “The rise of the evaluation movement, 1920–1942”, Ph.D. dissertation, Stanford University, 1973, 61–9, 150–1; Henry L. Minton, “Lewis M. Terman and mental testing: In search of the democratic ideal”, in Michael M. Sokal (ed.), Psychological testing and American society, 1890–1930 (New Brunswick, NJ, 1990), 95–112, pp. 102–4; and Thomas P. Weinland, “A history of the IQ in America, 1890–1941”, Ph.D. dissertation, Columbia University, 1970, 183–210.
37.
Echols, op. cit. (ref. 36), 148, 154.
38.
Paul D. Chapman, Schools as sorters: Lewis M. Terman, applied psychology, and the intelligence testing movement, 1890–1930 (New York, 1988), 28–9, 92–7.
39.
See e.g. Iowa Every-Pupil Program, Manual for administration and interpretation of 1936 Iowa Every-Pupil Tests of Basic Skills (Iowa City, 1936), 14; and idem, Manual for administration and interpretation of 1938 Iowa Every-Pupil Tests of Basic Skills (Iowa City, 1938), 21. Because participation in the ITP was voluntary, the tests were not standardized on the basis of random samples. Nevertheless, Lindquist maintained that the norms were representative of nationwide achievement because they were based on an exhaustive sampling of the population of Iowa schools. See Peterson, op. cit. (ref. 19), 52. Lindquist's contemporaries, however, questioned the tests' representativeness. See James R. Hobson, “Review of silent reading comprehension: Iowa Every-Pupil Tests of Basic Skills, Test A”; and J. Wayne Wrightstone, “Review of work-study skills: Iowa Every-Pupil Tests of Basic Skills, Test B, new edition”, in Oscar K. Buros (ed.), The third mental measurements yearbook (New Brunswick, NJ, 1949), 530, 571.
40.
E. F. Lindquist, “The gap between promise and fulfillment in ninth grade algebra”, School review, xlii (1934), 762–71.
41.
E. F. Lindquist, “Summary report of results of the Iowa Every-Pupil Intelligence Testing Program of January 29, 1934”, mimeographed (Iowa City, 1934).
42.
Ibid., 14.
43.
Peterson, op. cit. (ref. 19), 23. Beginning in 1966, percentile norms for levels of intelligence, as measured by the Lorge-Thorndike Intelligence Tests, were offered for the ITP's basic skills tests; ibid., 35, 233.
44.
Lindquist, “Factors determining reliability of test norms” (ref. 35), 515, italics original.
See e.g. Iowa Every-Pupil Program, Summary report of results for the 1936 Iowa Every-Pupil High School Testing Program (Iowa City, 1936).
49.
E. F. Lindquist, Statistical analysis in educational research (Boston, 1940), 72–5, 104–32.
50.
Ibid., 139–44.
51.
ANCOVA subtracts initial differences — the average within-group regression of posttest on pretest scores — from the variances of the F ratio in the proportion that the regression coefficient contributes to those variances. On the use of ANCOVA with designs involving replication in several schools, see Lindquist, op. cit. (ref. 49), 196–203.
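The adjustment described in this note can be approximated in a few lines. This is a schematic sketch under simplifying assumptions, not Lindquist's computing routine; in particular, the error degrees of freedom below are those of a one-way ANOVA on the adjusted scores rather than the N − k − 1 of a proper ANCOVA:

```python
def ancova_f(groups):
    """Approximate ANCOVA: remove the pooled within-group regression of
    posttest (y) on pretest (x), then compute a one-way ANOVA F on the
    adjusted scores.  `groups` is a list of lists of (x, y) pairs."""
    # Pooled within-group regression slope of posttest on pretest
    sxy = sxx = 0.0
    for g in groups:
        mx = sum(x for x, _ in g) / len(g)
        my = sum(y for _, y in g) / len(g)
        sxy += sum((x - mx) * (y - my) for x, y in g)
        sxx += sum((x - mx) ** 2 for x, _ in g)
    b = sxy / sxx
    # Adjust each posttest score for its pretest deviation from the grand mean
    n = sum(len(g) for g in groups)
    grand_x = sum(x for g in groups for x, _ in g) / n
    adj = [[y - b * (x - grand_x) for x, y in g] for g in groups]
    # One-way ANOVA on the adjusted scores
    grand = sum(y for g in adj for y in g) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in adj)
    ss_within = sum((y - sum(g) / len(g)) ** 2 for g in adj for y in g)
    k = len(adj)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

For instance, two groups with a common within-group slope but different posttest levels yield a very large F once pretest differences are removed.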
52.
On principles of test construction during the interwar period, see e.g. Herbert E. Hawkes, E. F. Lindquist, and C. R. Mann (eds), The construction and use of achievement examinations: A manual for secondary school teachers (Boston, 1936); Walter S. Monroe, An introduction to the theory of educational measurements (Boston, 1923), chaps. 4–7; Walter S. Monroe, James C. DeVoss, and Frederick J. Kelly, Educational tests and measurements (Boston, 1924), chap. 11; and G. M. Ruch and George D. Stoddard, Tests and measurements in high school instruction (New York, 1927), chaps. 17–20.
53.
Lindquist, op. cit. (ref. 49), 219–28.
54.
Lindquist was critical of regarding samples comprised of intact groups as simple random samples of individuals unless “the assumptions of homogeneity of the means and variances of the classes are strongly supported by a priori considerations as well as by the outcomes of statistical tests”. E. F. Lindquist, Design and analysis of experiments in psychology and education (New York, 1953), 75–6.
55.
Lindquist, op. cit. (ref. 49), 114, 133.
56.
Ibid., 110, 113. If the experiment is replicated within schools by random assignment of classes, one to each treatment — the class here being the unit of sampling — the analysis of variance will yield three components: ‘treatments’, ‘schools’, and the ‘treatments-by-schools’ interaction, the last of which serves as the error term in the F test; ibid., 104–14. If individuals are randomly assigned to classes in each school, the ANOVA will include a ‘within-classes’ component, which, in special circumstances, serves as the error term in the F test; ibid., 114–19.
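With one class mean per treatment-by-school cell, the decomposition just described reduces to a two-way table, and the treatments F uses the interaction mean square as its error term. A minimal sketch (the table and numbers are invented, not Lindquist's example):

```python
def treatments_by_schools_f(table):
    """F ratio for 'treatments' in a treatments x schools layout with one
    class mean per cell, using the treatments-by-schools interaction as
    the error term.  table[t][s] is the mean of the class that received
    treatment t in school s."""
    t, s = len(table), len(table[0])
    grand = sum(map(sum, table)) / (t * s)
    row = [sum(r) / s for r in table]                                  # treatment means
    col = [sum(table[i][j] for i in range(t)) / t for j in range(s)]   # school means
    ss_treat = s * sum((m - grand) ** 2 for m in row)
    ss_int = sum((table[i][j] - row[i] - col[j] + grand) ** 2
                 for i in range(t) for j in range(s))
    ms_treat = ss_treat / (t - 1)
    ms_int = ss_int / ((t - 1) * (s - 1))
    return ms_treat / ms_int

# Two treatments replicated in three schools:
f = treatments_by_schools_f([[1, 2, 3], [3, 4, 6]])  # = 49.0
```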
57.
The titles of dissertations were obtained from Sarah Edwards, Theses and dissertations presented in the graduate college of the State University of Iowa, 1900–1950 (Iowa City, 1952).
58.
The dominant source of data is unambiguous in all but three dissertations, whose analyses appear to be based equally on data from two sources. In these cases, the last source reported in the dissertation was designated as the dominant one.
59.
Leonard V. Koos, The questionnaire in education: A critique and manual (New York, 1928). Koos's survey category (i.e., the questionnaire method in combination with other methods) was eliminated because it described so few dissertations.
60.
Although behavioural scientists and statisticians today would demand that an experiment meet other criteria besides manipulation of an independent variable, this criterion is sufficient to distinguish between the studies I designate as experiment and measurement. More properly, studies that do not use randomized designs may be regarded as quasi-experiments. See e.g. Geoffrey Keppel and Sheldon Zedeck, Data analysis for research designs: Analysis of variance and multiple regression/correlation approaches (New York, 1989), 385–8.
61.
State University of Iowa, Catalogue number 1928–1929 (Iowa City, 1928).
62.
The t statistic, used with one or two samples, is algebraically equivalent to the F statistic (t² = F). It should be noted that although William S. Gosset was the originator of the one-sample t test, Fisher is to be credited with both changing Gosset's z ratio to a t ratio and extending Gosset's approach to a broader class of problems. See Churchill Eisenhart, “On the transition from ‘Student's’ z to ‘Student's’ t”, The American statistician, xxxiii (1979), 6–10.
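The two-sample equivalence can be verified numerically. In this sketch (with made-up scores), the pooled-variance t and the one-way F satisfy t² = F:

```python
def two_sample_t(a, b):
    """Student's t for two independent samples, pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled = ss / (na + nb - 2)
    return (ma - mb) / (pooled * (1 / na + 1 / nb)) ** 0.5

def oneway_f(groups):
    """One-way ANOVA F ratio."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

a = [12, 15, 14, 11]
b = [9, 10, 8, 13]
# two_sample_t(a, b) ** 2 equals oneway_f([a, b]) up to rounding error
```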
63.
An interaction was tested in two dissertations by ANOVA and in one dissertation by the chi-square test.
64.
Three dissertations classified as measurement used a randomized block design to control for order effects.
65.
For biographical information on Johnson, see Robert H. Beck, Beyond pedagogy: A history of the University of Minnesota College of Education (St. Paul, MN, 1980), 86, 134–5, 146.
66.
Ruth E. Eckert and Robert J. Keller, “Origin and background of the institutional research program”, in Ruth E. Eckert and Robert J. Keller (eds), A university looks at its program: The report of the University of Minnesota Bureau of Institutional Research, 1942–1952 (Minneapolis, 1954), 3–12, pp. 4–5; and University of Minnesota, Committee on Educational Research, Collegiate educational research: Report of the Committee on Educational Research for the biennium, 1928–1930 (Minneapolis, 1931), 14–15.
67.
The Johnson-Neyman technique, to be discussed later in this section, was introduced in 1936 as a general solution to problems posed by ANOVA and ANCOVA. See Palmer O. Johnson and J. Neyman, “Tests of certain linear hypotheses and their application to some educational problems”, Statistical research memoirs, i (1936), 57–93. But in providing both the ‘best’ estimate of F and an estimate of the variance of F, the technique reflects the Neyman-Pearson approach to hypothesis testing. On the difference between the Fisherian and Neyman-Pearson schools, see e.g. Davis Baird, Inductive logic: Probability and statistics (Englewood Cliffs, NJ, 1992), 354–60; Gigerenzer et al., op. cit. (ref. 9), 90–106; and Michael Oakes, Statistical inference: A commentary for the social and behavioural sciences (Chichester, 1986), 118–29.
68.
Harold S. Wechsler, The qualified student: A history of selective college admission in America (New York, 1977), 238–40.
69.
Donald O. Levine, The American college and the culture of aspiration, 1915–1940 (Ithaca, NY, 1986), 165–6; and ibid., 241–2.
70.
Wechsler, op. cit. (ref. 68), 97–9, 102–5.
71.
Nicholas Lemann, The big test (New York, 2000), 31–2; David F. Noble, America by design: Science, technology, and the rise of corporate capitalism (Oxford, 1977), 255; and Wechsler, op. cit. (ref. 68), 248.
72.
Noble, op. cit. (ref. 71), 255.
73.
University of Minnesota, Committee on Educational Research, Collegiate educational research: Report of the Committee on Educational Research for the biennium, 1930–1932 (Minneapolis, 1933), 3.
74.
T. R. McConnell, “Introduction”, in University of Minnesota, Committee on Educational Research, Studies in higher education: Report of the Committee on Educational Research for the biennium, 1936–1938 (Minneapolis, 1939), 1–3, p. 3.
75.
Wechsler, op. cit. (ref. 68), 279.
76.
John B. Johnston, “Selection of students”, in Raymond A. Kent (ed.), Higher education in America (Boston, 1930), 409–40, pp. 419, 439.
77.
University of Minnesota, Committee on Educational Research, op. cit. (ref. 66), 6.
78.
University of Minnesota, Committee on Educational Research, op. cit. (ref. 74), 21.
79.
Beck, op. cit. (ref. 65), 58, 108.
80.
This characterization of predictive studies is based on an analysis of doctoral dissertations completed at the University of Minnesota's college of education between 1929 and 1937.
81.
Were the analysis to terminate at this point, however, it would not be possible to predict the success of individual students. Toward this end, a further step was sometimes taken: the derivation of a multiple regression equation.
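That further step, fitting a multiple regression equation to predict individual success, can be sketched with ordinary least squares via the normal equations. This is a generic modern illustration, not the Minnesota studies' actual computing procedure; the data and variable names are invented:

```python
def fit_regression(xs, y):
    """Least-squares coefficients (intercept first) for predictor columns
    `xs` (e.g. test scores, high-school rank) and criterion `y`
    (e.g. first-year average), via the normal equations X'X b = X'y."""
    cols = [[1.0] * len(y)] + [list(c) for c in xs]
    k = len(cols)
    xtx = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * b for a, b in zip(cols[i], y)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(xtx[r][i]))
        xtx[i], xtx[p] = xtx[p], xtx[i]
        xty[i], xty[p] = xty[p], xty[i]
        for r in range(i + 1, k):
            f = xtx[r][i] / xtx[i][i]
            for c in range(i, k):
                xtx[r][c] -= f * xtx[i][c]
            xty[r] -= f * xty[i]
    # Back-substitution
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (xty[i] - sum(xtx[i][j] * beta[j] for j in range(i + 1, k))) / xtx[i][i]
    return beta

# Invented data generated by y = 1 + 2*x1 + 3*x2, recovered exactly:
coef = fit_regression([[0, 1, 2, 3, 4], [1, 0, 2, 1, 3]], [4, 3, 11, 10, 18])
```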
82.
On the basis of select descriptions, only studies categorized as ‘articulation’ (i.e., of high school to college), ‘examinations and measurements’, or ‘higher education’ were assumed to rely exclusively on measurement.
83.
McConnell, op. cit. (ref. 74), 2–3.
84.
University of Minnesota, University of Minnesota studies in predicting scholastic achievement (Minneapolis, 1942); Palmer O. Johnson, “The construction and evaluation of comprehensive examinations in the college of pharmacy”, in University of Minnesota, Committee on Educational Research, Studies in higher education: Biennial report of the Committee on Educational Research, 1938–1940 (Minneapolis, 1941), 178–87; idem, “The construction and evaluation of comprehensive examinations in the College of Pharmacy”, in University of Minnesota, Committee on Educational Research, Studies in higher education: Biennial report of the Committee on Educational Research, 1940–1942 (Minneapolis, 1943), 105–13; and University of Minnesota, Committee on Educational Research, op. cit. (ref. 74), 6–21.
85.
The number of faculty publications devoted to instruction increased from 38 in 1924–30 to 60 (of a total of 574) in 1931–36. University of Minnesota, Committee on Educational Research, Studies in higher education: Report of the Committee on Educational Research for the quadrennium, 1932–1936 (Minneapolis, 1937), 38.
86.
Edmund G. Williamson, quoted in University of Minnesota, Committee on Educational Research, op. cit. (ref. 74), 21.
87.
T. R. McConnell, “Foreword”, in University of Minnesota, op. cit. (ref. 84), pp. iii–iv, p. iii, italics original.
88.
Ibid., pp. iii–iv.
89.
The number of covariates is based on an analysis of dissertations completed at the University of Minnesota's college of education between 1929 and 1937.
90.
Beck, op. cit. (ref. 65), 133–4.
91.
For a comparison of the Johnson-Neyman technique and ANCOVA, see Bradley E. Huitema, The analysis of covariance and alternatives (New York, 1980), chap. 13. Because its assumptions are less restrictive, the Johnson-Neyman technique remains appropriate even when the difference between the group regression coefficients is not statistically significant. See Richard F. Potthoff, “Johnson-Neyman technique”, in Balakrishnan et al. (eds), Encyclopedia of statistical sciences (ref. 5), vi, 3745–9.
92.
Johnson and Neyman, op. cit. (ref. 67), 81–7.
93.
Ibid., 89–90, indicates how the technique could be extended to more than two covariates, but one of Johnson's students was the first to provide appropriate procedures for representing a region of significance in three dimensions. See Cyril Hoyt, “Tests of certain linear hypotheses and their application to educational problems in elementary college physics”, Ph.D. thesis, University of Minnesota, 1944.
94.
Hoyt, op. cit. (ref. 93), extended the Johnson-Neyman technique to three-group designs. Only recently, however, have approaches been developed for analysis of two-factor designs with heterogeneous regression slopes. See Huitema, op. cit. (ref. 91), 290–1.
95.
P. O. Johnson and Fei Tsao, “Factorial design and covariance in the study of individual educational development”, Psychometrika, x (1945), 133–62, p. 134.
96.
Ibid.; and P. O. Johnson and Fei Tsao, “Factorial design in the determination of differential limen values”, Psychometrika, ix (1944), 107–44.
97.
P. O. Johnson, “Measurement in higher education”, in American Educational Research Association, Reconstructing education through research (Washington, DC, 1936), 18–20; idem, “Statistical methods”, Review of educational research, ix (1939), 1939–54, 626–9; and idem, “Uses of Fisherian statistics”, Journal of educational research, xxxvi (1943), 1943–30.
98.
Edward M. Freeman and P. O. Johnson, “Prediction of success in the College of Agriculture, Forestry, and Home Economics”, in University of Minnesota, op. cit. (ref. 84), 33–65, p. 47.
99.
Johnson, “Uses of Fisherian statistics” (ref. 97), 630.
100.
Palmer O. Johnson, Statistical methods in research (New York, 1949).
101.
University of Minnesota, The Graduate School, Announcement for the years 1936–1938, Bulletin, xxxix/41 (Minneapolis, 1936), 77.
102.
Titles of dissertations were obtained from University of Minnesota, Register of Ph.D. degrees conferred by the University of Minnesota, 1888 — June, 1938 (Minneapolis, 1939); and University of Minnesota, Register of Ph.D. degrees conferred by the University of Minnesota, 1938 — June, 1956 (Minneapolis, 1957).
103.
University of Minnesota, College of Education, Announcement of program for the year 1929–30, Bulletin, xxxii/50 (Minneapolis, 1929).
104.
The t test appeared in three dissertations between 1935 and 1937.
105.
A significance test on adjusted variances was coded as an ANCOVA, although Minnesota dissertators regarded such a test as involving an ANOVA as well.
106.
Included in the count were six tests of ANCOVA's assumption of homogeneity of group regression coefficients, which is analogous to testing the interaction in a two-factor ANOVA. Excluded from the count were four applications of the Johnson-Neyman technique solely for the purpose of statistical control.
107.
See e.g. Stuart D. Fink, “Instructional organization in the public elementary schools of American cities”, Ph.D. thesis, University of Minnesota, 1945, 228 n; and Victor C. Smith, “Factors affecting learning of general science”, Ph.D. thesis, University of Minnesota, 1943, 55.
108.
In one application of the Johnson-Neyman technique, the assumption of homogeneity of variance was tested using the critical ratio test, and in four applications of multiple correlation, ANOVA or the t test was used for this purpose.
109.
The Johnson-Neyman technique was the primary technique of analysis in 7 of 15 experiments. Three experiments relied primarily on ANCOVA, two each on ANOVA and the t test, and one on the critical ratio test.
110.
Loc. cit. (ref. 17).
111.
Educators' struggle to bring unity to methodological diversity is evident in the heightened interest in classificatory schemas in the 1930s, which reflected an anxiety about the scientific status of education. Although some educators distinguished a statistical method, others regarded statistics not as a method of research but as a method of data analysis unrestricted in its scope. See e.g. A. S. Barr, “A symposium on the classification of educational research”, Journal of educational research, xxiii (1931), 353–82, and xxiv (1931), 1–22; Carter V. Good, A. S. Barr, and Douglas E. Scates, The methodology of educational research (New York, 1936), chap. 5; and Truman L. Kelley, Scientific method: Its function in research and education (Columbus, OH, 1929).
112.
I identified 16 textbooks of educational statistics published between 1946 and 1960 that were intended for use in an intermediate or advanced course, all of which discussed Fisherian statistics. See Francis G. Cornell, The essentials of educational statistics (New York, 1956); Wilfred J. Dixon and Frank J. Massey, Introduction to statistical analysis (New York, 1951); idem, Introduction to statistical analysis, 2nd edn (New York, 1957); Allen L. Edwards, Statistical analysis for students in psychology and education (New York, 1946); idem, Statistical methods for the behavioral sciences (New York, 1954); George A. Ferguson, Statistical analysis in psychology and education (New York, 1959); Henry E. Garrett, Statistics in psychology and education, 3rd edn (New York, 1947); idem, Statistics in psychology and education, 4th edn (New York, 1953); J. P. Guilford, Fundamental statistics in psychology and education, 2nd edn (New York, 1950); idem, Fundamental statistics in psychology and education, 3rd edn (New York, 1956); Johnson, op. cit. (ref. 100); Palmer O. Johnson and Robert W. B. Jackson, Modern statistical methods: Descriptive and inductive (Chicago, 1959); Lindquist, op. cit. (ref. 54); Merle W. Tate, Statistics in education (New York, 1955); Helen M. Walker and Joseph Lev, Statistical inference (New York, 1953); and James E. Wert, Charles O. Neidt, and J. Stanley Ahmann, Statistical methods in educational and psychological research (New York, 1954).
113.
Typologies of interdisciplinarity, which include tool-borrowing, are based on the degree of interaction between disciplines abstracted from their larger intellectual context. For a review, see Julie T. Klein, Interdisciplinarity: History, theory, and practice (Detroit, MI, 1990), 24, 41–2, 64. For case studies of tool-borrowing, see e.g. Gigerenzer and Murray, op. cit. (ref. 10); Stephen H. Kellert, Borrowed knowledge: Chaos theory and the challenge of learning across disciplines (Chicago, 2008); and Philip Mirowski, More heat than light: Economics as social physics, physics as nature's economics (Cambridge, 1989).
114.
James Clifford, Routes: Travel and translation in the twentieth century (Cambridge, MA, 1997), 3.
115.
Harwood, Styles of scientific thought (ref. 2), 9–10; and idem, “Are there national styles of scientific thought?” (ref. 2), 33.
116.
Rucci and Tweney, op. cit. (ref. 8), 172, 178–9.
117.
For the titles of textbooks included in the analysis, see loc. cit. (ref. 112).
118.
Lindquist, op. cit. (ref. 54), chap. 8. In the other three worked examples of a randomized block design, the blocks comprised individuals or matched pairs, with the result that the problem of sampling intact groups was not addressed. See Cornell, op. cit. (ref. 112), 269–76; Edwards, Statistical analysis (ref. 112), 220–5; and Johnson, op. cit. (ref. 100), 287–92. Four other textbooks either defined a randomized block design or supplied an example without instructions, thereby according it little practical importance.
See e.g. Edwards, Statistical analysis (ref. 112), 284–6; and Ferguson, op. cit. (ref. 112), 135.
121.
Several textbook writers, however, justified the use of Fisherian techniques on the basis of their general advantages and/or the special needs of educational research. See Garrett, Statistics in psychology and education, 4th edn (ref. 112), 289; Johnson and Jackson, op. cit. (ref. 112), 424–5; Lindquist, op. cit. (ref. 54), 172; Walker and Lev, op. cit. (ref. 112), 382–4; and Wert, Neidt, and Ahmann, op. cit. (ref. 112), 101, 126.
122.
Cf. A. D. Lovie, “On the early history of ANOVA in the analysis of repeated measures designs in psychology”, British journal of mathematical and statistical psychology, xxxiv (1981), 1–15.
123.
H. M. Collins, Changing order: Replication and induction in scientific practice (Chicago, 1992), chap. 4; and idem, Gravity's shadow: The search for gravitational waves (Chicago, 2004), chap. 10.
124.
During the decades 1940–60, when inferential statistics were institutionalized in the USA, the number of critical articles published in various fields was roughly a third the number published in the 1980s and a sixth the number published in the 1990s. See David R. Anderson, Kenneth P. Burnham, and William L. Thompson, “Null hypothesis testing: Problems, prevalence, and an alternative”, Journal of wildlife management, lxiv (2000), 912–23, p. 913. Moreover, outside the sociological literature, few articles explicitly defending the tests in response to criticism were published prior to 1970. See Denton E. Morrison and Ramon E. Henkel (eds), The significance test controversy: A reader (Chicago, 1970). The avoidance of controversy is taken to an extreme in the ‘hybrid logic’ of statistical inference, whereby Fisher and Neyman-Pearson approaches to hypothesis testing are amalgamated in a way that neither school would accept. See Gerd Gigerenzer, “The superego, the ego, and the id in statistical reasoning”, in Gideon Keren and Charles Lewis (eds), A handbook for data analysis in the behavioral sciences: Methodological issues (Hillsdale, NJ, 1993), 311–39; Gigerenzer and Murray, op. cit. (ref. 10), 21–2; and Gigerenzer et al., op. cit. (ref. 9), 106–9; see also Huberty, op. cit. (ref. 9).
125.
See e.g. Liliane Hilaire-Pérez, “Technology as a public culture in the eighteenth century: The artisans' legacy”, History of science, xlv (2007), 135–53; Ursula Klein, “Apothecary shops, laboratories and chemical manufacture in eighteenth-century Germany”, in Lissa Roberts, Simon Schaffer, and Peter Dear (eds), The mindful hand: Inquiry and invention from the late Renaissance to early industrialization (Amsterdam, 2007), 247–76; idem, “The laboratory challenge: Some revisions of the standard view of early modern experimentation”, Isis, xcix (2008), 2008–82; Eda Kranakis, “Hybrid careers and the interaction of science and technology”, in Peter Kroes and Martijn Bakker (eds), Technological development and science in the industrial age (Dordrecht, 1992), 177–204; and Joel Mokyr, The gifts of Athena: Historical origins of the knowledge economy (Princeton, NJ, 2002), 53–4, 65–6, 82–3, 90–1.