Lorraine Daston and Michael Otte, “Style in science”, Science in context, iv (1991), 231–3.
2.
See e.g. Mario Biagioli, “Etiquette, interdependence, and sociability in seventeenth-century science”, Critical inquiry, xxii (1996), 193–238; Marcos Cueto, “Andean biology in Peru: Scientific styles on the periphery”, Isis, lxxx (1989), 1989–58; idem, “Laboratory styles in Argentine physiology”, Isis, lxxxv (1994), 1994–46; Michael R. Dietrich, “On the mutability of genes and geneticists: The ‘Americanization’ of Richard Goldschmidt and Victor Jollos”, Perspectives on science, iv (1996), 1996–45; Jonathan Harwood, “National styles in science: Genetics in Germany and the United States between the world wars”, Isis, lxxviii (1987), 1987–414; idem, Styles of scientific thought: The German genetics community, 1900–1933 (Chicago, 1993); idem, “Are there national styles of scientific thought? Genetics in Germany, 1900–1933”, in Peter Weingart (ed.), Grenzüberschreitungen in der Wissenschaft (Baden-Baden, 1995), 31–53; Malcolm Nicolson, “National styles, divergent classifications: A comparative case study from the history of French and American plant ecology”, Knowledge and society: Studies in the sociology of science past and present, viii (1989), 139–86; Sylvan S. Schweber, “The historical temper regnant: Theoretical physics in the United States, 1920–1950”, Historical studies in the physical sciences, xvii (1986), 1986–98; and Reinhard Siegmund-Schultze, “National styles in mathematics between the world wars”, in Elena Ausejo and Mariano Hormigón (eds), Paradigms and mathematics (Madrid, 1996), 243–53.
3.
See e.g. John Henry, “National styles in science: A possible factor in scientific revolution?”, in David N. Livingstone and Charles W. J. Withers (eds), Geography and revolution (Chicago, 2005), 43–74; Marjorie Malley, “The discovery of atomic transmutation: Scientific styles and philosophies in France and Britain”, Isis, lxx (1979), 1979–23; and Nicholas Russell, “Independent discovery in biology: Investigating styles of scientific research”, Medical history, xxxvii (1993), 1993–41.
4.
Exceptions include Robert A. Nye, “The history of sexuality in context: National sexological traditions”, Science in context, iv (1991), 387–406; and Joan L. Richards, “Rigor and clarity: Foundations of mathematics in France and England, 1800–1840”, Science in context, iv (1991), 1991–319.
5.
Following social scientists' usage of ANOVA, I have emphasized the centrality of the F test. But it should be noted that some writers view ANOVA as a more individual, if systematic, approach to data analysis. See e.g. S. C. Pearce, “Analysis of variance”, in N. Balakrishnan et al. (eds), Encyclopedia of statistical sciences, 2nd edn (16 vols, New York, 2006), i, 133–41.
6.
Regardless of sample size, use of ANOVA presupposes that certain distribution assumptions are satisfied. On Fisher's development of small-sample theory, see e.g. Joan F. Box, R. A. Fisher: The life of a scientist (New York, 1978), 113–29; and Nancy S. Hall, “R. A. Fisher and his advocacy of randomization”, Journal of the history of biology, xl (2007), 2007–325.
7.
Carl J. Huberty and Chandler J. Pike, “On some history regarding statistical testing”, in Bruce Thompson (ed.), Advances in social science methodology (5 vols, Stamford, CT, 1999), v, 1–22, p. 5.
8.
Anthony J. Rucci and Ryan D. Tweney, “Analysis of variance and the ‘second discipline’ of scientific psychology: A historical account”, Psychological bulletin, lxxxvii (1980), 166–84. Some educators knew about ANOVA by the late 1920s, however. See e.g. Truman L. Kelley and Eugene Shen, “General statistical principles”, in Carl Murchison (ed.), The foundations of experimental psychology (New York, 1929), 832–54, pp. 850–2.
9.
Gerd Gigerenzer et al., The empire of chance: How probability changed science and everyday life (Cambridge, 1989), 106–8; and Carl J. Huberty, “Historical origins of statistical testing practices: The treatment of Fisher versus Neyman-Pearson views in textbooks”, Journal of experimental education, lxi (1993), 317–33.
10.
Gerd Gigerenzer and David J. Murray, Cognition as intuitive statistics (Hillsdale, NJ, 1987), 19–22; see also Raymond Hubbard, Rahul A. Parsa, and Michael R. Luthy, “The spread of statistical significance testing in psychology: The case of the Journal of Applied Psychology, 1917–1994”, Theory and psychology, vii (1997), 1997–54; and Raymond Hubbard and Patricia A. Ryan, “The historical growth of statistical significance testing in psychology — and its future prospects”, Educational and psychological measurement, lx (2000), 2000–81.
11.
See e.g. David Howie, Interpreting probability: Controversies and developments in the early twentieth century (Cambridge, 2002), chap. 6; and Stephen T. Ziliak and Deirdre N. McCloskey, The cult of statistical significance: How the standard error costs us jobs, justice, and lives (Ann Arbor, MI, 2008).
12.
Kurt Danziger, Constructing the subject: Historical origins of psychological research (Cambridge, 1990), 80–3, 121–6; idem, “Statistical method and the historical development of research practice in American psychology”, in Lorenz Krüger, Gerd Gigerenzer, and Mary S. Morgan (eds), The probabilistic revolution (2 vols, Cambridge, MA, 1987), ii, 35–47; and A. D. Lovie, “The analysis of variance in experimental psychology: 1934–1945”, British journal of mathematical and statistical psychology, xxxii (1979), 1979–78. See also Trudy Dehue, “Deception, efficiency, and random groups: Psychology and the gradual origination of random group design”, Isis, lxxxviii (1997), 1997–73.
13.
Danziger, Constructing the subject (ref. 12), 147–55; and idem, “Statistical method” (ref. 12). See also Lovie, op. cit. (ref. 12); and Rucci and Tweney, op. cit. (ref. 8).
14.
Psychologists made explicit analogies between agricultural and psychological terms in showcasing ANOVA. See Brent Baxter, “Problems in the planning of psychological experiments”, American journal of psychology, liv (1941), 270–80.
15.
Rucci and Tweney, op. cit. (ref. 8), have shown that, of prominent contributors to the development of inferential statistics in psychology before the end of the Second World War, 4 of 12 psychologists received their statistical training from an educator, but none of the three educators received his or her statistical training from a psychologist. The other six notable contributors were statisticians.
16.
Lee J. Cronbach, “The two disciplines of scientific psychology”, American psychologist, xii (1957), 671–84, p. 674. I shall refer to Cronbach's discipline of correlation as measurement in order to avoid identifying it with a particular statistical method (contra Cronbach's intention).
17.
Ibid., 677. See also Michael Cowles, Statistics in psychology: An historical perspective (Hillsdale, NJ, 1989), 31; and Rucci and Tweney, op. cit. (ref. 8), 172–3.
18.
James E. Allen, Jr, “E. F. Lindquist: Educational development and aptitude testing”, Education, xci (1970), 2–3.
19.
Julia J. Peterson, The Iowa Testing Programs: The first fifty years (Iowa City, 1983), 51–2; and Oscar K. Buros, “Fifty years in testing: Some reminiscences, criticisms, and suggestions”, Educational researcher, vi (1977), 1977–15, p. 11.
20.
See papers 5, 11, 15, and 21 in E. S. Pearson and John Wishart (eds), “Student's” collected papers (Cambridge, 1942).
21.
On the history of the control group in educational psychology, see e.g. Danziger, Constructing the subject (ref. 12), 113–15; Dehue, op. cit. (ref. 12); and Ian Hacking, “Telepathy: Origins of randomization in experimental design”, Isis, lxxix (1988), 427–51.
22.
For a detailed discussion of the procedure of pairing individual scores, see William A. McCall, How to experiment in education (New York, 1923), 45–62.
23.
The standard error of a difference between two uncorrelated means is given by the formula σM₁−M₂ = √(σ²M₁ + σ²M₂).
24.
When the probable error (0.6745 times the standard error) was used to evaluate the reliability of a difference between two uncorrelated means, a statistically significant difference was defined by a critical ratio of four or more.
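The two conventions above (the standard error of a difference between uncorrelated means, and the four-probable-errors criterion) can be sketched as follows. This is an illustrative reconstruction with invented function names and values, not any historical worker's computation:

```python
import math

def se_diff(se1, se2):
    """Standard error of a difference between two uncorrelated means."""
    return math.sqrt(se1 ** 2 + se2 ** 2)

def critical_ratio(mean_diff, se1, se2, use_probable_error=False):
    """Ratio of a mean difference to its standard (or probable) error."""
    denom = se_diff(se1, se2)
    if use_probable_error:
        denom *= 0.6745  # probable error = 0.6745 x standard error
    return mean_diff / denom

# A difference of 10 points with component standard errors of 3 and 4:
cr = critical_ratio(10, 3, 4, use_probable_error=True)
```

Since four probable errors equal roughly 2.7 standard errors, the interwar criterion of a critical ratio of four in probable-error units was stricter than the later two-standard-errors convention.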
25.
E. F. Lindquist, “The laboratory method in freshman English”, Ph.D. dissertation, State University of Iowa, 1927, 81–2; see also E. F. Lindquist and R. R. Foster, “On the determination of reliability in comparing the final mean scores of matched groups”, Journal of educational psychology, xx (1929), 1929–6.
26.
The derivation of the formula is presented in Samuel S. Wilks, “The standard error of the means of ‘matched’ samples”, Journal of educational psychology, xxii (1931), 205–8. The use of the formula is justified and extended in E. F. Lindquist, “The significance of a difference between the means of ‘matched’ groups”, Journal of educational psychology, xxii (1931), 1931–204; and idem, “A further note on the significance of a difference between the means of matched groups”, Journal of educational psychology, xxiv (1933), 1933–9.
27.
Of the 15 studies completed between 1930 and 1938 that used equivalent groups, 12 either applied Lindquist's or Wilks's formula or asserted that its application was unnecessary (because the results were significant under the more conservative test).
28.
Lindquist's first published reference to small-sample theory is his dismissal of the t test as inappropriate for use with matched pairs. See Lindquist, “A further note” (ref. 26).
29.
E. F. Lindquist, A first course in statistics: Their use and interpretation in education and psychology (Boston, 1938), 113 n.
30.
E. F. Lindquist, “The general import of recent developments in statistical inference”, in American Educational Research Association, Research on the foundations of American education (Washington, DC, 1939), 192–5, pp. 194–5.
31.
George D. Stoddard, Iowa Placement Examinations (University of Iowa, Studies in education, iii/2; Iowa City, 1925); and idem, “Iowa Placement Examinations”, School and society, xxiv (1926), 1926–16.
32.
E. F. Lindquist, “The Iowa Testing Programs: A retrospective view”, Education, xci (1970), 7–23, p. 8; and Peterson, op. cit. (ref. 19), 1–6.
33.
E. F. Lindquist, “Changing values in educational measurement”, Educational record, xvii (1936), 64–81, p. 71; and Peterson, op. cit. (ref. 19), 32.
34.
Peterson, op. cit. (ref. 19), 24, 50–1.
35.
In order to dispel misconceptions that fuelled criticism of testing, Lindquist explicated the functions and limitations of standardized tests, together with the advantages of the ITP. See Lindquist, op. cit. (ref. 33); idem, “Factors determining reliability of test norms”, Journal of educational psychology, xxi (1930), 512–26; idem, “Basic considerations”, Review of educational research, iii (1933), 1933–20; idem, “The technique of constructing tests”, Educational record, xv (1934), 1934–86; idem, “Cooperative achievement testing”, Journal of educational research, xxviii (1935), 1935–20; and idem, “Standardized achievement tests and their relation to curriculum content”, National elementary principal, xvi (1937), 1937–84. See also E. F. Lindquist and H. R. Anderson, “Achievement tests in the social studies”, Educational record, xiv (1933), 1933–256.
36.
A widespread criticism of educational tests was that they were constructed and validated on the basis of — and used to maintain — the status quo. See e.g. Buros, op. cit. (ref. 19), 12; John Carson, The measure of merit: Talents, intelligence, and inequality in the French and American republics, 1750–1940 (Princeton, 2007), 248–50; James P. Echols, “The rise of the evaluation movement, 1920–1942”, Ph.D. dissertation, Stanford University, 1973, 61–9, 150–1; Henry L. Minton, “Lewis M. Terman and mental testing: In search of the democratic ideal”, in Michael M. Sokal (ed.), Psychological testing and American society, 1890–1930 (New Brunswick, NJ, 1990), 95–112, pp. 102–4; and Thomas P. Weinland, “A history of the IQ in America, 1890–1941”, Ph.D. dissertation, Columbia University, 1970, 183–210.
37.
Echols, op. cit. (ref. 36), 148, 154.
38.
Paul D. Chapman, Schools as sorters: Lewis M. Terman, applied psychology, and the intelligence testing movement, 1890–1930 (New York, 1988), 28–9, 92–7.
39.
See e.g. Iowa Every-Pupil Program, Manual for administration and interpretation of 1936 Iowa Every-Pupil Tests of Basic Skills (Iowa City, 1936), 14; and idem, Manual for administration and interpretation of 1938 Iowa Every-Pupil Tests of Basic Skills (Iowa City, 1938), 21. Because participation in the ITP was voluntary, the tests were not standardized on the basis of random samples. Nevertheless, Lindquist maintained that the norms were representative of nationwide achievement because they were based on an exhaustive sampling of the population of Iowa schools. See Peterson, op. cit. (ref. 19), 52. Lindquist's contemporaries, however, questioned the tests' representativeness. See James R. Hobson, “Review of silent reading comprehension: Iowa Every-Pupil Tests of Basic Skills, Test A”; and J. Wayne Wrightstone, “Review of work-study skills: Iowa Every-Pupil Tests of Basic Skills, Test B, new edition”, in Oscar K. Buros (ed.), The third mental measurements yearbook (New Brunswick, NJ, 1949), 530, 571.
40.
E. F. Lindquist, “The gap between promise and fulfillment in ninth grade algebra”, School review, xlii (1934), 762–71.
41.
E. F. Lindquist, “Summary report of results of the Iowa Every-Pupil Intelligence Testing Program of January 29, 1934”, mimeographed (Iowa City, 1934).
42.
Ibid., 14.
43.
Peterson, op. cit. (ref. 19), 23. Beginning in 1966, percentile norms for levels of intelligence, as measured by the Lorge-Thorndike Intelligence Tests, were offered for the ITP's basic skills tests; ibid., 35, 233.
44.
Lindquist, “Factors determining reliability of test norms” (ref. 35), 515, italics original.
See e.g. Iowa Every-Pupil Program, Summary report of results for the 1936 Iowa Every-Pupil High School Testing Program (Iowa City, 1936).
49.
E. F. Lindquist, Statistical analysis in educational research (Boston, 1940), 72–5, 104–32.
50.
Ibid., 139–44.
51.
ANCOVA subtracts initial differences — the average within-group regression of posttest on pretest scores — from the variances of the F ratio in the proportion that the regression coefficient contributes to those variances. On the use of ANCOVA with designs involving replication in several schools, see Lindquist, op. cit. (ref. 49), 196–203.
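The adjustment described in this note can be approximated in a few lines. This is a schematic sketch under simplifying assumptions, not Lindquist's computing routine; in particular, the error degrees of freedom below are those of a one-way ANOVA on the adjusted scores rather than the N − k − 1 of a proper ANCOVA:

```python
def ancova_f(groups):
    """Approximate ANCOVA: remove the pooled within-group regression of
    posttest (y) on pretest (x), then compute a one-way ANOVA F on the
    adjusted scores.  `groups` is a list of lists of (x, y) pairs."""
    # Pooled within-group regression slope of posttest on pretest
    sxy = sxx = 0.0
    for g in groups:
        mx = sum(x for x, _ in g) / len(g)
        my = sum(y for _, y in g) / len(g)
        sxy += sum((x - mx) * (y - my) for x, y in g)
        sxx += sum((x - mx) ** 2 for x, _ in g)
    b = sxy / sxx
    # Adjust each posttest score for its pretest deviation from the grand mean
    n = sum(len(g) for g in groups)
    grand_x = sum(x for g in groups for x, _ in g) / n
    adj = [[y - b * (x - grand_x) for x, y in g] for g in groups]
    # One-way ANOVA on the adjusted scores
    grand = sum(y for g in adj for y in g) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in adj)
    ss_within = sum((y - sum(g) / len(g)) ** 2 for g in adj for y in g)
    k = len(adj)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

For instance, two groups with a common within-group slope but different posttest levels yield a very large F once pretest differences are removed.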
52.
On principles of test construction during the interwar period, see e.g. Herbert E. Hawkes, E. F. Lindquist, and C. R. Mann (eds), The construction and use of achievement examinations: A manual for secondary school teachers (Boston, 1936); Walter S. Monroe, An introduction to the theory of educational measurements (Boston, 1923), chaps. 4–7; Walter S. Monroe, James C. DeVoss, and Frederick J. Kelly, Educational tests and measurements (Boston, 1924), chap. 11; and G. M. Ruch and George D. Stoddard, Tests and measurements in high school instruction (New York, 1927), chaps. 17–20.
53.
Lindquist, op. cit. (ref. 49), 219–28.
54.
Lindquist was critical of regarding samples comprised of intact groups as simple random samples of individuals unless “the assumptions of homogeneity of the means and variances of the classes are strongly supported by a priori considerations as well as by the outcomes of statistical tests”. E. F. Lindquist, Design and analysis of experiments in psychology and education (New York, 1953), 75–6.
55.
Lindquist, op. cit. (ref. 49), 114, 133.
56.
Ibid., 110, 113. If the experiment is replicated within schools by random assignment of classes, one to each treatment — the class here being the unit of sampling — the analysis of variance will yield three components: ‘treatments’, ‘schools’, and the ‘treatments-by-schools’ interaction, the last of which serves as the error term in the F test; ibid., 104–14. If individuals are randomly assigned to classes in each school, the ANOVA will include a ‘within-classes’ component, which, in special circumstances, serves as the error term in the F test; ibid., 114–19.
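With one class mean per treatment-by-school cell, the decomposition just described reduces to a two-way table, and the treatments F uses the interaction mean square as its error term. A minimal sketch (the table and numbers are invented, not Lindquist's example):

```python
def treatments_by_schools_f(table):
    """F ratio for 'treatments' in a treatments x schools layout with one
    class mean per cell, using the treatments-by-schools interaction as
    the error term.  table[t][s] is the mean of the class that received
    treatment t in school s."""
    t, s = len(table), len(table[0])
    grand = sum(map(sum, table)) / (t * s)
    row = [sum(r) / s for r in table]                                  # treatment means
    col = [sum(table[i][j] for i in range(t)) / t for j in range(s)]   # school means
    ss_treat = s * sum((m - grand) ** 2 for m in row)
    ss_int = sum((table[i][j] - row[i] - col[j] + grand) ** 2
                 for i in range(t) for j in range(s))
    ms_treat = ss_treat / (t - 1)
    ms_int = ss_int / ((t - 1) * (s - 1))
    return ms_treat / ms_int

# Two treatments replicated in three schools:
f = treatments_by_schools_f([[1, 2, 3], [3, 4, 6]])  # = 49.0
```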
57.
The titles of dissertations were obtained from Sarah Edwards, Theses and dissertations presented in the graduate college of the State University of Iowa, 1900–1950 (Iowa City, 1952).
58.
The dominant source of data is unambiguous in all but three dissertations, whose analyses appear to be based equally on data from two sources. In these cases, the last source reported in the dissertation was designated as the dominant one.
59.
Leonard V. Koos, The questionnaire in education: A critique and manual (New York, 1928). Koos's survey category (i.e., the questionnaire method in combination with other methods) was eliminated because it described so few dissertations.
60.
Although behavioural scientists and statisticians today would demand that an experiment meet other criteria besides manipulation of an independent variable, this criterion is sufficient to distinguish between the studies I designate as experiment and measurement. More properly, studies that do not use randomized designs may be regarded as quasi-experiments. See e.g. Geoffrey Keppel and Sheldon Zedeck, Data analysis for research designs: Analysis of variance and multiple regression/correlation approaches (New York, 1989), 385–8.
61.
State University of Iowa, Catalogue number 1928–1929 (Iowa City, 1928).
62.
The t statistic, used with one or two samples, is algebraically equivalent to the F statistic (t² = F). It should be noted that although William S. Gosset was the originator of the one-sample t test, Fisher is to be credited with both changing Gosset's z ratio to a t ratio and extending Gosset's approach to a broader class of problems. See Churchill Eisenhart, “On the transition from ‘Student's’ z to ‘Student's’ t”, The American statistician, xxxiii (1979), 6–10.
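The two-sample equivalence can be verified numerically. In this sketch (with made-up scores), the pooled-variance t and the one-way F satisfy t² = F:

```python
def two_sample_t(a, b):
    """Student's t for two independent samples, pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled = ss / (na + nb - 2)
    return (ma - mb) / (pooled * (1 / na + 1 / nb)) ** 0.5

def oneway_f(groups):
    """One-way ANOVA F ratio."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

a = [12, 15, 14, 11]
b = [9, 10, 8, 13]
# two_sample_t(a, b) ** 2 equals oneway_f([a, b]) up to rounding error
```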
63.
An interaction was tested in two dissertations by ANOVA and in one dissertation by the chi-square test.
64.
Three dissertations classified as measurement used a randomized block design to control for order effects.
65.
For biographical information on Johnson, see Robert H. Beck, Beyond pedagogy: A history of the University of Minnesota College of Education (St. Paul, MN, 1980), 86, 134–5, 146.
66.
Ruth E. Eckert and Robert J. Keller, “Origin and background of the institutional research program”, in Ruth E. Eckert and Robert J. Keller (eds), A university looks at its program: The report of the University of Minnesota Bureau of Institutional Research, 1942–1952 (Minneapolis, 1954), 3–12, pp. 4–5; and University of Minnesota, Committee on Educational Research, Collegiate educational research: Report of the Committee on Educational Research for the biennium, 1928–1930 (Minneapolis, 1931), 14–15.
67.
The Johnson-Neyman technique, to be discussed later in this section, was introduced in 1936 as a general solution to problems posed by ANOVA and ANCOVA. See Palmer O. Johnson and J. Neyman, “Tests of certain linear hypotheses and their application to some educational problems”, Statistical research memoirs, i (1936), 57–93. But in providing both the ‘best’ estimate of F and an estimate of the variance of F, the technique reflects the Neyman-Pearson approach to hypothesis testing. On the difference between the Fisherian and Neyman-Pearson schools, see e.g. Davis Baird, Inductive logic: Probability and statistics (Englewood Cliffs, NJ, 1992), 354–60; Gigerenzer et al., op. cit. (ref. 9), 90–106; and Michael Oakes, Statistical inference: A commentary for the social and behavioural sciences (Chichester, 1986), 118–29.
68.
Harold S. Wechsler, The qualified student: A history of selective college admission in America (New York, 1977), 238–40.
69.
Donald O. Levine, The American college and the culture of aspiration, 1915–1940 (Ithaca, NY, 1986), 165–6; and ibid., 241–2.
70.
Wechsler, op. cit. (ref. 68), 97–9, 102–5.
71.
Nicholas Lemann, The big test (New York, 2000), 31–2; David F. Noble, America by design: Science, technology, and the rise of corporate capitalism (Oxford, 1977), 255; and Wechsler, op. cit. (ref. 68), 248.
72.
Noble, op. cit. (ref. 71), 255.
73.
University of Minnesota, Committee on Educational Research, Collegiate educational research: Report of the Committee on Educational Research for the biennium, 1930–1932 (Minneapolis, 1933), 3.
74.
T. R. McConnell, “Introduction”, in University of Minnesota, Committee on Educational Research, Studies in higher education: Report of the Committee on Educational Research for the biennium, 1936–1938 (Minneapolis, 1939), 1–3, p. 3.
75.
Wechsler, op. cit. (ref. 68), 279.
76.
John B. Johnston, “Selection of students”, in Raymond A. Kent (ed.), Higher education in America (Boston, 1930), 409–40, pp. 419, 439.
77.
University of Minnesota, Committee on Educational Research, op. cit. (ref. 66), 6.
78.
University of Minnesota, Committee on Educational Research, op. cit. (ref. 74), 21.
79.
Beck, op. cit. (ref. 65), 58, 108.
80.
This characterization of predictive studies is based on an analysis of doctoral dissertations completed at the University of Minnesota's college of education between 1929 and 1937.
81.
Were the analysis to terminate at this point, however, it would not be possible to predict the success of individual students. Toward this end, a further step was sometimes taken: the derivation of a multiple regression equation.
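That further step, fitting a multiple regression equation to predict individual success, can be sketched with ordinary least squares via the normal equations. This is a generic modern illustration, not the Minnesota studies' actual computing procedure; the data and variable names are invented:

```python
def fit_regression(xs, y):
    """Least-squares coefficients (intercept first) for predictor columns
    `xs` (e.g. test scores, high-school rank) and criterion `y`
    (e.g. first-year average), via the normal equations X'X b = X'y."""
    cols = [[1.0] * len(y)] + [list(c) for c in xs]
    k = len(cols)
    xtx = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * b for a, b in zip(cols[i], y)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(xtx[r][i]))
        xtx[i], xtx[p] = xtx[p], xtx[i]
        xty[i], xty[p] = xty[p], xty[i]
        for r in range(i + 1, k):
            f = xtx[r][i] / xtx[i][i]
            for c in range(i, k):
                xtx[r][c] -= f * xtx[i][c]
            xty[r] -= f * xty[i]
    # Back-substitution
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (xty[i] - sum(xtx[i][j] * beta[j] for j in range(i + 1, k))) / xtx[i][i]
    return beta

# Invented data generated by y = 1 + 2*x1 + 3*x2, recovered exactly:
coef = fit_regression([[0, 1, 2, 3, 4], [1, 0, 2, 1, 3]], [4, 3, 11, 10, 18])
```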
82.
On the basis of select descriptions, only studies categorized as ‘articulation’ (i.e., of high school to college), ‘examinations and measurements’, or ‘higher education’ were assumed to rely exclusively on measurement.
83.
McConnell, op. cit. (ref. 74), 2–3.
84.
University of Minnesota, University of Minnesota studies in predicting scholastic achievement (Minneapolis, 1942); Palmer O. Johnson, “The construction and evaluation of comprehensive examinations in the college of pharmacy”, in University of Minnesota, Committee on Educational Research, Studies in higher education: Biennial report of the Committee on Educational Research, 1938–1940 (Minneapolis, 1941), 178–87; idem, “The construction and evaluation of comprehensive examinations in the College of Pharmacy”, in University of Minnesota, Committee on Educational Research, Studies in higher education: Biennial report of the Committee on Educational Research, 1940–1942 (Minneapolis, 1943), 105–13; and University of Minnesota, Committee on Educational Research, op. cit. (ref. 74), 6–21.
85.
The number of faculty publications devoted to instruction increased from 38 in 1924–30 to 60 (of a total of 574) in 1931–36. University of Minnesota, Committee on Educational Research, Studies in higher education: Report of the Committee on Educational Research for the quadrennium, 1932–1936 (Minneapolis, 1937), 38.
86.
Edmund G. Williamson, quoted in University of Minnesota, Committee on Educational Research, op. cit. (ref. 74), 21.
87.
T. R. McConnell, “Foreword”, in University of Minnesota, op. cit. (ref. 84), pp. iii–iv, p. iii, italics original.
88.
Ibid., pp. iii–iv.
89.
The number of covariates is based on an analysis of dissertations completed at the University of Minnesota's college of education between 1929 and 1937.
90.
Beck, op. cit. (ref. 65), 133–4.
91.
For a comparison of the Johnson-Neyman technique and ANCOVA, see Bradley E. Huitema, The analysis of covariance and alternatives (New York, 1980), chap. 13. Because its assumptions are less restrictive, the Johnson-Neyman technique remains appropriate even when the difference between the group regression coefficients is not statistically significant. See Richard F. Potthoff, “Johnson-Neyman technique”, in Balakrishnan et al. (eds), Encyclopedia of statistical sciences (ref. 5), vi, 3745–9.
92.
Johnson and Neyman, op. cit. (ref. 67), 81–7.
93.
Ibid., 89–90, indicates how the technique could be extended to more than two covariates, but one of Johnson's students was the first to provide appropriate procedures for representing a region of significance in three dimensions. See Cyril Hoyt, “Tests of certain linear hypotheses and their application to educational problems in elementary college physics”, Ph.D. thesis, University of Minnesota, 1944.
94.
Hoyt, op. cit. (ref. 93), extended the Johnson-Neyman technique to three-group designs. Only recently, however, have approaches been developed for analysis of two-factor designs with heterogeneous regression slopes. See Huitema, op. cit. (ref. 91), 290–1.
95.
P. O. Johnson and Fei Tsao, “Factorial design and covariance in the study of individual educational development”, Psychometrika, x (1945), 133–62, p. 134.
96.
Ibid.; and P. O. Johnson and Fei Tsao, “Factorial design in the determination of differential limen values”, Psychometrika, ix (1944), 107–44.
97.
P. O. Johnson, “Measurement in higher education”, in American Educational Research Association, Reconstructing education through research (Washington, DC, 1936), 18–20; idem, “Statistical methods”, Review of educational research, ix (1939), 1939–54, 626–9; and idem, “Uses of Fisherian statistics”, Journal of educational research, xxxvi (1943), 1943–30.
98.
Edward M. Freeman and P. O. Johnson, “Prediction of success in the College of Agriculture, Forestry, and Home Economics”, in University of Minnesota, op. cit. (ref. 84), 33–65, p. 47.
99.
Johnson, “Uses of Fisherian statistics” (ref. 97), 630.
100.
Palmer O. Johnson, Statistical methods in research (New York, 1949).
101.
University of Minnesota, The Graduate School, Announcement for the years 1936–1938, Bulletin, xxxix/41 (Minneapolis, 1936), 77.
102.
Titles of dissertations were obtained from University of Minnesota, Register of Ph.D. degrees conferred by the University of Minnesota, 1888 — June, 1938 (Minneapolis, 1939); and University of Minnesota, Register of Ph.D. degrees conferred by the University of Minnesota, 1938 — June, 1956 (Minneapolis, 1957).
103.
University of Minnesota, College of Education, Announcement of program for the year 1929–30, Bulletin, xxxii/50 (Minneapolis, 1929).
104.
The t test appeared in three dissertations between 1935 and 1937.
105.
A significance test on adjusted variances was coded as an ANCOVA, although Minnesota dissertators regarded such a test as involving an ANOVA as well.
106.
Included in the count were six tests of ANCOVA's assumption of homogeneity of group regression coefficients, which is analogous to testing the interaction in a two-factor ANOVA. Excluded from the count were four applications of the Johnson-Neyman technique solely for the purpose of statistical control.
107.
See e.g. Stuart D. Fink, “Instructional organization in the public elementary schools of American cities”, Ph.D. thesis, University of Minnesota, 1945, 228 n; and Victor C. Smith, “Factors affecting learning of general science”, Ph.D. thesis, University of Minnesota, 1943, 55.
108.
In one application of the Johnson-Neyman technique, the assumption of homogeneity of variance was tested using the critical ratio test, and in four applications of multiple correlation, ANOVA or the t test was used for this purpose.
109.
The Johnson-Neyman technique was the primary technique of analysis in 7 of 15 experiments. Three experiments relied primarily on ANCOVA, two each on ANOVA and the t test, and one on the critical ratio test.
110.
Loc. cit. (ref. 17).
111.
Educators' struggle to bring unity to methodological diversity is evident in the heightened interest in classificatory schemas in the 1930s, which reflected an anxiety about the scientific status of education. Although some educators distinguished a statistical method, others regarded statistics not as a method of research but as a method of data analysis unrestricted in its scope. See e.g. A. S. Barr, “A symposium on the classification of educational research”, Journal of educational research, xxiii (1931), 353–82, and xxiv (1931), 1–22; Carter V. Good, A. S. Barr, and Douglas E. Scates, The methodology of educational research (New York, 1936), chap. 5; and Truman L. Kelley, Scientific method: Its function in research and education (Columbus, OH, 1929).
112.
I identified 16 textbooks of educational statistics published between 1946 and 1960 that were intended for use in an intermediate or advanced course, all of which discussed Fisherian statistics. See Francis G. Cornell, The essentials of educational statistics (New York, 1956); Wilfred J. Dixon and Frank J. Massey, Introduction to statistical analysis (New York, 1951); idem, Introduction to statistical analysis, 2nd edn (New York, 1957); Allen L. Edwards, Statistical analysis for students in psychology and education (New York, 1946); idem, Statistical methods for the behavioral sciences (New York, 1954); George A. Ferguson, Statistical analysis in psychology and education (New York, 1959); Henry E. Garrett, Statistics in psychology and education, 3rd edn (New York, 1947); idem, Statistics in psychology and education, 4th edn (New York, 1953); J. P. Guilford, Fundamental statistics in psychology and education, 2nd edn (New York, 1950); idem, Fundamental statistics in psychology and education, 3rd edn (New York, 1956); Johnson, op. cit. (ref. 100); Palmer O. Johnson and Robert W. B. Jackson, Modern statistical methods: Descriptive and inductive (Chicago, 1959); Lindquist, op. cit. (ref. 54); Merle W. Tate, Statistics in education (New York, 1955); Helen M. Walker and Joseph Lev, Statistical inference (New York, 1953); and James E. Wert, Charles O. Neidt, and J. Stanley Ahmann, Statistical methods in educational and psychological research (New York, 1954).
113.
Typologies of interdisciplinarity, which include tool-borrowing, are based on the degree of interaction between disciplines abstracted from their larger intellectual context. For a review, see Julie T. Klein, Interdisciplinarity: History, theory, and practice (Detroit, MI, 1990), 24, 41–2, 64. For case studies of tool-borrowing, see e.g. Gigerenzer and Murray, op. cit. (ref. 10); Stephen H. Kellert, Borrowed knowledge: Chaos theory and the challenge of learning across disciplines (Chicago, 2008); and Philip Mirowski, More heat than light: Economics as social physics, physics as nature's economics (Cambridge, 1989).
114.
James Clifford, Routes: Travel and translation in the twentieth century (Cambridge, MA, 1997), 3.
115.
Harwood, Styles of scientific thought (ref. 2), 9–10; and idem, “Are there national styles of scientific thought?” (ref. 2), 33.
116.
Rucci and Tweney, op. cit. (ref. 8), 172, 178–9.
117.
For the titles of textbooks included in the analysis, see loc. cit. (ref. 112).
118.
Lindquist, op. cit. (ref. 54), chap. 8. In the other three worked examples of a randomized block design, the blocks comprised individuals or matched pairs, with the result that the problem of sampling intact groups was not addressed. See Cornell, op. cit. (ref. 112), 269–76; Edwards, Statistical analysis (ref. 112), 220–5; and Johnson, op. cit. (ref. 100), 287–92. Four other textbooks either defined a randomized block design or supplied an example without instructions, thereby according it little practical importance.
See e.g. Edwards, Statistical analysis (ref. 112), 284–6; and Ferguson, op. cit. (ref. 112), 135.
121.
Several textbook writers, however, justified the use of Fisherian techniques on the basis of their general advantages and/or the special needs of educational research. See Garrett, Statistics in psychology and education, 4th edn (ref. 112), 289; Johnson and Jackson, op. cit. (ref. 112), 424–5; Lindquist, op. cit. (ref. 54), 172; Walker and Lev, op. cit. (ref. 112), 382–4; and Wert, Neidt, and Ahmann, op. cit. (ref. 112), 101, 126.
122.
Cf. A. D. Lovie, “On the early history of ANOVA in the analysis of repeated measures designs in psychology”, British journal of mathematical and statistical psychology, xxxiv (1981), 1–15.
123.
H. M. Collins, Changing order: Replication and induction in scientific practice (Chicago, 1992), chap. 4; and idem, Gravity's shadow: The search for gravitational waves (Chicago, 2004), chap. 10.
124.
During the decades 1940–60, when inferential statistics were institutionalized in the USA, the number of critical articles published in various fields was roughly a third the number published in the 1980s and a sixth the number published in the 1990s. See David R. Anderson, Kenneth P. Burnham, and William L. Thompson, “Null hypothesis testing: Problems, prevalence, and an alternative”, Journal of wildlife management, lxiv (2000), 912–23, p. 913. Moreover, outside the sociological literature, few articles explicitly defending the tests in response to criticism were published prior to 1970. See Denton E. Morrison and Ramon E. Henkel (eds), The significance test controversy: A reader (Chicago, 1970). The avoidance of controversy is taken to an extreme in the ‘hybrid logic’ of statistical inference, whereby Fisher and Neyman-Pearson approaches to hypothesis testing are amalgamated in a way that neither school would accept. See Gerd Gigerenzer, “The superego, the ego, and the id in statistical reasoning”, in Gideon Keren and Charles Lewis (eds), A handbook for data analysis in the behavioral sciences: Methodological issues (Hillsdale, NJ, 1993), 311–39; Gigerenzer and Murray, op. cit. (ref. 10), 21–2; and Gigerenzer et al., op. cit. (ref. 9), 106–9; see also Huberty, op. cit. (ref. 9).
125.
See e.g. Liliane Hilaire-Pérez, “Technology as a public culture in the eighteenth century: The artisans' legacy”, History of science, xlv (2007), 135–53; Ursula Klein, “Apothecary shops, laboratories and chemical manufacture in eighteenth-century Germany”, in Lissa Roberts, Simon Schaffer, and Peter Dear (eds), The mindful hand: Inquiry and invention from the late Renaissance to early industrialization (Amsterdam, 2007), 247–76; idem, “The laboratory challenge: Some revisions of the standard view of early modern experimentation”, Isis, xcix (2008), 2008–82; Eda Kranakis, “Hybrid careers and the interaction of science and technology”, in Peter Kroes and Martijn Bakker (eds), Technological development and science in the industrial age (Dordrecht, 1992), 177–204; and Joel Mokyr, The gifts of Athena: Historical origins of the knowledge economy (Princeton, NJ, 2002), 53–4, 65–6, 82–3, 90–1.