
Abstract

When determining how many items to include on a criterion-referenced test, practitioners must resolve various nonstatistical issues before a particular solution can be applied. A fundamental problem is deciding which of three true scores should be used. The first is based on the probability that an examinee is correct on a "typical" test item. The second is the probability of having acquired a typical skill among a domain of skills, and the third is based on latent trait models. Once a particular true score is settled upon, there are several perspectives that might be used to determine test length. The paper reviews and critiques these solutions. Some new results are described that apply when latent structure models are used to estimate an examinee's true score.



Determining the Length of a Criterion-Referenced Test
