Several methods have been proposed for estimating the reliability of a criterion-referenced test. This paper describes and compares seven procedures that can be applied to the more general case of proficiency tests scored with latent structure models. Results suggest that the predictive estimate is the most accurate of the seven procedures.