
Abstract

For a specific achievement test item and a randomly selected examinee, let p be the probability of correctly determining whether the examinee knows the correct response. Various techniques have been proposed for estimating p. This paper describes and illustrates how results in the engineering literature on "k out of n system reliability" can be used to study and characterize tests based on the estimated values of p. In particular, we can empirically determine the minimum number of distractors required for multiple-choice tests. If we estimate p with an answer-until-correct scoring procedure, we can also determine the minimum number of examinees needed to be reasonably certain about whether y is less than or greater than some predetermined constant, where $y = \sum_{i=1}^{n} p_i$ and $p_i$ is the value of p for the ith item on an n-item test. In other words, we can determine whether the expected number of correct decisions on an n-item test is reasonably large.
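To make the quantities in the abstract concrete, the following is a minimal sketch rather than the paper's own procedure. It assumes that decisions on different items are independent, which the abstract does not state, and it treats each item as a "component" that works when the decision about the examinee is correct. The function names and the example values of p_i are hypothetical. Under those assumptions, y is the sum of the p_i, and the k-out-of-n reliability is the probability that at least k of the n decisions are correct.

```python
from typing import Sequence


def expected_correct_decisions(p: Sequence[float]) -> float:
    """Return y = sum of p_i, the expected number of correct decisions."""
    return sum(p)


def k_out_of_n_reliability(p: Sequence[float], k: int) -> float:
    """Return P(at least k of the n decisions are correct).

    Assumes independent items with possibly different p_i, so the count of
    correct decisions is a sum of independent Bernoulli variables.
    """
    n = len(p)
    # dist[j] = probability that exactly j decisions are correct
    # among the items processed so far.
    dist = [1.0] + [0.0] * n
    for p_i in p:
        # Update from high j to low j so dist[j - 1] is still the old value.
        for j in range(n, 0, -1):
            dist[j] = dist[j] * (1.0 - p_i) + dist[j - 1] * p_i
        dist[0] *= 1.0 - p_i
    return sum(dist[max(k, 0):])


if __name__ == "__main__":
    item_probs = [0.9, 0.8, 0.7]  # hypothetical estimates of p_i
    print(expected_correct_decisions(item_probs))
    print(k_out_of_n_reliability(item_probs, k=2))
```

The dynamic-programming update avoids enumerating all 2^n outcome patterns, so the sketch stays practical for tests with many items.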



Using Results on K Out of N System Reliability to Study and Characterize Tests
