A Bayesian alternative to interpretations based on classical reliability theory is presented. The central issue in reliability is defined as the extent to which a test score can predict itself, rather than a hypothetical true score. Procedures are detailed for calculation of a posterior score and credible interval with joint consideration of item sample and occasion error.