The Maryland School Performance Assessment Program: Performance Assessment with Psychometric Quality Suitable for High Stakes Usage

Abstract

The Maryland School Performance Assessment Program (MSPAP) is an innovative performance-based testing program covering reading, writing, language usage, mathematics, science, and social studies. MSPAP is administered annually on a census basis to students in Grades 3, 5, and 8, and the results are used for high-stakes, yearly evaluations of school performance and for tracking school improvement. The present article describes the program design and highlights its psychometric characteristics with respect to scaling, equating, standard setting, score accuracy, and validity.
