Several techniques increase the precision of subscores by borrowing information from other parts of the test. These techniques have been criticized on validity grounds in several recent publications. In this note, the authors question the argument advanced in those publications and point to both inherent limits of the validity argument and empirical issues worth examining.
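To make the borrowing concrete (a minimal sketch of the Kelley-type augmentation discussed by Haberman, 2008a, and Wainer et al., 2001, with notation chosen here for illustration rather than taken from the note itself): the reported subscore is a regression-based compromise between the observed subscore and the rest of the test,

\[
\hat{\tau}_s = \bar{x}_s + \beta_s\,(x_s - \bar{x}_s) + \beta_x\,(x - \bar{x}),
\]

where \(x_s\) is an examinee's observed subscore, \(x\) is the total score, and the weights \(\beta_s, \beta_x\) are chosen to minimize the mean squared error in predicting the true subscore \(\tau_s\). Setting \(\beta_x = 0\) recovers Kelley's classical regressed estimate based on the subscore alone, \(\hat{\tau}_s = \bar{x}_s + \rho_{ss'}(x_s - \bar{x}_s)\), with \(\rho_{ss'}\) the subscore reliability; the validity criticisms at issue concern precisely the case \(\beta_x \neq 0\), in which material from other parts of the test enters the reported subscore.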
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
de la Torre, J., & Patz, R. J. (2005). Making the most of what we have: A practical application of multidimensional IRT in test scoring. Journal of Educational and Behavioral Statistics, 30, 295-311.
Dwyer, A., Boughton, K. A., Yao, L., Steffen, M., & Lewis, D. (2006). A comparison of subscale score augmentation methods using empirical data. Paper presented at the annual meeting of the National Council on Measurement in Education, San Francisco, CA.
Haberman, S. J. (2008a). When can subscores have value? Journal of Educational and Behavioral Statistics, 33, 204-229.
Haberman, S. J. (2008b). Subscores and validity (ETS Research Report No. RR-08-64). Princeton, NJ: Educational Testing Service.
Haberman, S. J., & Sinharay, S. (2010). Reporting of subscores using multidimensional item response theory. Psychometrika, 75, 209-227.
Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 18-64). Westport, CT: Praeger.
Luecht, R. M. (2003, April). Applications of multidimensional diagnostic scoring for certification and licensure tests. Paper presented at the meeting of the National Council on Measurement in Education, Chicago, IL.
Lyrén, P. (2009). Reporting subscores from college admission tests. Practical Assessment, Research & Evaluation, 14, 1-10.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13-103). Washington, DC: National Council on Measurement in Education and American Council on Education.
National Research Council. (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: National Academies Press.
Puhan, G., Sinharay, S., Haberman, S. J., & Larkin, K. (2010). Comparison of subscores based on classical test theory. Applied Measurement in Education, 23, 1-20.
Reckase, M. D. (1997). The past and future of multidimensional item response theory. Applied Psychological Measurement, 21, 25-36.
Sinharay, S. (2010). How often do subscores have added value? Results from operational and simulated data. Journal of Educational Measurement, 47, 150-174.
Skorupski, W. P., & Carvajal, J. (2010). A comparison of approaches for improving the reliability of objective level scores. Educational and Psychological Measurement, 70, 357-375.
Stone, C. A., Ye, F., Zhu, X., & Lane, S. (2010). Providing subscale scores for diagnostic information: A case study when the test is essentially unidimensional. Applied Measurement in Education, 23, 63-86.
Wainer, H., Sheehan, K., & Wang, X. (2000). Some paths toward making Praxis scores more useful. Journal of Educational Measurement, 37, 113-140.
Wainer, H., Vevea, J. L., Camacho, F., Reeve, B. B., Rosa, K., Nelson, L., Swygert, K. A., . . . Thissen, D. (2001). Augmented scores—"Borrowing strength" to compute scores based on small numbers of items. In D. Thissen & H. Wainer (Eds.), Test scoring (pp. 343-387). Mahwah, NJ: Lawrence Erlbaum.
Yao, L., & Boughton, K. A. (2007). A multidimensional item response modeling approach for improving subscale proficiency estimation and classification. Applied Psychological Measurement, 31, 83-105.
Yen, W. M. (1987, June). A Bayesian/IRT index of objective performance. Paper presented at the annual meeting of the Psychometric Society, Montreal, Quebec, Canada.