This article provides an overview of item response theory (IRT) models and how they can be appropriately applied to the measurement of patient-reported outcomes (PROs). Specifically, it discusses four topics: (a) the basics of IRT, (b) types of IRT models, (c) how IRT models have been applied to date, and (d) new directions in applying IRT to PRO measurement.