The Impact of Varied Discrimination Parameters on Mixed-Format Item Response Theory Model Selection

Abstract

Whittaker, Chang, and Dodd compared the performance of model selection criteria when selecting among mixed-format IRT models and found that the criteria did not perform adequately when selecting the more parameterized models. M. S. Johnson suggested that these problems may stem from the low variance of the discrimination parameters used to generate the data. This simulation study replicated the Whittaker et al. study while incorporating more variability in the discrimination parameter estimates used to generate the data. The results indicated that the majority of the criteria performed more accurately when selecting the more parameterized models. Differences in the performance of the criteria under certain conditions and implications for model selection practice are discussed.
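As a rough illustration of what "selecting among models with a criterion" involves, the sketch below computes several widely used information criteria from a fitted model's log-likelihood and picks the model with the smallest value under each. Which criteria the study actually compared is not stated in the abstract; the set here (AIC, AICc, BIC, CAIC, HQ) is an assumption based on the reference list, and the function names, log-likelihoods, and parameter counts are hypothetical.

```python
import math


def information_criteria(log_lik, k, n):
    """Return common information criteria for one fitted model.

    log_lik: maximized log-likelihood
    k: number of estimated parameters
    n: sample size (number of examinees)
    """
    deviance = -2.0 * log_lik
    aic = deviance + 2 * k
    return {
        "AIC": aic,
        # Small-sample correction added to AIC
        "AICc": aic + (2 * k * (k + 1)) / (n - k - 1),
        "BIC": deviance + k * math.log(n),
        "CAIC": deviance + k * (math.log(n) + 1),
        "HQ": deviance + 2 * k * math.log(math.log(n)),
    }


def select_model(fits, n):
    """Pick, for each criterion, the model with the smallest value.

    fits: dict mapping model name -> (log_lik, k)
    """
    values = {name: information_criteria(ll, k, n) for name, (ll, k) in fits.items()}
    criteria = next(iter(values.values())).keys()
    return {c: min(values, key=lambda name: values[name][c]) for c in criteria}


# Hypothetical fits: a less and a more parameterized model on the same data.
example_fits = {"simple": (-1000.0, 10), "complex": (-950.0, 30)}
print(select_model(example_fits, n=500))
```

Because the penalty terms differ, the criteria can disagree on the same data: a criterion with a heavier penalty per parameter needs a larger gain in log-likelihood before it prefers the more parameterized model.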

Keywords

IRT model selection, mixed-format IRT, discrimination variability, model selection criteria

References

Akaike (1973). Information theory and an extension of the maximum likelihood principle. In Petrov B. N., Csaki (Eds.), Second international symposium on information theory. Budapest, Hungary: Akademiai Kiado.

Birnbaum (1968). Some latent trait models and their use in inferring an examinee’s ability. In Lord, Novick (Eds.), Statistical theories of mental test scores (pp. 395-479). Reading, MA: Addison-Wesley.

Bozdogan (1987). Model selection and Akaike’s information criterion (AIC): The general theory and its analytical extensions. Psychometrika, 52, 345-370.

de Ayala R. J. (2009). The theory and practice of item response theory. New York, NY: Guilford Press.

Hannan E. J., Quinn B. G. (1979). The determination of the order of an autoregression. Journal of the Royal Statistical Society, Series B, 41, 190-195.

Hurvich C. M., Tsai C. L. (1989). Regression and time series model selection in small samples. Biometrika, 76, 297-307.

Kang, Cohen A. S. (2007). IRT model selection methods for dichotomous items. Applied Psychological Measurement, 31, 331-358.

Kang, Cohen A. S., Sung H.-J. (2009). Model selection indices for polytomous items. Applied Psychological Measurement, 33, 499-518.

Masters G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.

Muraki (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176.

Muraki, Bock R. D. (2003). PARSCALE 4: IRT item analysis and test scoring for rating scale data [Computer program]. Chicago, IL: Scientific Software.

Pastor D. A., Dodd B. G., Chang H. H. (2002). A comparison of item selection techniques and exposure control mechanisms in CATs using the generalized partial credit model. Applied Psychological Measurement, 26, 147-163.

Rasch (1960). Probabilistic models for some intelligence and attainment tests. Chicago, IL: University of Chicago Press.

SAS Institute Inc. (2007). SAS (Version 9.2) [Computer software]. Cary, NC: SAS Institute.

Schwarz (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464.

Whittaker T. A., Chang, Dodd B. G. (2012). The performance of IRT model selection methods with mixed-format tests. Applied Psychological Measurement, 36, 159-180.

Whittaker T. A., Fitzpatrick S. J., Williams N. J., Dodd B. G. (2003). IRTGEN: A SAS macro program to generate known trait scores and item responses for commonly used item response theory models. Applied Psychological Measurement, 27, 299-300.

Wise S. L. (2006). An investigation of the differential effort received by items on a low-stakes computer-based test. Applied Measurement in Education, 19, 95-114.

Yen W. M. (1981). Using simulation results to choose a latent trait model. Applied Psychological Measurement, 5, 245-262.