Mplus is a latent variable modeling program that has become an increasingly popular choice for fitting complex item response theory models. In this note, we demonstrate that the two-parameter logistic testlet model can be estimated in Mplus as a constrained bifactor model, using three estimators that together cover limited- and full-information estimation methods.
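To make the constraint concrete, the relation between the two models can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the note itself: $a_i$ and $b_i$ denote item discrimination and difficulty, $\theta_j$ the general ability of person $j$, $\gamma_{jd(i)}$ the person-specific effect of the testlet $d(i)$ containing item $i$, and $c_i$ the bifactor intercept. Sign conventions for the testlet effect and the intercept vary across sources and software.

```latex
% 2PL testlet model (one common parameterization; notation assumed)
\operatorname{logit} P(y_{ij} = 1)
  = a_i \left( \theta_j - b_i + \gamma_{jd(i)} \right),
\qquad \theta_j \sim N(0, 1), \quad \gamma_{jd} \sim N(0, \sigma_d^2)

% Bifactor model with one general and one specific slope per item
\operatorname{logit} P(y_{ij} = 1)
  = a_{i0}\,\theta_j + a_{id}\,\gamma_{jd(i)} + c_i

% The testlet model is the bifactor model under the constraints
a_{id} = a_{i0} = a_i \ \text{for every item } i \text{ in testlet } d,
\qquad c_i = -a_i b_i,
\qquad \operatorname{Var}(\gamma_{jd}) = \sigma_d^2 \ \text{freely estimated}
```

Under this sketch, each item's loading on its testlet factor is held equal to its loading on the general factor, and the strength of the testlet dependency is carried entirely by the variance $\sigma_d^2$. An equivalent statement with standardized testlet factors is that the specific slopes are proportional to the general slopes within a testlet, $a_{id} = a_i \sigma_d$.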