
Abstract

This Monte Carlo study compares the ability of the parametric bootstrap version of DIMTEST with three goodness-of-fit tests calculated from a fitted NOHARM model to detect violations of the assumption of unidimensionality in testing data. The effectiveness of the procedures was evaluated for different numbers of items, numbers of examinees, correlations between underlying ability dimensions, skewness of underlying ability distributions, and the presence or absence of a guessing parameter. In the absence of guessing, DIMTEST and the NOHARM-based statistics had similar power, with the χ² statistic having a very low Type I error rate. In the presence of guessing, however, two of the NOHARM-based statistics had unacceptably high Type I error rates, while the third performed similarly to DIMTEST. Given this inflated error rate, the study compares the empirical powers after adjusting for the discrepancy in Type I error rates.
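The abstract does not say how the power comparison was adjusted for the differing Type I error rates. One common approach in Monte Carlo work, shown here only as a minimal sketch and not as the authors' procedure, replaces the nominal critical value with the empirical (1 − α) quantile of the statistic under the simulated null condition. The function name `empirical_rates` and its arguments are hypothetical.

```python
import numpy as np


def empirical_rates(null_stats, alt_stats, nominal_crit, alpha=0.05):
    """Return (type1_rate, raw_power, adjusted_power) from simulated statistics.

    null_stats   : statistics from replications generated under unidimensionality
    alt_stats    : statistics from replications generated under multidimensionality
    nominal_crit : critical value taken from the reference distribution
    alpha        : nominal significance level

    Assumption: larger values of the statistic are evidence against
    unidimensionality (an upper-tailed test).
    """
    null_stats = np.asarray(null_stats, dtype=float)
    alt_stats = np.asarray(alt_stats, dtype=float)

    # Empirical Type I error: rejection rate when the null is true.
    type1_rate = float(np.mean(null_stats > nominal_crit))

    # Unadjusted power: rejection rate under the alternative at the nominal cutoff.
    raw_power = float(np.mean(alt_stats > nominal_crit))

    # Adjusted cutoff: the (1 - alpha) quantile of the null statistics, so the
    # test rejects about alpha of the null replications by construction.
    adjusted_crit = float(np.quantile(null_stats, 1.0 - alpha))
    adjusted_power = float(np.mean(alt_stats > adjusted_crit))

    return type1_rate, raw_power, adjusted_power
```

When a statistic's Type I error rate is inflated, the adjusted cutoff sits above the nominal one, so the adjusted power is no larger than the raw power. This puts procedures with different error rates on a more comparable footing.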

Keywords

Index terms: DIMTEST, NOHARM, item response theory, dimensionality, multidimensionality, unidimensionality



Performance of DIMTEST- and NOHARM-Based Statistics for Testing Unidimensionality
