References

Agresti, A. (1990). Categorical data analysis. New York: Wiley.

Allen, N.L., & Holland, P.W. (1993). A model for missing information about the group membership of examinees in DIF studies. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 241-252). Hillsdale NJ: Erlbaum.

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.

Bock, R.D. (1972). Estimating item parameters and latent ability when the responses are scored in two or more nominal categories. Psychometrika, 37, 29-51.

Bock, R.D. (1993). Different DIFs: Comment on the papers read by Neil Dorans and David Thissen. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 115-122). Hillsdale NJ: Erlbaum.

Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: An application of the EM algorithm. Psychometrika, 46, 443-459.

Bock, R.D., & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179-197.

Bock, R.D., Muraki, E., & Pfeiffenberger, W. (1988). Item pool maintenance in the presence of item parameter drift. Journal of Educational Measurement, 25, 275-285.

Chang, H.-H., & Mazzeo, J. (1994). The unique correspondence of the item response function and the item category response functions in polytomously scored item response models. Psychometrika, 59, 391-404.

Chang, H., Mazzeo, J., & Roussos, L.A. (1995). Detecting DIF for polytomously scored items: An adaptation of the SIBTEST procedure (Research Rep. No. 95-5). Princeton NJ: Educational Testing Service.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale NJ: Erlbaum.

Donoghue, J.R., Holland, P.W., & Thayer, D.T. (1993). A Monte Carlo study of factors that affect the Mantel-Haenszel and standardization measures of differential item functioning. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 137-166). Hillsdale NJ: Erlbaum.

Dorans, N.J. (1991, November). Implications of choice of metric for DIF effect size on decisions about DIF. Paper presented at the International Symposium on Modern Theories in Measurement: Problems and Issues, Montebello, Quebec, Canada.

Dorans, N.J., & Holland, P.W. (1993). DIF detection and description: Mantel-Haenszel and standardization. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 35-66). Hillsdale NJ: Erlbaum.

Dorans, N.J., & Kulick, E. (1983). Assessing unexpected differential item performance of female candidates on SAT and TSWE forms administered in December 1977: An application of the standardization approach (RR-83-9). Princeton NJ: Educational Testing Service.

Dorans, N.J., & Kulick, E. (1986). Demonstrating the utility of the standardization approach to assessing unexpected differential item performance on the Scholastic Aptitude Test. Journal of Educational Measurement, 23, 355-368.

Dorans, N.J., & Schmitt, A.P. (1993). Constructed response and differential item functioning: A pragmatic perspective. In R. E. Bennett & W. C. Ward (Eds.), Construction versus choice in cognitive measurement (pp. 135-165). Hillsdale NJ: Erlbaum.

Grima, A. (1993, April). Extending the Mantel-Haenszel DIF procedure to polytomously scored items. Paper presented at the annual meeting of the National Council on Measurement in Education, Atlanta GA.

Holland, P.W., & Thayer, D.T. (1985). An alternative definition of the ETS delta scale of item difficulty (RR-85-43). Princeton NJ: Educational Testing Service.

Holland, P.W., & Thayer, D.T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer & H. Braun (Eds.), Test validity (pp. 129-145). Hillsdale NJ: Erlbaum.

Holland, P. W., & Wainer, H. (Eds.). (1993). Differential item functioning. Hillsdale NJ: Erlbaum.

Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika, 54, 681-697.

Kelley, T.L. (1927). The interpretation of educational measurements. New York: World Book.

Longford, N.T., Holland, P.W., & Thayer, D.T. (1993). Stability of the MH D-DIF statistics across populations. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 171-196). Hillsdale NJ: Erlbaum.

Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale NJ: Erlbaum.

Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading MA: Addison-Wesley.

Mantel, N. (1963). Chi-square tests with one degree of freedom: Extensions of the Mantel-Haenszel procedure. Journal of the American Statistical Association, 58, 690-700.

Mantel, N., & Haenszel, W.M. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719-748.

Masters, G.N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.

McLaughlin, M.E., & Drasgow, F. (1987). Lord's chi-square test of item bias with estimated and with known ability parameters. Applied Psychological Measurement, 11, 161-173.

Miller, T.R., & Spray, J.A. (1993). Logistic discriminant function analysis for DIF identification of polytomously scored items. Journal of Educational Measurement, 30, 107-122.

Millsap, R.E., & Everson, H.T. (1993). Methodology review: Statistical approaches for assessing measurement bias. Applied Psychological Measurement, 17, 297-334.

Mislevy, R.J. (1993). Foundations of a new test theory. In N. Frederiksen, R. J. Mislevy, & I. I. Bejar (Eds.), Test theory for a new generation of tests (pp. 19-39). Hillsdale NJ: Erlbaum.

Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176.

Muraki, E. (1993, April). Implementing item parameter drift and bias in polytomous item response models. Paper presented at the annual meeting of the National Council on Measurement in Education, Atlanta GA.

Muraki, E., & Engelhard, G. (1989, April). Examining differential item functioning with BIMAIN. Paper presented at the annual meeting of the American Educational Research Association, San Francisco CA.

Muthén, B., & Lehman, J. (1985). Multiple group IRT modeling: Applications to item bias analysis. Journal of Educational Statistics, 10, 133-142.

Ramsay, J.O. (1991). Kernel smoothing approaches to nonparametric item characteristic curve problems. Psychometrika, 56, 611-630.

Rogers, H.J., & Swaminathan, H. (1993, April). Differential item functioning procedures for non-dichotomous responses. Paper presented at the annual meeting of the National Council on Measurement in Education, Atlanta GA.

Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph, No. 17.

Scheuneman, J.D. (1975, April). A new method of assessing bias in test items. Paper presented at the annual meeting of the American Educational Research Association, Washington DC. (ERIC Document Reproduction Service No. ED 106 359)

Scheuneman, J.D., & Bleistein, C.A. (1989). A consumer's guide to statistics for identifying differential item functioning. Applied Measurement in Education, 2, 255-275.

Shealy, R.T., & Stout, W.F. (1993a). An item response model for test bias and differential test functioning. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 197-239). Hillsdale NJ: Erlbaum.

Shealy, R.T., & Stout, W.F. (1993b). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58, 159-194.

Swaminathan, H., & Rogers, H.J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361-370.

Thissen, D., & Steinberg, L. (1984). A response model for multiple choice items. Psychometrika, 49, 501-519.

Thissen, D., & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567-577.

Thissen, D., Steinberg, L., & Wainer, H. (1988). Use of item response theory in the study of group differences in trace lines. In H. Wainer & H. Braun (Eds.), Test validity (pp. 147-169). Hillsdale NJ: Erlbaum.

Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67-113). Hillsdale NJ: Erlbaum.

Thissen, D., & Wainer, H. (1982). Some standard errors in item response theory. Psychometrika, 47, 397-412.

Wainer, H. (1993). Model-based standardized measurement of an item's differential impact. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 123-135). Hillsdale NJ: Erlbaum.

Wainer, H., Sireci, S.G., & Thissen, D. (1991). Differential testlet functioning: Definitions and detection. Journal of Educational Measurement, 28, 197-219.

Welch, C., & Hoover, H.D. (1993). Procedures for extending item bias techniques to polytomously scored items. Applied Measurement in Education, 6, 1-19.

Wilson, A.W., Spray, J.A., & Miller, T.R. (1993, April). Logistic regression and its use in detecting nonuniform differential item functioning. Paper presented at the annual meeting of the National Council on Measurement in Education, Atlanta GA.

Zwick, R., Donoghue, J.R., & Grima, A. (1993a). Assessing differential item functioning in performance tasks (RR-93-14). Princeton NJ: Educational Testing Service.

Zwick, R., Donoghue, J.R., & Grima, A. (1993b). Assessment of differential item functioning for performance tasks. Journal of Educational Measurement, 30, 233-251.

DIF Assessment for Polytomously Scored Items: A Framework for Classification and Evaluation
