Recently, there has been increased interest in tests of measurement equivalence/invariance (ME/I). This study uses simulated data with known properties to assess the appropriateness of confirmatory factor analysis and item response theory methods of assessing ME/I, and the similarities and differences between them. Results indicate that although neither approach is without flaw, the item response theory–based approach seems to be better suited for some types of ME/I analyses.