An IRT-Based Two-Wave Model for Studying Short-Term Stability in Personality Measurement

Abstract

This article describes an item response theory-based structural equation model that allows the short-term stability and the magnitude of retest effects to be assessed for some types of personality traits. The relations between the model estimates and the usual procedures for assessing invariance in IRT are described. An empirical application of the model is given, the substantive implications of the results are discussed, and suggestions for further research and methodological procedures are presented.

