The Precision of Simulation Study Results

Abstract

The number of replications in Monte Carlo simulation studies can be modified to improve the precision of parameter estimates. Given the speed and power of microcomputers, it is not necessary to hold the number of replications to past levels. Reasons why increasing the number of replications is not necessary for satisfactory levels of precision are discussed. Guidelines for determining how much precision is needed are offered in the context of an error tolerance analysis.
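The link between the number of replications and precision can be illustrated with a minimal sketch. It is not the article's own procedure, which the abstract does not spell out; it assumes the quantity of interest is the mean of independent replication estimates, so that its standard error is the standard deviation across replications divided by the square root of the number of replications. The function names, the normal-approximation multiplier `z`, and the simulated estimates (mean 1.0, standard deviation 0.2) are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def monte_carlo_se(estimates):
    """Standard error of the mean of the replication estimates: sd / sqrt(R)."""
    return statistics.stdev(estimates) / math.sqrt(len(estimates))


def replications_needed(sd, tolerance, z=1.96):
    """Smallest R for which z * sd / sqrt(R) is no larger than the tolerance.

    Assumes a normal approximation; z = 1.96 corresponds to a 95% interval.
    """
    return math.ceil((z * sd / tolerance) ** 2)


def simulate(n_reps, seed=0):
    """Hypothetical stand-in for a simulation study.

    Each replication returns one parameter estimate, drawn here from a
    normal distribution with mean 1.0 and standard deviation 0.2.
    """
    rng = random.Random(seed)
    return [rng.gauss(1.0, 0.2) for _ in range(n_reps)]


if __name__ == "__main__":
    # The standard error shrinks roughly with the square root of R.
    for n_reps in (25, 100, 400):
        print(n_reps, round(monte_carlo_se(simulate(n_reps)), 4))

    # Replications needed to keep the 95% half-width within 0.01.
    print(replications_needed(sd=0.2, tolerance=0.01))
```

Under these assumptions, a between-replication standard deviation of 0.2 and a tolerance of 0.01 call for 1537 replications, while loosening the tolerance to 0.05 cuts the requirement to 62. This is the sense in which the precision actually needed, rather than available computing power, can set the number of replications.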

