American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
2.
Castellano, K. E., & Ho, A. D. (2015). Practical differences among aggregate-level conditional status metrics: From median student growth percentiles to value-added models. Journal of Educational and Behavioral Statistics, 40, 35–68.
3.
Haberman, S. J. (2008). When can subscores have value? Journal of Educational and Behavioral Statistics, 33, 204–229.
4.
Luecht, R. M. (2012). An introduction to assessment engineering for automatic item generation. In Gierl, M., & Haladyna, T. (Eds.), Automatic item generation (pp. 59–101). New York, NY: Taylor-Francis/Routledge.
5.
Luecht, R. M. (2013). Assessment engineering task model maps, task models and templates as a new way to develop and implement test specifications. Journal of Applied Testing Technology, 14, 1–38.
6.
Maydeu-Olivares, A. (2015). Evaluating the fit of IRT models. In Reise, S. P., & Revicki, D. A. (Eds.), Handbook of item response theory modeling: Applications to typical performance assessment (pp. 111–127). New York, NY: Taylor & Francis (Routledge).
7.
Roussos, L., Stout, W. F., & Marden, J. (1998). Using new proximity measures with hierarchical cluster analysis to detect multidimensionality. Journal of Educational Measurement, 35, 1–30.
8.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
9.
Stout, W. F. (1987). A nonparametric approach to assessing latent trait dimensionality. Psychometrika, 52, 589–617.
10.
Zhang, J., & Stout, W. F. (1999). The theoretical DETECT index of dimensionality and its application to approximate simple structure. Psychometrika, 64, 213–249.