Abstract
The proposed theory provides a basis for both measuring and correcting rater stringency error in some grossly incomplete rating data matrices. The theoretical model fit ratings of student clinical performance made by faculty and resident physicians (n = 47, 31, and 29) in each of three junior-year medical student cohorts (n = 29, 30, and 35) better than alternative models did (R = .92, .85, .82; joint p < .000001). In these data, about 35% of the variation was attributable to stringency and about 40% to ability. Three-month test-retest reliability for rater stringencies was .16 < r < .29 (joint p < .04). Cross-validation supported the proposed model (r = .61) over the conventional alternative (r = .41; z = 2.62, p < .004). Both reliability and convergent validity of the ability construct were .20 greater for one corrected rating than for one observed (uncorrected) rating.
