Abstract
Rating procedures typically require several raters to rate individuals on several items, so that the consistency of two facets (raters and items), each of which has more than two levels, needs to be examined. Classical test theory is inadequate to describe reliability in this context because the presence of errors of severity, leniency and central tendency violates the assumption of parallelism. Cronbach's generalizability theory is eminently suited to estimating the reliability of ratings because it does not require this assumption. Apart from yielding intraclass correlations as generalizability coefficients, it provides for the separate estimation of the variability of raters and items and presents explicit guidelines for deciding whether generalizability should be raised by increasing the number of raters or the number of items. These theoretical implications are demonstrated by means of a numerical example published in the South African literature.
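
The choice between adding raters and adding items can be illustrated with a minimal sketch. It assumes a fully crossed persons x raters x items design with random facets and relative (rank-order) decisions, which the abstract does not specify. The function name and the variance components are hypothetical values chosen for illustration and are not taken from the published example.

```python
def g_coefficient(var_p, var_pr, var_pi, var_pri_e, n_raters, n_items):
    """Relative generalizability coefficient for a crossed p x r x i design.

    var_p      -- variance component for persons (the object of measurement)
    var_pr     -- person-by-rater interaction component
    var_pi     -- person-by-item interaction component
    var_pri_e  -- three-way interaction confounded with residual error
    """
    if n_raters < 1 or n_items < 1:
        raise ValueError("n_raters and n_items must be at least 1")
    # Each error component is divided by the number of conditions averaged over.
    relative_error = (
        var_pr / n_raters
        + var_pi / n_items
        + var_pri_e / (n_raters * n_items)
    )
    return var_p / (var_p + relative_error)


# Hypothetical variance components (illustrative only, not from the article).
components = dict(var_p=1.0, var_pr=0.5, var_pi=0.25, var_pri_e=1.0)

base = g_coefficient(**components, n_raters=2, n_items=4)
more_raters = g_coefficient(**components, n_raters=4, n_items=4)
more_items = g_coefficient(**components, n_raters=2, n_items=8)

print(f"2 raters, 4 items: {base:.3f}")
print(f"4 raters, 4 items: {more_raters:.3f}")
print(f"2 raters, 8 items: {more_items:.3f}")
```

With these assumed components the person-by-rater interaction is the larger error source, so doubling the raters raises the coefficient more than doubling the items. With a different set of components the comparison could favour items instead.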
