Abstract
This article describes the limitations of certain statistical techniques for handling multiple criteria when assessing reliability, concurrent validity, and generalizability, and suggests alternative approaches for performance assessment measures. Applying a latent variable modeling approach to the data revealed a significant improvement in interrater reliability, concurrent validity, and generalizability (over raters and topics) on a scoring rubric; the improvement in concurrent validity was particularly noticeable. However, a main limitation of this study was the small number of subjects, which could have affected the validity of some of the findings.
