Abstract
Portfolios and other open-ended assessments are increasingly incorporated into evaluations and testing programs. However, questions about the reliability of such assessments continue to be raised. After reviewing forces that may be leading to increased interest in and use of portfolio assessment, we investigate the interrater reliability of a portfolio assessment used in a small-scale program evaluation. Three types of portfolio scores were investigated: analytic, combined analytic (formed by summing across analytic scores), and holistic. The interrater reliability coefficient was highest for summed analytic scores (r = .86). Results indicate that at least three raters are required to obtain acceptable levels of reliability for holistic and individual analytic scores.
