Abstract
This commentary on Williams et al. (2017) focuses on an additional and equally important issue not addressed in their critique: inter-rater reliability, particularly reliability in field settings. A growing body of evidence indicates that risk assessment instruments administered in applied (and especially adversarial) contexts may be considerably less stable across examiners than is typically reported in well-controlled, peer-reviewed journal publications. Because reliability constrains validity, effect sizes from such published research may overestimate predictive validity in real-world contexts. Although validity evidence is important, field reliability remains “the boss” when considering how well an assessment procedure will perform in applied settings.
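As an illustrative aside (not part of the original abstract), the classical attenuation relation from psychometric theory makes this constraint concrete: the observed correlation between a predictor $x$ and a criterion $y$ is bounded by the reliabilities of both measures,

$$r_{xy} \le \sqrt{r_{xx}\, r_{yy}}.$$

For example, assuming a perfectly reliable criterion ($r_{yy} = 1$), a drop in inter-rater reliability from $r_{xx} = .90$ in controlled research to $r_{xx} = .64$ in the field lowers the ceiling on observed predictive validity from roughly $.95$ to $.80$. These figures are hypothetical and chosen only to show the arithmetic.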
