Abstract
This commentary on Williams et al. (2017) focuses on an additional and equally important issue not addressed in their critique: inter-rater reliability, particularly reliability in field settings. A growing body of evidence indicates that risk assessment instruments administered in applied (and especially adversarial) contexts may be considerably less stable across examiners than is typically reported in well-controlled, peer-reviewed journal publications. Because reliability constrains validity, effect sizes from such published research may overestimate predictive validity in real-world contexts. Although validity evidence is important, field reliability remains “the boss” when considering how well an assessment procedure will perform in applied settings.
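As an illustrative aside (not part of the original abstract), the classical attenuation relation from psychometric theory makes this constraint concrete: the observed correlation between a predictor $x$ and a criterion $y$ is bounded by the reliabilities of both measures,

$$r_{xy} \le \sqrt{r_{xx}\, r_{yy}}.$$

For example, assuming a perfectly reliable criterion ($r_{yy} = 1$), a drop in inter-rater reliability from $r_{xx} = .90$ in controlled research to $r_{xx} = .64$ in the field lowers the ceiling on observed predictive validity from roughly $.95$ to $.80$. These figures are hypothetical and chosen only to show the arithmetic.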
