
Abstract

Accurate risk assessment of individuals convicted of sexual offenses is crucial to prevent reoffending and prolonged institutionalization. However, findings indicate a heterogeneous quality of risk assessment reports. Some of the qualitative variance may reflect differences in the strength of empirical evidence linking risk factors to reoffending. Some factors that have historically been important treatment targets have meta-analytically been shown to be empirically unsupported. To investigate the influence of unsupported risk factors on the decision-making process, the present study examined risk assessment reports (N = 304) conducted between 1999 and 2016. Results showed a heterogeneous consideration of empirically (un)supported risk factors. Reports following a structured risk assessment approach considered significantly more empirically supported risk factors than reports based on an unstructured, clinical-intuitive assessment procedure. Taken together, our findings provide further support for the use of structured and standardized risk assessment procedures and caution expert witnesses against incorporating empirically unsupported risk factors.

Keywords

criminal risk assessment, sexual offense, risk factors, prognostic judgment, recidivism, diagnosis, hit rates

Introduction

Quality of criminal risk assessment reports for individuals convicted of sexual offenses has previously been shown to be heterogeneous (Haarig et al., 2012; Kunzl & Pfaefflin, 2011; Wertz et al., 2020). The methodological approach to risk assessment has been proposed as one central aspect responsible for this heterogeneity (e.g., Rettenberger & Eher, 2016; Wertz & Rettenberger, 2021). Prognostic judgments may be based on subjective clinical (or unstructured, intuitive, unguided, impressionistic), actuarial (statistical, mechanical, or algorithmic), structured professional, or clinical-idiographic prediction methods (e.g., Grove et al., 2000; Meehl, 1954; Nicholls et al., 2013). Research comparing different risk assessment approaches has consistently demonstrated the predictive superiority of structured methods (actuarial or structured professional judgment [SPJ] instruments; e.g., Ægisdóttir et al., 2006; Bengtson & Långström, 2007; Hanson & Morton-Bourgon, 2009) and the limited accuracy of unstructured predictions (e.g., Grove et al., 2000; Johansen, 2007; Turgut et al., 2006), particularly for predicting sexual and violent recidivism (e.g., Bonta et al., 1998; Heilbrun et al., 2016; Jackson et al., 2004). Crucially, however, a further study evaluating German risk assessment reports revealed that approximately half did not include standardized instruments but instead relied solely on intuitive judgments (Wertz & Rettenberger, 2021).

Additional value for explaining the heterogeneous quality of risk assessment reports may be gained from analyzing the type and number of risk factors considered for risk assessment. In addition, several personality characteristics of examinees may affect expert witness judgments even though they do not constitute validated risk factors (Mann et al., 2010; Rettenberger, 2018). In an attempt to give an overview of factors that are frequently incorporated into expert witness judgments despite lacking predictive validity for recidivism, Mann et al. (2010) distinguished several types of variables for predicting recidivism risk among individuals convicted of sexual offenses. In addition to identifying four variables for which no stable empirical relation to sexual reoffending could be established (i.e., denial, low self-esteem, major mental illness, loneliness), the authors described four variables for which more than five studies were not able to find a predictive relationship with sexual recidivism at all (i.e., depression, poor social skills, poor victim empathy, lack of motivation for treatment at intake). The overview by Mann et al. (2010) was updated by Seto et al. (2023), who reviewed relevant literature that has appeared since that time. Two risk factors previously deemed promising by Mann et al. (2010), hostility toward women and dysfunctional coping, are now considered empirically supported, while no new risk factors were identified. Furthermore, positive social support was the only empirically supported protective factor. Consequently, the consideration of the unsupported risk factors identified by Mann et al. (2010) and Seto et al. (2023) may contribute to the heterogeneity in the quality and accuracy of risk assessment reports.

In addition, the predictive relevance of psychiatric disorders remains questionable. Expert witnesses are urged to exercise caution when estimating the influence of psychiatric diagnoses on the risk of recidivism, as several studies indicate that such diagnoses have low or no predictive validity (Eher et al., 2015, 2016; Kingston et al., 2015). Recognizing the questionable role of psychiatric disorders for risk assessment, the German version of the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5, American Psychiatric Association, 2015) includes a warning note regarding the consideration of psychiatric diagnoses in judicial contexts. This call for caution is supported by a relatively low reliability of clinical diagnoses in the forensic context, leading to a large proportion of unjustified diagnoses (Rettenberger, 2018) and a high prevalence of mental disorders among individuals convicted of sexual offenses (e.g., up to 72% diagnosed with a mood disorder; Eher, Rettenberger, & Turner, 2019) despite generally low recidivism rates (Rettenberger et al., 2015).

Importantly, some exceptions exist. Specific disorders such as exclusive pedophilia (Eher et al., 2015), exhibitionism (Biedermann et al., 2023), hypersexual disorder (Gregório Hertz et al., 2022), some other personality disorders (PD), including antisocial, narcissistic, and borderline disorders (F-60 diagnoses), as well as substance use disorders (SUD; Kingston et al., 2015) predicted recidivism in previous studies, indicating the necessity to differentially consider the influence of psychiatric disorders when assessing the risk of reoffending (Långström et al., 2004).

Thus, factors that have historically been important targets and standard components of most treatment programs continue to be considered regularly, although they are meta-analytically regarded as empirically unsupported (Mann et al., 2010; Seto et al., 2023). Therefore, examining the extent to which such unsupported risk factors are still considered in risk assessments of individuals convicted of sexual offenses, and how they influence prognostic judgments, represents an important aspect of quality assurance in forensic (risk) assessment practice.

Study Objectives

The main aim of the present study was to identify the characteristics considered and empirical foundations for prognostic judgments in a sample of German risk assessment reports about individuals convicted of sexual offenses. To this end, we examined the influence of these characteristics on the prognostic direction and accuracy of the reports. More precisely, study objectives were to systematically examine the degree of consideration of empirically unsupported and supported risk factors in risk assessment reports about individuals convicted of sexual offenses, and to investigate the relevance of the identified risk factors for the direction and the accuracy of these judgments.

Methods

We retrospectively analyzed N = 304 risk assessment reports with regard to different aspects of the offense(s), pre-delinquency, psychiatric diagnosis according to the International Statistical Classification of Diseases and Related Health Problems, 10th revision (ICD-10; World Health Organization, 2016), and incarceration or placement of individuals due to articles 20, 21, 63, 64, or 66 of the German penal code.¹ Report-related data (time, institutional context, expert profession, use of risk assessment tools, methodological approach, and direction of prognostic judgment) as well as the consideration of empirically unsupported and supported risk factors for sexual recidivism were systematically gathered. In addition, the accuracy of the prognostic judgments was examined using data about recidivism according to the Federal Central Register (retrieved in 2016).²

Sample

All risk assessment reports were about male individuals charged or convicted of sexual offenses. Reports were gathered from two German institutions representing common forensic practice: the penitentiary in Freiburg (n = 135) and the Department of Forensic Psychiatry of the University Hospital Munich (n = 169). Assessments were conducted between 1999 and 2016 and were ordered by diverse judicial parties in the course of different penal law and sanction execution proceedings, including local or district courts, courts for the execution of prison sanctions, higher regional courts, and public prosecutors. Importantly, research on risk factors for sexual recidivism and approaches to risk assessment evolved during this timeframe (see, e.g., Kelley et al., 2020), leading to shifts in what was considered best practice throughout the sampling period. Nevertheless, examining risk assessment reports from this extended timeframe offers a representative cross-section of reports. This enables our research to identify factors contributing to the heterogeneity in report quality, consistent with patterns observed in similar time periods in previous studies. Risk assessments were conducted by 68 different expert witnesses who reported between 1 and more than 40 assessments (M = 16.2 reports; SD = 7.2; range: 1–42). More than three quarters of the assessments were contributed by psychiatrists (80.6%, n = 245), while 16.1% (n = 49) were conducted by psychologists, and 3.3% by experts of both professions (n = 10).

Empirical Data Collection and Coding Procedure

We excluded reports about individuals with nonsexual offenses as well as reports about female and juvenile persons. To ensure comparability of assessment contexts, reports that were only based on records without personal examination of the individuals as well as incompletely archived reports were also excluded and replaced by reports including a face-to-face examination by an expert witness. Unstructured clinical judgments were defined as assessments in which risk factors were measured solely based on the clinical experience of the assessor (Grove et al., 2000; Hanson & Morton-Bourgon, 2009; Nicholls et al., 2013; Skeem & Monahan, 2011). If any actuarial or SPJ tool was used, assessments were considered structured.

Empirically unsupported risk factors were drawn from the study by Mann et al. (2010). Their review constitutes a seminal work, providing an overview of variables that are frequently incorporated into expert witness reports, despite showing an equivocal or no relation to recidivism. To contrast them with empirically supported risk factors, reports were examined regarding expert witnesses’ consideration of items included in the German version of the VRS:SO (Wong et al., 2003), for which a good reliability and predictive validity in a German-speaking sample of individuals convicted of sexual offenses could be shown (Gaunersdorfer & Eher, 2022, 2023).³ Of the 24 items making up the VRS:SO, seven are considered static, and 17 constitute dynamic risk factors that are amenable to change. Each factor was rated independently of the use of risk assessment instruments on an ordinal scale and subsequently dichotomized for statistical analyses (0 = not considered for risk assessment; 1 = considered for risk assessment).

To assess the agreement with which the coding scheme was applied by two raters (MW and MG), interrater reliabilities were calculated for all risk factors in a randomly selected sample of 20 reports (6.7%; Bujang & Baharum, 2017) using Cohen’s (1968) weighted kappa (κ_w). Using linearly weighted κ coefficients, interrater reliability was at least substantial (κ_w = .61–.80; Landis & Koch, 1977) for all variables. For unsupported risk factors, Cohen’s weighted κ ranged from κ_w = .81, p < .001 to κ_w = 1.0, p < .001. For static VRS:SO items, Cohen’s weighted κ ranged from κ_w = .82, p < .001 to κ_w = .93, p < .001, and for dynamic VRS:SO items, the reliability statistics ranged from κ_w = .73, p < .001 to κ_w = .90, p < .001. Several psychiatric diagnoses, such as exclusive pedophilia, exhibitionism, hypersexual disorder, and other PDs, including antisocial, narcissistic, or borderline disorders (F-60 diagnoses), and SUDs were considered relevant for risk assessment, whereas all other mental disorders were considered irrelevant.
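As an illustration of the reliability statistic named above, the following is a minimal sketch of a linearly weighted kappa for two raters. The three-level ordinal scale and the example ratings are hypothetical and are not taken from the study data; the function name is likewise an assumption made for this sketch.

```python
from collections import Counter


def linear_weighted_kappa(ratings_a, ratings_b, categories):
    """Cohen's kappa with linear disagreement weights for two raters.

    `categories` lists the ordinal levels in ascending order.
    Undefined (division by zero) if there is only one category or if
    both raters use a single identical category for every case.
    """
    k = len(categories)
    n = len(ratings_a)
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint counts and marginal counts per rater.
    joint = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    marg_a = Counter(index[a] for a in ratings_a)
    marg_b = Counter(index[b] for b in ratings_b)

    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at maximum distance
            observed += weight * joint[(i, j)] / n
            expected += weight * (marg_a[i] / n) * (marg_b[j] / n)
    return 1.0 - observed / expected


# Hypothetical ratings of ten reports on a 0-2 ordinal scale.
rater_1 = [0, 0, 1, 1, 2, 2, 2, 0, 1, 2]
rater_2 = [0, 0, 1, 2, 2, 2, 1, 0, 1, 2]
kappa_w = linear_weighted_kappa(rater_1, rater_2, categories=[0, 1, 2])
print(round(kappa_w, 2))
```

With linear weights, a disagreement of two scale points counts twice as much as a disagreement of one point, which is why the statistic suits ordinal ratings.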

To assess risk communication, the exact wording of the final judgment of each report was translated into a five-point Likert-type scale (very low risk, low risk, moderate risk, high risk, very high risk) by the first author (MW) according to a recommended five-level risk category (Eher, Rettenberger, Etzler, et al., 2019; Hanson et al., 2017).⁴ To examine the accuracy of prognostic judgments, actual recidivism data was extracted from criminal records in June 2016, according to the Federal Central Register, and analyzed using an average follow-up period of 7.48 years. Recidivism was coded as any (new criminal conviction of any kind), nonviolent, general sexual (new conviction involving both sexual offenses with physical contact as well as noncontact sexual offenses), sexual contact (new conviction for sexual offense with physical contact), or violent reconviction.

Data Analysis

To analyze the consideration of supported and unsupported risk factors and their dependence on the use of standardized risk assessment instruments, an independent samples t-test (two-sided) and a multivariate analysis of variance (MANOVA) with subsequent univariate analyses of variance (ANOVA) were conducted. To investigate the relevance of supported and unsupported risk factors for the direction and accuracy of expert witness judgments, hierarchical binary logistic regression analyses were conducted.⁵ Following the regression analyses, Receiver Operating Characteristic (ROC) analyses were calculated to assess the discriminability of high- and low-risk predictions, accurate and inaccurate judgments, as well as recidivism for each predictor and the final regression models by calculating the area under the curve (AUC; Backhaus et al., 2018).
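To make the AUC statistic concrete, the sketch below computes it as the proportion of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The scores and outcomes are hypothetical, and the function name is an assumption for illustration; this is not the software routine used in the study.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Returns the probability that a randomly chosen positive case
    (label 1) has a higher score than a randomly chosen negative
    case (label 0); ties contribute 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities and observed outcomes.
example_scores = [0.10, 0.40, 0.35, 0.80]
example_labels = [0, 0, 1, 1]
print(auc_from_scores(example_scores, example_labels))
```

An AUC of .50 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.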

Results

Almost two-thirds of the reports concerned individuals diagnosed with a mental disorder according to the ICD-10 who were placed in preventive detention or forensic psychiatry (see Supplemental Table S1, available in the online version of this article). Nearly 85% (n = 256) of the sample were convicted of at least one offense prior to the index offense, mostly because of sexual, or both sexual and violent offenses. Approximately half of the sample was described to have a (very) high risk of reoffending (n_very high = 52; n_high = 103), while approximately 40% of reports concluded a (very) low (n_very low = 23; n_low = 101) risk of reoffending. Moderate risk judgments were made in 8.2% of the reports (n = 25). Approximately one in six reports (15.5%) included psychological testing (i.e., use of formal psychological diagnostic instruments), and 45% (n = 134) of all risk assessments followed a structured approach. Among these, approximately 23% applied SPJ methods only (e.g., HCR-20 [Müller-Isberner & Webster, 1998; original version: Webster et al., 1997], SVR-20 [Müller-Isberner et al., 2000; original version: Boer et al., 1997]), 3% used actuarial methods only (e.g., Static-99 [Rettenberger & Eher, 2006; original version: Harris et al., 2003], VRAG [Rossegger et al., 2009; original version: Quinsey et al., 2006], Stable-2007 [Matthes & Rettenberger, 2008; original version: Hanson et al., 2007]), and 18.4% used both methods.

Recidivism Rates

On average, individuals were followed up for a time at risk of 7.48 years (SD = 4.04, range: 1.5–16). The average time between risk assessment and release was 2.01 years (SD = 3.24). Recidivism rates of the total sample (n = 221) were analyzed for 2-year, 5-year, and total follow-up periods. Individuals who had not yet been released or lacked a criminal record because of death or emigration were excluded. Among the remaining sample, almost half were reconvicted for at least one offense of any kind during the total follow-up period. As expected, recidivism rates at the 2- and 5-year follow-ups were lower (see Supplemental Table S2, available in the online version of this article).

Hit Rates

For the average total follow-up period, the hit rate for the prediction of general recidivism was approximately at chance level (50.7%). Among individuals who reoffended with a sexual offense (n = 11), expert witnesses correctly classified 54.5% (n = 6) as high risk, while 36.4% (n = 4) were incorrectly assessed as low risk. For one person who reoffended with a sexual offense, the recidivism risk was assessed as moderate.

Overall, the descriptive results suggest that the present sample was comparable to other (international) samples of individuals convicted of sexual offenses (Hanson & Morton-Bourgon, 2009; Nedopil, 2013).

Consideration of Unsupported and Supported Risk Factors

We found a significant mean difference of medium effect size in the consideration of supported (VRS:SO) and unsupported (according to Mann et al., 2010) risk factors, t(606) = 7.60, p < .001, d = 0.62, 95% CI [0.09, 0.15] (Cohen, 1968). Regardless of whether risk assessment instruments were applied, approximately 11% more supported (M = 33.8%, SD = 22.1, n₁ ≈ 7–8) than unsupported (M = 22.0%, SD = 15.50, n₂ = 1.8) risk factors were considered for risk assessment (see Supplemental Table S3, available in the online version of this article).
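The reported effect size can be approximately reconstructed from the summary statistics alone. The sketch below assumes two groups of 304 observations each (inferred from df = 606, not stated explicitly in the text) and a pooled standard deviation based on equal group sizes; it is a plausibility check, not the authors' computation.

```python
import math

# Summary statistics as reported in the text (percent of factors considered).
mean_supported, sd_supported = 33.8, 22.1
mean_unsupported, sd_unsupported = 22.0, 15.5
n_per_group = 304  # assumption: df = 606 implies 2 * 304 - 2

# Pooled SD for equal group sizes, then Cohen's d and the t statistic.
pooled_sd = math.sqrt((sd_supported**2 + sd_unsupported**2) / 2)
cohens_d = (mean_supported - mean_unsupported) / pooled_sd
t_value = cohens_d * math.sqrt(n_per_group / 2)

print(round(cohens_d, 2), round(t_value, 2))
```

Under these assumptions the sketch reproduces d = 0.62 and a t value close to the reported 7.60; the small remaining difference is consistent with rounding of the published means and standard deviations.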

On average, expert witnesses considered one-third of all empirically supported risk factors and approximately two unsupported risk factors for their prognostic decisions. High standard deviations for both supported and unsupported variables suggest that the reports were highly heterogeneous regarding the number of risk factors considered. When risk assessment instruments were applied, a significant increase in the proportion of (un-)supported risk factors was observed, F(2, 301) = 61.79, p < .001, $η_{p}^{2}$ = .29. Univariate post hoc tests revealed that this increase was significant for the proportion of supported risk factors only (∆M = 23.6%, SD = 3.5), F(1, 302) = 118.45, p < .001, $η_{p}^{2}$ = .28, but not for the proportion of unsupported risk factors (∆M = 1.9%, SD = 1.7).
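The partial eta squared values above follow from the F statistics and their degrees of freedom. A minimal sketch of this standard conversion is given below; the function name is an assumption made for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported in the text.
eta_multivariate = partial_eta_squared(61.79, 2, 301)
eta_supported = partial_eta_squared(118.45, 1, 302)
print(round(eta_multivariate, 2), round(eta_supported, 2))
```

Both values round to the reported effect sizes (.29 and .28).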

Relevance of Risk Factors for Prognostic Direction

Table 1 shows the results of the hierarchical logistic regression analysis, examining how the presence of diagnosis and the consideration of empirically supported risk factors (SRF) and unsupported risk factors (URF) predict the direction of risk judgments. The regression model containing all predictors (diagnosis, SRF, URF, diagnosis × SRF) fitted the data significantly better than the null model, $R_{N}^{2}$ = .08, $χ^{2}$(4, N = 279) = 16.15, p = .003.

Table 1:

Hierarchical Logistic Regression Using Number and Type of Subject-Related Characteristics to Predict the Direction of Risk Assessment (N = 279)

Predictor	B	SE	Wald	df	p	e^B	$R_{N}^{2}$	95% CI
Model without predictors
Constant (B₀)	−0.22	0.12	3.43	1	.064	0.80
Model including predictors							.08
Constant (B₀)	0.31	0.23	1.84	1	.175	1.36
Diagnosis	−0.68	0.28	5.99	1	.014	0.51		[0.30, 0.87]
SRF^a	0.82	0.25	10.59	1	.001	2.27		[1.39, 3.71]
URF^a	−0.08	0.13	0.44	1	.505	0.92		[0.72, 1.18]
SRF × Diagnosis	−0.89	0.30	8.96	1	.003	0.41		[0.23, 0.74]

Note. CI = confidence interval; $R_{N}^{2}$ = Nagelkerke Pseudo-R² value.

^a Variable was mean centered and standardized for regression analyses.
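The e^B column and its confidence interval in Table 1 can be recovered from B and SE. The following sketch assumes a Wald-type 95% interval (B ± 1.96 × SE on the logit scale), which is the usual default in statistical software but is not stated in the text; the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(b, se, z=1.96):
    """Odds ratio e^B with a Wald confidence interval computed on the logit scale."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


# SRF row of Table 1: B = 0.82, SE = 0.25.
odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(0.82, 0.25)
print(round(odds_ratio, 2), round(ci_lower, 2), round(ci_upper, 2))
```

For the SRF row this yields 2.27 with an interval of [1.39, 3.71], matching the table. Other rows may differ in the second decimal because the published B and SE are themselves rounded.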

A main effect of diagnosis indicated that the presence of a mental health diagnosis significantly reduced the probability of receiving a low-risk judgment. In contrast, a main effect of SRF showed that an increased consideration of SRF increased the probability of a low-risk judgment. A significant interaction of SRF and diagnosis suggested that the influence of SRF on the direction of risk assessment depended on whether a psychiatric diagnosis was present or not. The consideration of a larger proportion of SRF increased the probability of a judgment of a (very) low risk only for individuals without a mental health diagnosis. For these individuals, the odds of being categorized as (very) low risk increased by a factor of 2.27 if the proportion of risk factors increased by one standard deviation (i.e., increase by 5.3 risk factors, 22.1%). Individuals for whom an average number of SRF were considered (i.e., 33.8%, M_centered = 0) were classified as (very) low risk with a probability of 57.6%. If, however, four to five SRF more were considered (i.e., 55.9%, M_centered = 1), the probability of being categorized as (very) low risk increased by 17.9%, given that all other variables were held constant.

A follow-up analysis confirmed that the effect of SRF was nullified if a psychiatric diagnosis was present, B = −0.07, Wald(1) = 0.19, p = .667, e^B = 0.94, 95% CI [0.69, 1.27]. Individuals with a mental health diagnosis had a decreased chance of being classified as (very) low risk, regardless of the number of SRF considered for prognostic judgments. The presence of a diagnosis changed the odds of a (very) low-risk judgment by a factor of 0.51 compared to those without a diagnosis, corresponding to a 16.7% decrease in probability (all other variables kept constant; see Supplemental Figure S1, available in the online version of this article).
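The probabilities discussed above can be approximated by passing the Table 1 coefficients through the logistic function. The sketch below is a reconstruction from the rounded published coefficients, assuming the diagnosis variable is coded 0/1 and SRF is standardized; it is not the authors' original computation.

```python
import math


def logistic(x):
    """Inverse logit: converts log-odds into a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Coefficients from Table 1 (model including predictors).
b_constant, b_diagnosis, b_srf = 0.31, -0.68, 0.82

# No diagnosis, average proportion of SRF considered (standardized SRF = 0).
p_average = logistic(b_constant)
# No diagnosis, SRF one standard deviation above the mean.
p_plus_one_sd = logistic(b_constant + b_srf)
# Diagnosis present, average SRF (interaction term vanishes at SRF = 0).
p_diagnosis = logistic(b_constant + b_diagnosis)

print(round(p_average, 3))
print(round(p_plus_one_sd - p_average, 3))
print(round(p_average - p_diagnosis, 3))
```

The results agree with the reported 57.6%, 17.9%, and 16.7% to within about one tenth of a percentage point, a deviation that is expected when working from coefficients rounded to two decimals.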

Relevance of Risk Factors for Predictive Accuracy

Table 2 displays the results of the hierarchical logistic regression analysis using the presence of a diagnosis (any diagnosis present: yes/no) and the consideration of SRF and URF to predict the accuracy of expert witness judgments for the 2-year follow-up period. The regression model including all predictors did not fit the data significantly better than the null model, $R_{N}^{2}$ = .05, $χ^{2}$(4, N = 207) = 7.06, p = .133. A significant main effect of SRF indicated that the probability of correctly assessing recidivism risk improved with an increased consideration of SRF. A significant interaction additionally highlighted that this effect was dependent on whether a psychiatric diagnosis was present.

Table 2:

Hierarchical Logistic Regression Using Number and Type of Subject-Related Characteristics to Predict Hit Rates for a Time at Risk of 2 Years (N = 207)

Predictor	B	SE	Wald	df	p	e^B	$R_{N}^{2}$	95% CI
Model without predictors
Constant (B₀)	0.05	0.14	0.12	1	.728	1.05
Model including predictors							.05
Constant (B₀)	0.39	0.24	2.65	1	.104	1.48
SRF^a	0.61	0.25	5.89	1	.015	1.85		[1.13, 3.03]
URF^a	0.00	0.15	0.00	1	.984	1.00		[0.76, 1.33]
Diagnosis	−0.43	0.31	1.95	1	.162	0.65		[0.36, 1.19]
SRF × F60 Diagnosis	−0.68	0.32	4.60	1	.032	0.50		[0.27, 0.94]

Note. CI = confidence interval; $R_{N}^{2}$ = Nagelkerke Pseudo-R² value.

^a Variable was mean centered and standardized for regression analysis.

Simple effect analyses revealed that the proportion of SRF considered only positively predicted judgment accuracy if the assessed individual was not diagnosed with a psychiatric disorder (see Supplemental Figure S2, available in the online version of this article). For those with a diagnosis, risk assessment accuracy did not depend on the consideration of SRF, B = −0.07, Wald(1) = 0.12, p = .728, e^B = 0.94, 95% CI [0.64, 1.37]. The accuracy with which the final model was able to predict whether expert witness judgments were correct was slightly above chance level (AUC = .65; Rice & Harris, 2005).

Finally, differences in the accuracy of expert witness judgments over time were investigated. Similar to the model predicting hit rates for the 2-year follow-up period, the model for the 5-year follow-up revealed a significant interaction of SRF and diagnosis, B = −0.73, Wald(1) = 3.88, p = .049, e^B = 0.48, 95% CI [0.24, 1.00]. In contrast to the 2-year model, no significant effect of SRF on predictive accuracy could be found for individuals without a psychiatric diagnosis, B = 0.51, Wald(1) = 3.22, p = .073, e^B = 1.66, 95% CI [0.96, 2.88], or those with a diagnosis, B = −0.22, Wald(1) = 0.86, p = .354, e^B = 0.80, 95% CI [0.51, 1.28]. The significant interaction indicated that, for individuals with a diagnosis, the odds of correctly predicting recidivism changed by a factor of 0.48 if the number of SRF increased by one unit, compared to the same increase for those without a diagnosis. These results indicate that the relationship between SRF and predictive accuracy differed for individuals with and without a diagnosis. Yet, the number of considered SRF within each group could not significantly predict whether risk assessments were correct.

The model containing all predictors did not fit the data significantly better than the null model, $R_{N}^{2}$ = .04, $χ^{2}$ (4, N = 175) = 4.71, p = .318. In addition, the accuracy with which the model was able to predict whether expert witness judgments were correct was similar to the 2-year model (AUC = .66; Hosmer et al., 2013; Rice & Harris, 2005).
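The p values of the model tests reported for the direction, 2-year, and 5-year models can be checked directly, because the upper-tail probability of a chi-square distribution with 4 degrees of freedom has a simple closed form. The sketch below uses that closed form rather than a statistics library; the function name is an assumption for illustration.

```python
import math


def chi2_upper_tail_df4(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)


# Chi-square values of the three model tests reported in the text (df = 4 each).
p_direction = chi2_upper_tail_df4(16.15)
p_two_year = chi2_upper_tail_df4(7.06)
p_five_year = chi2_upper_tail_df4(4.71)
print(round(p_direction, 3), round(p_two_year, 3), round(p_five_year, 3))
```

The three values round to .003, .133, and .318, in line with the p values reported for the respective models.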

Neither the interaction term nor any of the main effects of SRF, URF, or diagnosis significantly predicted hit rates for the total follow-up period, with correct classifications just above chance level (54.3%). These results suggest that the predictive contribution of SRF decreased over time, yielding non-significant effects for an average follow-up of 7.48 years.

Relevance of Risk Factors for Recidivism

Additional regression analyses with the same variables (diagnosis, SRF, URF, diagnosis × SRF) predicting recidivism with any criminal offense revealed no significant interaction of SRF and the presence of a psychiatric disorder, B = −0.09, Wald(1) = 0.09, p = .764, e^B = 0.91, 95% CI [0.51, 1.65]. Furthermore, the presence of a psychiatric disorder did not significantly predict recidivism, B = 0.09, Wald(1) = 0.10, p = .755, e^B = 1.09, 95% CI [0.63, 1.90]. In contrast, a significant main effect of SRF was observed, indicating that a greater number of SRF considered in the assessment was associated with a lower probability of criminal reoffending in the future, B = −0.36, Wald(1) = 5.93, p = .015, e^B = 0.70, 95% CI [0.52, 0.93] (AUC = .60; Rice & Harris, 2005).

The point-biserial correlation of SRF and sexual recidivism was not significant, indicating that the proportion of SRF considered was unrelated to the probability of sexual recidivism, r(N = 221) = −0.05, p = .462. Similarly, no significant relationship between sexual recidivism and the presence of a psychiatric disorder was found, χ²(1, N = 221) = 0.27, p = .606. This indicated that both the proportion of SRF considered and the presence of a psychiatric diagnosis were independent of the risk of sexual recidivism.

The Relevance of Mental Health Diagnoses for Prognostic Direction and Hit Rates

In the next step we repeated the hierarchical logistic regression analyses predicting the direction of prognostic judgments, hit rates for 2-year, 5-year, and total follow-up periods, and recidivism, while controlling for whether a diagnosis is considered relevant (PDCR) for recidivism risk or not (PDCI).

Relevance of Risk Factors for Prognostic Direction

In the model predicting the direction of expert witness judgments, a significant interaction of SRF and the presence of a psychiatric diagnosis considered as irrelevant for risk assessment (PDCI; based on the current status of scientific knowledge) could be detected, while no significant interaction of psychiatric diagnoses considered as relevant for risk assessment (PDCR) and SRF was found.

The model including only the significant interaction showed that the effect of SRF on the direction of prognostic judgments depended on whether a PDCI was present, B = −0.75, Wald(1) = 6.96, p = .002, e^B = 0.47, 95% CI [0.27, 0.83]. When a PDCI was present, the consideration of SRF did not predict the direction of risk assessment, B = −0.11, Wald(1) = 0.29, p = .592, e^B = 0.89, 95% CI [0.59, 1.35]. In contrast, when no PDCI was present, an increase in the consideration of SRF by one standard deviation (5.3 risk factors, 22.1%) raised the probability of receiving a low-risk judgment by 14.6%, reflecting a significant change, B = 0.63, Wald(1) = 11.20, p < .001, e^B = 1.89, 95% CI [1.30, 2.73]. Furthermore, the presence of a PDCR significantly decreased the odds of being classified as low risk, B = −1.44, Wald(1) = 13.99, p < .001, e^B = 0.24, 95% CI [0.11, 0.50], whereas no main effect of PDCI on the direction of risk assessment could be detected, B = −0.33, Wald(1) = 1.25, p = .264, e^B = 0.72, 95% CI [0.40, 1.28]. The final regression model revealed a correct classification of 62.4% (AUC = .60; Rice & Harris, 2005).

Relevance of Risk Factors for Predictive Accuracy

Logistic regression analyses predicting 2-year hit rates revealed no significant interaction of either PDCI and SRF, or PDCR and SRF. This indicates that the accuracy-increasing effect of SRF did not depend on whether a mental health diagnosis is considered relevant or irrelevant for recidivism risk. The model only including main effects revealed a significant effect of SRF, B = 0.39, Wald(1) = 5.38, p = .020, e^B = 1.48, 95% CI [1.06, 2.07], and of PDCR, B = −1.14, Wald(1) = 6.94, p = .008, e^B = 0.32, 95% CI [0.14, 0.75], on the probability of correctly predicting recidivism. When the consideration of SRF increased by one standard deviation, the probability of correctly predicting recidivism increased by 9.25%. In contrast, when individuals were diagnosed with a PDCR, this probability decreased by 27.2%. No significant main effects of URF or PDCI on predictive accuracy could be detected. ROC analyses for the model only containing main effects indicated a medium effect size (AUC = .61; Rice & Harris, 2005).

Regression analyses for the 5-year follow-up period revealed a significant interaction of SRF and PDCR, B = −1.28, Wald(1) = 6.02, p = .014, e^B = 0.28, 95% CI [0.10, 0.77]. Conditional main effect analyses showed that SRF did not predict risk assessment accuracy when a PDCR was present, B = −0.83, Wald(1) = 3.11, p = .078, e^B = 0.44, 95% CI [0.17, 1.10]. However, in the absence of a PDCR, an increase in the consideration of SRF by one standard deviation significantly increased the probability of correctly predicting recidivism by 9.9%, B = 0.45, Wald(1) = 4.01, p = .045, e^B = 1.57, 95% CI [1.01, 2.44]. No significant main effects of URF, B = 0.13, Wald(1) = 0.68, p = .409, e^B = 1.14, 95% CI [0.84, 1.56], or PDCI, B = 0.16, Wald(1) = 0.22, p = .641, e^B = 1.17, 95% CI [0.60, 2.29], on the accuracy of expert witness judgments were found (AUC = .62; Rice & Harris, 2005).

Similarly, for the total follow-up period, a significant interaction of SRF and PDCR could be detected, B = −0.95, Wald(1) = 6.54, p = .011, e^B = 0.39, 95% CI [0.19, 0.80]. In the absence of a PDCR, an increase in the consideration of SRF by one standard deviation increased the probability of correctly predicting recidivism by 9.54%, B = 0.40, Wald(1) = 4.52, p = .033, e^B = 1.49, 95% CI [1.03, 2.14]. Conditional main effect analyses confirmed that SRF did not predict assessment accuracy if a PDCR was present, B = −0.55, Wald(1) = 2.96, p = .085, e^B = 0.57, 95% CI [0.31, 1.08]. Neither URF, B = 0.15, Wald(1) = 1.12, p = .290, e^B = 1.16, 95% CI [0.88, 1.54], nor PDCI significantly predicted the accuracy of expert witness judgments, B = 0.27, Wald(1) = 0.75, p = .387, e^B = 1.31, 95% CI [0.71, 2.44]. ROC analyses for this model, including only main effects and the significant interaction term, indicated a medium effect size (AUC = .63; Rice & Harris, 2005).

Relevance of Risk Factors for Recidivism

Finally, a hierarchical logistic regression predicting general recidivism rates revealed no significant interactions between SRF and PDCR or between SRF and PDCI. Similarly, no significant main effects were found for the presence of a PDCR, B = −0.17, Wald(1) = 0.19, p = .661, e^B = 0.84, 95% CI [0.39, 1.82], for PDCI, B = 0.22, Wald(1) = 0.49, p = .485, e^B = 1.24, 95% CI [0.67, 2.30], for the proportion of SRF considered, B = −0.31, Wald(1) = 3.75, p = .053, e^B = 0.74, 95% CI [0.54, 1.00], or for the proportion of URF considered, B = 0.08, Wald(1) = 0.32, p = .572, e^B = 1.08, 95% CI [0.82, 1.43] (AUC = .60; Rice & Harris, 2005).

Likewise, no significant associations were found between sexual recidivism and the presence of a PDCR, χ²(1, N = 221) = 0.13, p = .718, or between sexual recidivism and the presence of a PDCI, χ²(1, N = 221) = 0.75, p = .388. These findings indicate that neither the proportion of SRF considered nor the presence of a psychiatric diagnosis (PDCR and PDCI) was related to the risk of recidivating with a sexual or a general offense.

Discussion

Accurate risk assessment of individuals convicted of sexual offenses is crucial to prevent reoffending and unnecessary institutionalization. However, the quality of criminal risk assessment reports remains heterogeneous. Previous research underlined the superiority of structured over unstructured approaches (e.g., Bengtson & Långström, 2007; Hanson & Morton-Bourgon, 2009; Heilbrun et al., 2016). Structured methods, such as actuarial or SPJ-based instruments, use a predetermined list of empirically derived static and/or dynamic risk (and protective) factors that show an evidence-based relation with reoffending. Risk assessment instruments furthermore provide coding rules designed to improve interrater reliability and predictive accuracy. However, risk factors that have historically been important treatment targets and remain standard components of most treatment programs continue to be regularly considered, even though they lack empirical and meta-analytical support (Mann et al., 2010; Rettenberger, 2018; Seto et al., 2023).

In the present study, a heterogeneous use of empirically supported and unsupported risk factors could be identified. Encouragingly, across all reports, whether structured or not, relatively more supported risk factors than unsupported risk factors were considered during risk assessment. Furthermore, the application of risk assessment instruments significantly increased the proportion of supported risk factors included in evaluations. Together, these findings provide further support for the use of structured and standardized risk assessment procedures.

Our findings also show that the consideration of supported risk factors influenced the prognostic direction of expert witness judgments. A higher proportion of supported risk factors considered increased the odds that an individual was assessed as having a (very) low risk of reoffending. However, our analyses showed that this effect was limited to individuals without a mental health diagnosis. When such a diagnosis was present, the effect of supported risk factors was nullified. In these cases, psychiatric diagnoses were treated as separate, incremental risk factors, contrary to current best practice recommendations. Thus, individuals with a mental health diagnosis had a reduced chance to be judged as (very) low risk, regardless of the number of supported risk factors considered.

Similarly, the number of supported risk factors considered positively influenced the accuracy of expert witness judgments only if the examinee was not diagnosed with a psychiatric disorder. The effect of supported risk factors, independent of use of formal risk assessment instruments, further decreased over time, showing significant effects for the 2-year follow-up period but not for the total time at risk of more than seven years. This highlights that empirically driven risk predictions generally have their greatest predictive value shortly after discharge, while being less predictive of long-term behavior. Notably, supported risk factors showed a time-independent effect on predictive accuracy when recidivism-relevant and recidivism-irrelevant diagnoses were examined in isolation, suggesting that the influence of risk factor consideration on predictive accuracy may be context-specific.

Theoretical Implications

Our results indicate that the presence of a mental health diagnosis moderates the probability with which expert witnesses classify individuals as having low or high recidivism risk. This finding may be interpreted in light of theoretical work on decision-making and biases in forensic practice (Neal & Grisso, 2014). Estimating recidivism risk requires a comprehensive assessment and integration of diverse subject-related information. At the same time, only recidivism-relevant factors should be considered, while aspects that are empirically unrelated to reoffending should not influence risk judgments. Given the complexity of this task, expert witnesses may be susceptible to implicit biases that facilitate the integration process, at the risk of compromising the accuracy of the resulting risk judgments (e.g., Dror et al., 2021). For instance, given a high prevalence of personality disorder diagnoses among individuals convicted of sexual offenses (Eher, Rettenberger, & Turner, 2019), expert witnesses might be more inclined to assign higher risk ratings to those with such a diagnosis, thereby neglecting (other) relevant case-specific information and the relatively low rate of sexual recidivism in this population (Neal & Grisso, 2014; Oberlader & Verschuere, 2025; Rettenberger et al., 2015). As a consequence, information that may be irrelevant for risk assessment may unduly influence risk judgments.

The current finding that individuals with a mental disorder are unlikely to receive a low-risk judgment, irrespective of the number of supported risk factors considered, may reflect such clinical override, meaning an intuitive adjustment of risk estimates derived from standardized assessment procedures (Oberlader & Verschuere, 2025; Rettenberger, 2018). Crucially, these biases may largely operate unconsciously (Neal & Brodsky, 2016) and, as shown in the present study and previous work (Murrie et al., 2013; Oberlader & Verschuere, 2025), cannot be entirely mitigated through the adherence to structured measures during risk assessment. Nevertheless, structured methods constitute promising tools that help expert witnesses to critically examine and verify their assumptions.

Clinical Implications

The prevalence of mental health diagnoses is higher among individuals convicted of sexual offenses than among other crime-related populations, particularly for paraphilic disorders, PDs, and SUDs (Biedermann et al., 2023; Eher, Rettenberger, & Turner, 2019). Despite these high prevalence rates, relatively few studies have examined the relationship between mental health diagnoses and (sexual) reoffending. These studies indicate that mental health diagnoses are not predictive of recidivism (Bonta et al., 1998, 2013; Kingston et al., 2015), although comorbid SUDs, some PDs, particular sexual preference disorders (e.g., exhibitionism and exclusive pedophilia; Biedermann et al., 2023), and hypersexuality (Gregório Hertz et al., 2022) showed low to moderate predictive accuracy. Crucially, though, mental disorders do not seem to improve the prediction of recidivism beyond actuarial risk assessment tools (Biedermann et al., 2023; Eher, Rettenberger, & Turner, 2019). Diagnostic categories derived from the DSM or ICD also usually fail to predict recidivism among individuals convicted of sexual offenses (e.g., Eher et al., 2016; Mann et al., 2010; Seto et al., 2023).

Despite the controversial influence of mental health disorders on recidivism risk, the present results demonstrate that psychiatric diagnoses significantly influence the forensic decision-making process. The presence or absence of a mental health diagnosis affected both the prognostic direction and the accuracy of the risk assessments and nullified the accuracy-increasing effect of considering empirically supported risk factors. Specifically, individuals with a psychiatric diagnosis had a significantly lower probability of receiving a low-risk judgment. Furthermore, when a diagnosis was present, the risk assessment accuracy did not depend on the consideration of empirically supported risk factors, suggesting that individuals with a diagnosis have a lower probability of being released, even though such diagnoses do not reliably predict reoffending.

These results point toward the substantial influence of psychiatric disorders on the direction and accuracy of forensic risk assessment of individuals convicted of sexual offenses. Importantly, follow-up analyses indicated that only psychiatric diagnoses considered recidivism-relevant according to the current state of scientific knowledge, but not those considered recidivism-irrelevant, significantly predicted the direction of risk assessment, with a decrease in the probability of receiving a low-risk judgment if a recidivism-relevant diagnosis was present. In addition, the presence of a recidivism-relevant diagnosis reduced the probability of correct recidivism predictions. However, the mere presence of a diagnosis (whether considered recidivism-relevant or not) did not predict recidivism in the current study, supporting existing findings that question the predictive validity of mental health disorders for recidivism risk.

Based on these findings and theoretical considerations, several clinical implications emerge for ensuring evidence-based, objective risk assessment. First, the systematic use of a structured risk assessment approach helps assessors to differentiate between empirically supported and unsupported risk factors. Standardized instruments emphasize the relevance of included predictors while excluding irrelevant ones. Second, continuous training, particularly regarding which factors are empirically linked to recidivism and which are not, is crucial for all persons involved in forensic diagnostics and risk assessment processes. Finally, risk assessment judgments should not exclusively rely on clinical diagnoses.

Limitations

Despite the considerable strengths of the present study, including the large sample size in comparison to previous studies and the determination of interrater reliabilities, the results should be interpreted in light of some limitations. As all variables were retrospectively extracted from risk assessment reports, causal interpretations should be made with caution. The data permit only correlational inferences between predictors and outcomes. Furthermore, the analysis of hit rates inherently excluded individuals who were not discharged following risk assessment (see also Wertz et al., 2018). This prohibits a comprehensive examination of the accuracy of prognostic decisions across the full sample.

Regarding hit rates, it should further be noted that, for analytical purposes, risk predictions were dichotomized as low risk and high risk. In practice, however, expert witnesses typically differentiate between multiple levels of risk. As such, a low-risk judgment may still be considered accurate if an individual reoffends, provided the assessed risk was lower than that of those classified as high-risk.
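To make the dichotomous coding concrete, the following minimal Python sketch computes a hit rate under the assumption that a judgment counts as a hit when a low-risk prediction is followed by no recidivism or a high-risk prediction is followed by recidivism. This definition, the function name `hit_rate`, and the four example cases are hypothetical illustrations, not the study's data or coding scheme.

```python
def hit_rate(predicted_high_risk, recidivated):
    """Share of cases in which the dichotomized prediction matches the observed outcome."""
    if len(predicted_high_risk) != len(recidivated):
        raise ValueError("Both sequences must have the same length.")
    if not predicted_high_risk:
        raise ValueError("At least one case is required.")
    hits = sum(p == r for p, r in zip(predicted_high_risk, recidivated))
    return hits / len(predicted_high_risk)

# Hypothetical toy cases (not study data): True = high risk / recidivated
predictions = [False, False, True, True]
outcomes = [False, True, True, False]
print(hit_rate(predictions, outcomes))
```

Under such a coding, any low-risk judgment followed by a reoffense is counted as a miss, which is the simplification that the paragraph above cautions about.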

In addition, risk factors were coded as considered only if their influence on recidivism risk was explicitly mentioned in the report. While it is possible that other variables were implicitly considered by expert witnesses, the reasoning behind risk judgments should be transparent and traceable. Therefore, variables not explicitly discussed were classified as not considered.

Another limitation concerns the small number of individuals recidivating with a sexual offense (n = 11). While this is desirable in practice, it precluded a differentiation of general and serious (i.e., violent, sexualized) reoffending in our regression analyses. As recidivism with serious offenses is of particular practical relevance and such offenses may be associated with distinct risk factors (Babchishin et al., 2016), future studies with larger samples should test the significant predictors from the present study specifically among individuals recidivating with serious offenses.

Finally, research on empirically supported and unsupported risk factors has progressed substantially throughout the past two decades. This implies that risk factors considered empirically (un)supported may have changed during the sampling period of 17 years. Crucially, however, while best practice approaches may have evolved over time, a retrospective analysis of the aspects contributing to unsatisfactory prediction accuracy, considering both state-of-the-art empirically supported and unsupported risk factors, over such an extensive sampling period can offer representative and valuable insights into the origins of the heterogeneous quality observed in forensic risk assessment.

Conclusion

The present study investigated the influence of subject-related characteristics on risk assessment in a German sample of individuals convicted of sexual offenses and elucidated potential origins of the observed qualitative heterogeneity in these reports. The results replicate previous findings of substantial variability across risk assessment reports (Haarig et al., 2012; Kunzl & Pfaefflin, 2011; Wertz et al., 2020) and demonstrate that a substantial number of empirically supported risk factors remains insufficiently discussed by expert witnesses.

The use of risk assessment instruments contributed to an empirically driven risk assessment, thereby highlighting the need to follow a standardized and structured approach. At the same time, the application of such instruments did not prevent expert witnesses from considering unsupported risk factors. Furthermore, the presence of psychiatric disorders significantly reduced the probability of low-risk judgments and eliminated the accuracy-improving effect of comprehensively considering supported risk factors, even when such disorders were considered recidivism-irrelevant.

Taken together, our findings provide further support for the use of structured and standardized risk assessment procedures. At the same time, they highlight that even empirically driven assessments remain vulnerable to the influence of psychiatric diagnoses on both prognostic direction and accuracy. The presented findings therefore caution expert witnesses not only against incorporating empirically unsupported risk factors but also against clinically overriding inferences derived from structured assessment approaches.

Supplemental Material

Supplemental material, sj-docx-1-cjb-10.1177_00938548251397535 for The Consideration of Empirically Unsupported Risk Factors in Risk Assessment Reports About Individuals Convicted of Sexual Offenses by Maximilian Wertz, Maren Giersiepen, Kolja Schiltz and Martin Rettenberger in Criminal Justice and Behavior

Footnotes

Authors’ Note:

The authors would like to thank Tobias Kalenscher (Heinrich-Heine-University Düsseldorf) for his support in conducting the present study. The authors declare that they have no conflict of interest to disclose. The data are available on request due to privacy/ethical restrictions. This study was not preregistered. The authors state that no funding was involved.

Supplemental Material

Supplemental Figures S1 and S2 and Supplemental Tables S1–S3 are available in the online version of this article at .

Notes

Maximilian Wertz is a research assistant in the Department of Forensic Psychiatry, LMU Munich with a PhD in Psychology. His current research focuses on criminal risk assessment, criminal responsibility assessment, and sexual preference disorders.

Maren Giersiepen completed her Master’s degree in Psychology with a thesis on the risk assessment of individuals charged with sexual offenses at the Department of Forensic Psychiatry, University Hospital of the Ludwig-Maximilians-University (LMU) Munich, and began her PhD in neuro-cognitive psychology at the Department of Psychology, LMU. Her current research focuses on how the sense of agency influences perception, decision-making, and learning during goal-directed action, using both neuroimaging and behavioral methods.

Prof. Dr. Kolja Schiltz is the head of the Department of Forensic Psychiatry at the Ludwig-Maximilians-Universität München. His scientific interests focus on the multifactorial causes of aggression and violence, with a particular emphasis on neurobiological factors and risk modulators/prediction, and on neurobiological and psychosocial conditioning factors of forensically relevant changes in sexual orientation.

Prof. Dr. Martin Rettenberger is the director of the Centre for Criminology (KrimZ). His research focuses on criminal risk assessment of violent and sexual offenders, sexual preference disorders, evaluation of forensic assessments, and domestic violence.

References

Ægisdóttir

White

M. J.

Spengler

P. M.

Maugherman

A. S.

Anderson

L. A.

Cook

R. S.

Nichols

C. N.

Lampropoulos

G. K.

Walker

B. S.

Cohen

(2006). The meta-analysis of clinical judgment project: Fifty-six years of accumulated research on clinical versus statistical prediction. The Counseling Psychologist, 34(3), 341–382. https://doi.org/10.1177/001100000528587

Babchishin

K. M.

Hanson

R. K.

Blais

(2016). Less is more: Using Static-2002R subscales to predict violent and general recidivism among sexual offenders. Sexual Abuse, 28(3), 187–217. https://doi.org/10.1177/1079063215569544

Backhaus

Erichson

Plinke

Weiber

(2018). Logistische Regression [Logistic regression]. In Multivariate Analysemethoden: Eine anwendungsorientierte Einführung (15th ed., pp. 267–336). Springer. https://doi.org/10.1007/978-3-662-56655-8

Bengtson

Långström

(2007). Unguided clinical and actuarial assessment of re-offending risk: A direct comparison with sex offenders in Denmark. Sexual Abuse, 19(2), 135–153. https://doi.org/10.1177/107906320701900205

Biedermann

Eher

Rettenberger

Gaunersdorfer

Turner

(2023). Are mental disorders associated with recidivism in men convicted of sexual offenses? Acta Psychiatrica Scandinavica, 148(1), 6–18. https://doi.org/10.1111/acps.13547

Boer

D. P.

Hart

S. D.

Kropp

P. R.

Webster

C. D.

(1997). Manual for the sexual violence risk–20. Professional guidelines for assessing risk of sexual violence. Simon Fraser University Library.

Bonta

Law

Hanson

(1998). The prediction of criminal and violent recidivism among mentally disordered offenders: A meta-analysis. Psychological Bulletin, 123(2), 123–142. https://psycnet.apa.org/buy/1998-00120-001

Bujang

M. A.

Baharum

(2017). Guidelines of the minimum sample size requirements for Cohen’s Kappa. Epidemiological Biostatistics and Public Health, 14(2), e12267–12277. https://doi.org/10.2427/12267

Cohen

(1968). Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit. Psychological Bulletin, 70(4), 213–220. https://doi.org/10.1037/h0026256

10.

Dror

Melinek

Arden

J. L.

Kukucka

Hawkins

Carter

Atherton

D. S.

(2021). Cognitive bias in forensic pathology decisions. Journal of Forensic Sciences, 66(5), 1751–1757. https://doi.org/10.1111/1556-4029.14697

11.

Dunn

Felthous

A. R.

Gagné

Harding

Kaliski

Kramp

Lindqvist

Nedopil

Ogloff

J. R. P.

Skipworth

Taylor

P. J.

Thomson

Yoshikawa

(2014). Forensic psychiatry and its interfaces outside the UK and Ireland. In Gunn

Taylor

P. J.

(Eds.), Forensic psychiatry: Clinical, legal and ethical issues (2nd ed., pp. 116–125). Taylor & Francis Group.

12.

Eher

Olver

M. E.

Heurix

Schilling

Rettenberger

(2015). Predicting reoffense in pedophilic child molesters by clinical diagnoses and risk assessment. Law and Human Behavior, 39(6), 571–581. https://doi.org/10.1037/lhb0000144

13.

Eher

Rettenberger

Etzler

Eberhaut

Mokros

(2019). Eine gemeinsame Sprache für die Risikokommunikation bei Sexualstraftätern—Trenn—und Normwerte für das neue Fünf—Kategorienmodell des Static—99 [A common language for risk communication for sexual offenders—cut-off and norm values for the new five-category model of the Static-99]. Recht & Psychiatrie, 37(2), 91–99. https://www.researchgate.net/publication/331895581

14.

Eher

Rettenberger

Turner

(2019). The prevalence of mental disorders in incarcerated contact sexual offenders. Acta Psychiatrica Scandinavica, 139(6), 572–581. https://doi.org/10.1111/acps.13024

15.

Eher

Schilling

Hansmann

Pumberger

Nitschke

Habermeyer

Mokros

(2016). Sadism and violent reoffending in sexual offenders. Sexual Abuse, 28(1), 46–72. https://doi.org/10.1177/1079063214566715

16.

Falkai

Wittchen

H. U.

Döpfner

, (2018). Diagnostisches und statistisches Manual psychischer Störungen DSM-5® [Diagnostic and statistical manual of mental disorders] (5th ed.). American Psychiatric Association.

17.

Field

(2013). Logistic regression. In Carmichael

(Ed.), Discovering statistics using IBM SPSS statistics (4th ed.). Sage.

18.

Flom

(2018). Stopping stepwise: Why stepwise selection is bad and what you should use instead. Towards Data Science. https://towardsdatascience.com/stopping-stepwise-why-stepwise-selection-is-bad-and-what-you-should-use-instead-90818b3f52df

19.

Gaunersdorfer

Eher

(2022). Die prädiktive Validität der deutschsprachigen Version der VRS-SO für allgemeine Sexualdelinquenz, Kontaktsexualdelikte und Täteruntergruppen [The predictive validity of the VRS-SO German version for general sexual recidivism, contact sexual offenses and offender subgroups]. Forensische Psychiatrie, Psychologie, Kriminologie, 16(3), 231–244. https://doi.org/10.1007/s11757-022-00729-5

20.

Gaunersdorfer

Eher

(2023). Die deutschsprachige Version der Violence Risk Scale–Sexual Offense Version (VRS-SO): Prädiktive und konvergente Validität und Kalibrierung des Fünf-Kategorien-Modells [The German-language version of the VRS-SO: Predictive validity and calibration analyses of the five level risk and needs system]. Monatsschrift für Kriminologie und Strafrechtsreform, 106(4), 301–313. https://doi.org/10.1515/mks-2023-0012

21.

Gregório Hertz

Rettenberger

Turner

Briken

Eher

. (2022). Hypersexual disorder and recidivism risk in individuals convicted of sexual offenses. The Journal of Forensic Psychiatry & Psychology, 33(4), 572–591. https://doi.org/10.1080/14789949.2022.2053183

22.

Grove

W. M.

Zald

D. H.

Lebow

B. S.

Snitz

B. E.

Nelson

(2000). Clinical versus mechanical prediction: A meta-analysis. Psychological Assessment, 12(1), 19–30. https://doi.org/10.1037/1040-3590.12.1.19

23.

Haarig

Blase

Sedlmeier

(2012). Vorhersage von Rückfälligkeit bei Sexualstraftätern. Wie gut sind die Gutachten und wie könnte man sie verbessern? [Predicting recidivism of sexual offenders. How good are the assessments and how could they be improved?]. Monatsschrift für Kriminologie und Strafrechtsreform, 95(6), 392–412. https://doi.org/10.1515/mks-2012-950602

24.

Hanson

R. K.

Babchishin

K. M.

Helmus

L. M.

Thornton

Phenix

(2017). Communicating the results of criterion referenced prediction measures: Risk categories for the Static-99R and Static-2002R sexual offender risk assessment tools. Psychological Assessment, 29(5), 582–597. https://doi.org/10.1037/pas0000371

25.

Hanson

R. K.

Harris

A. J. R.

Scott

T.-L.

Helmus

(2007). STABLE-2007 [Database record]. APA PsycTests. https://doi.org/10.1037/t04644-000

26.

Hanson

R. K.

Morton-Bourgon

K. E.

(2009). The accuracy of recidivism risk assessments for sexual offenders: A meta-analysis of 118 prediction studies. Psychological Assessment, 21(1), 1–21. https://doi.org/10.1037/a0014421

27.

Harris

Phenix

Hanson

R. K.

Thornton

(2003). Static 99: Coding rules revised 2003. Solicitor General Canada.

28.

Heilbrun

Newsham

Pietruszka

(2016). Risk communication: An international update. In Singh

J. P.

Bjørkly

Fazel

(Eds.), International perspectives on violence risk assessment (pp. 150–165). Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199386291.003.0009

29.

Hosmer

D. W.

Lemeshow

Sturdivant

R. X.

(2013). Applied logistic regression (3rd ed.). John Wiley & Sons. https://doi.org/10.1002/9781118548387

30.

Jackson

R. L.

Rogers

Shuman

D. W.

(2004). The adequacy and accuracy of sexually violent predator evaluations: Contextualized risk assessment in clinical practice. International Journal of Forensic Mental Health, 3(2), 115–129. https://doi.org/10.1080/14999013.2004.10471201

31.

Johansen

S. H.

(2007). Accuracy of predictions of sexual offense recidivism: A comparison of actuarial and clinical methods. Dissertation Abstracts International: Section B: The Sciences and Engineering, 68(3-B), 1929. https://psycnet.apa.org/record/2007-99018-311

32.

Kelley

S. M.

Ambroziak

Thornton

Barahal

R. M.

(2020). How do professionals assess sexual recidivism risk? An updated survey of practices. Sexual Abuse, 32(1), 3–29. https://doi.org/10.1177/1079063218800474

33.

Kingston

D. A.

Olver

M. E.

Harris

Wong

S. C.

Bradford

J. M.

(2015). The relationship between mental disorder and recidivism in sexual offenders. International Journal of Forensic Mental Health, 14(1), 10–22. https://doi.org/10.1080/14999013.2014.974088

34.

Kunzl

Pfaefflin

(2011). Qualitätsanalyse österreichischer Gutachten zur Zurechnungsfähigkeit und Gefährlichkeitsprognose [Quality evaluation of Austrian expert witness reports on criminal accountability and risk]. Recht & Psychiatrie, 29(3), 152–159. https://www.researchgate.net/publication/287845008_Psychiatric_reports_on_insanity_and_dangerousness_in_Austria_-_A_qualitative_analysis

35.

Landis

J. R.

Koch

G. G.

(1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174. https://doi.org/10.2307/2529310

36.

Långström

Sjöstedt

Grann

(2004). Psychiatric disorders and recidivism in sexual offenders. Sexual Abuse: A Journal of Research and Treatment, 16(2), 139–150. https://doi.org/10.1177/107906320401600204

37.

Mann

R. E.

Hanson

R. K.

Thornton

(2010). Assessing risk for sexual recidivism: Some proposals on the nature of psychologically meaningful risk factors. Sexual Abuse, 22(2), 191–217. https://doi.org/10.1177/1079063210366039

38.

Matthes

Rettenberger

(2008). Die Deutsche Version des Stable-2007 [The German version of the Stable-2007]. Institut für Gewaltforschung und Prävention.

39.

Meehl

P. E.

(1954). Clinical versus statistical prediction: A theoretical analysis and a review of the evidence. University of Minnesota Press. https://doi.org/10.1037/11281-000

40.

Müller-Isberner

Gonzales Cabeza

Eucker

(2000). Die Vorhersage sexueller Gewalttaten mit dem SVR-20 [Manual for the Sexual Violence Risk – 20, Version 2: Professional guidelines for assessing risk of sexual violence]. Institut für Forensische Psychiatrie.

41.

Müller-Isberner

Webster

C. D.

(1998). Die Vorhersage von Gewalttaten mit dem HCR 20: In der modifizierten und adaptierten Übersetzung der kanadischen Originalversion [Predicting violent offending with the HCR-20]. Institut für Forensische Psychiatrie.

42.

Murrie

D. C.

Boccaccini

M. T.

Guarnera

L. A.

Rufino

K. A.

(2013). Are forensic experts biased by the side that retained them? Psychological Science, 24(10), 1889–1897. https://doi.org/10.1177/0956797613481812

43.

Neal

Brodsky

S. L.

(2016). Forensic psychologists’ perceptions of bias and potential correction strategies in forensic mental health evaluations. Psychology, Public Policy, and Law, 22(1), 58–76. https://doi.org/10.1037/law0000077

44.

Neal

Grisso

(2014). The cognitive underpinnings of bias in forensic mental health evaluations. Psychology, Public Policy, and Law, 20(2), 200–211. https://doi.org/10.1037/a0035824

45.

Nedopil

(2013). Prognosen in der forensischen Psychiatrie: Ein Handbuch für die Praxis [Risk assessment in forensic psychiatry: A handbook for practice] (4th ed.). Pabst Science Publishers.

46.

Nicholls

T. L.

Pritchard

M. M.

reeves

K. A.

Hilterman

(2013). Risk assessment in intimate partner violence: A systematic review of contemporary approaches. In Partner Abuse (Vol. 4, pp. 76–168). https://doi.org/10.1891/1946-6560.4.1.76

47.

Oberlader

Verschuere

(2025). Bias is persistent-sequencing case information does not protect against contextual bias in criminal risk assessment. PsyArXiv, 30, 143–158. https://doi.org/10.31234/osf.io/4hzcv

48.

Olver

M. E.

Wong

S. C.

Nicholaichuk

Gordon

(2007). The validity and reliability of the Violence Risk Scale-Sexual Offender version: Assessing sex offender risk and evaluating therapeutic change. Psychological Assessment, 19(3), 318–329. https://doi.org/10.1037/1040-3590.19.3.318

49.

Quinsey

V. L.

Harris

G. T.

Rice

M. E.

Cormier

C. A.

(2006). Violent offenders: Appraising and managing risk. American Psychological Association.

50.

Rettenberger

(2018). Intuitive, klinisch-idiographische und statistische Kriminalprognosen im Vergleich–die Überlegenheit wissenschaftlich strukturierten Vorgehens [Intuitive, clinical-idiographic, and statistical criminal risk assessment compared–the superiority of scientifically structured approaches]. Forensische Psychiatrie, Psychologie, Kriminologie, 12(1), 28–36. https://doi.org/10.1007/s11757-017-0463-y

51.

Rettenberger

(2019). Instrumente und Methoden zur Einschätzung des Rückfallrisikos [Instruments and methods for assessing the risk of reoffending]. In Eusterschulte

Eucker

Born

(Eds.), Forensische Psychiatrie zwischen Wissenschaft und Praxis: Festschrift für Rüdiger Müller-Isberner (pp. 19–39). Medizinisch-Wissenschaftliche Verlagsgesellschaft.

52.

Rettenberger

Briken

Turner

Eher

(2015). Sexual offender recidivism among a population-based prison sample. International Journal of Offender Therapy and Comparative Criminology, 59(4), 424–444. https://doi.org/10.1177/0306624X13516732

53.

Rettenberger

Eher

(2006). Actuarial assessment of sex offender recidivism risk: A validation of the German version of the Static-99. Sexual Offender Treatment, 1(3), 1–11. https://www.researchgate.net/publication/26585486_Actuarial_Assessment_of_Sex_Offender_Recidivism_Risk_A_Validation_of_the_German_version_of_the_Static-991

54.

Rettenberger

Eher

(2016). Potenzielle Fehlerquellen bei der Erstellung von Kriminalprognosen, die gutachterliche Kompetenzillusion und mögliche Lösungsansätze für eine bessere Prognosepraxis [Potential sources of error in the preparation of risk assessments, the expert witness illusion of competence, and possible solutions for better prediction practice]. Recht & Psychiatrie, 34(1), 50–57. https://www.researchgate.net/publication/299489078_Potenzielle_Fehlerquellen_bei_der_Erstellung_von_Kriminalprognosen_die_gutachterliche_Kompetenzillusion_und_mogliche_Losungsansatze_fur_eine_bessere_Prognosepraxis

55.

Rice

M. E.

Harris

G. T.

(2005). Comparing effect sizes in follow-up studies: ROC Area, Cohen’s d, and r. Law and Human Behavior, 29(5), 615–620. https://doi.org/10.1007/s10979-005-6832-7

56.

Rossegger

Urbaniok

Danielsson

Endrass

(2009). Der Violence Risk Appraisal Guide (VRAG)–ein Instrument zur Kriminalprognose bei Gewaltstraftätern [The Violence Risk Appraisal Guide (VRAG)—A tool for the risk assessment of violent offenders: Review and authorized German translation]. Fortschritte der Neurologie Psychiatrie, 77(10), 577–584. https://doi.org/10.1055/s-0028-1109705

57.

Seto

M. C.

Augustyn

Roche

K. M.

Hilkes

(2023). Empirically-based dynamic risk and protective factors for sexual offending. Clinical Psychology Review, 106, 102355. https://doi.org/10.1016/j.cpr.2023.102355

58. Skeem, J. L., & Monahan (2011). Current directions in violence risk assessment. Current Directions in Psychological Science, 20(1), 38–42. https://doi.org/10.1177/0963721410397271

59. Turgut, Lagace, Izmir, & Dursun (2006). Assessment of Violence and Aggression in Psychiatric Settings: Descriptive Approaches [Psikiyatri kliniklerinde şiddet ve agresyonun değerlendirilmesi: Tanisal yaklaşimlar]. Klinik Psikofarmakoloji Bulteni, 16(3), 179–194. https://psycnet.apa.org/record/2006-22006-005

60. Webster, Douglas, K. S., Eaves, & Hart (1997). Assessing risk for violence (Version 2). Simon Fraser University.

61. Wertz, Hank, Hausam, Konrad, Schiltz, Imhoff, & Rettenberger (2022). The use and reporting practice of psychological tests in German risk and criminal responsibility expert reports. Psychology, Crime & Law, 30, 68–85. https://doi.org/10.1080/1068316X.2022.2063286

62. Wertz, Kury, & Rettenberger (2018). Umsetzung von Mindestanforderungen für Prognosegutachten in der Praxis [Implementation of the minimum requirements for risk assessment reports in practice]. Forensische Psychiatrie, Psychologie, Kriminologie, 12(1), 51–60. https://doi.org/10.1007/s11757-017-0458-8

63. Wertz & Rettenberger (2021). Die Verwendung standardisierter Prognoseinstrumente in der Begutachtungspraxis: Empirische Erkenntnisse zur Häufigkeit und Risikokommunikation in Abhängigkeit gutachten- und probandenbezogener Merkmale [The use of standardized risk assessment instruments in the practice of risk assessment: Empirical findings on frequency and risk communication as a function of assessment- and subject-related characteristics]. Forensische Psychiatrie und Psychotherapie, 28, 241–261.

64. Wertz, Schiltz, Imhoff, & Rettenberger (2020). Der Einfluss des richterlichen Auftrags auf die Qualität der Arbeit von Sachverständigen im Rahmen der Prognosebegutachtung [The influence of the judicial order on the quality of the work of expert witnesses in the context of risk assessment]. Recht & Psychiatrie, 38(4), 193–200. https://doi.org/10.1486/RP-2020-04_193

65. Wong, S. C., Olver, M. E., Nicholaichuk, T. P., & Gordon (2003). Violence Risk Scale: Sexual Offender version (VRS:SO). Regional Psychiatric Centre and University of Saskatchewan.

66. World Health Organization (WHO). (2016). International statistical classification of diseases and related health problems, tenth revision (ICD-10). WHO.

Supplementary Material

Please find the following supplemental material available below.

For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.

For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
