Abstract
We present an evaluation of a strength-based, cognitive-behavioral therapy program provided to individuals with a serious mental illness who had committed a sexual offense. Utilizing an intent-to-treat design, individuals who participated in treatment were compared with a group of untreated men on treatment-relevant measures and recidivism. Individuals who completed treatment demonstrated greater change in perceptions of working alliance and dynamically assessed risk to re-offend compared to noncompleters. During the 18-month fixed follow-up period, 3.7% re-offended sexually, 20.4% re-offended violently, and 39.8% re-offended generally. After controlling for baseline risk, participation in treatment was significantly associated with an approximate two-thirds decrease in the hazard of future violent (including sexual) recidivism. High-risk, untreated men in our sample showed significantly higher rates of violent recidivism than the treated groups. Results support the utility of a strength-based approach among men residing in an institutional setting and presenting with a serious mental illness.
The effectiveness of treatment for individuals convicted of sexual offending and the methods used to evaluate it have been subject to considerable debate (Marshall et al., 2011; Seto, 2005). Indeed, since the early reviews of correctional-based treatments and the idea that “nothing works” (Martinson, 1974), there have been more than 20 systematic reviews and meta-analyses of the sexual offending treatment literature (e.g., Gannon et al., 2019). Despite some initial weak and contradictory findings (Furby et al., 1989; Långström et al., 2013), most investigations have shown at least some association between treatment participation and reduced recidivism rates among those who have committed a sexual offense. In one of the most recent quantitative reviews, Holper and colleagues (2023) updated a previous meta-analysis (Schmucker & Lösel, 2017) to include a total of 37 samples and 30,394 participants. Results indicated a statistically significant treatment effect for sexual recidivism. Overall, participation in treatment was associated with a 31.8% relative reduction in sexual recidivism with a 9.3% sexual recidivism rate among those in the treated groups versus a 13.6% sexual recidivism rate in the control groups.
Research evaluating treatment outcome among those who have committed sexual offenses is promising; however, there is substantial variability across studies with regard to reduced recidivism. Such variability is likely due, at least in part, to the diversity in both the methodological approaches used in the evaluation and treatment delivery (Lösel et al., 2020; Yates & Kingston, 2021).
Methodologically rigorous designs that employ randomized control have often produced smaller effects than those based on more quasi-experimental methods (Holper et al., 2023; also see Marques et al., 2005). The difficulty in implementing such designs has been noted (Marshall & Marshall, 2007), but they remain a widely accepted standard in evaluating treatment outcomes (Seto et al., 2008), and the fact that more positive outcomes have been based on less rigorous designs has led some to question the efficacy of treatment for sexual offending (Rice & Harris, 2003).
Adherence to effective correctional interventions has also contributed to variations in treatment outcome. The Risk-Need-Responsivity model (RNR; Andrews & Bonta, 2010) is the most established rehabilitation framework used with individuals who have committed sexual offenses (Hanson et al., 2009). RNR highlights that treatment is most effective when programs: (a) match service intensity to risk level, targeting clients who are at moderate to high risk to reoffend (i.e., the risk principle); (b) target changeable risk factors that are empirically associated with recidivism for risk-reduction treatment services (i.e., the need principle); and (c) utilize methods that are primarily based on cognitive-behavioral/social learning approaches and, importantly, adapt the service delivery in such a manner as to ensure maximum benefit for individuals based on their own circumstances and capabilities (i.e., the responsivity principle).
A central tenet of RNR is the identification of criminogenic needs and the specific targeting of these factors in treatment (Rettenberger & Eher, 2024). Criminogenic needs are generally well established, but some factors, such as denial, victim empathy, and serious mental illness (SMI), are included in treatment programs despite weak or questionable associations with recidivism (Mann et al., 2010).
The relationship between SMI and recidivism remains a contentious issue. Some subscribe to the psychopathological model and argue that untreated mental illness is a precursor to criminal behavior and therefore is an established criminogenic need (Douglas et al., 2009). Others, however, have argued for a general personality and cognitive social learning perspective (Andrews & Bonta, 2010) noting that SMI is not directly related to sexual and violent recidivism, particularly within correctional or forensic mental health samples (Bonta et al., 2014; Kingston & Olver, 2018; Kingston et al., 2015) and that the pertinent risk factors are generally shared across those with and without an SMI (Olver & Kingston, 2019; Skeem et al., 2014, 2015).
Despite the increasing number of studies that have shown SMI to be unrelated to recidivism in correctional and forensic mental health samples, there have been some notable exceptions. Långström et al. (2004), for example, found a significant association between psychiatric disorders including psychosis and sexual recidivism in a large and relatively unselected Swedish sample of individuals who had committed a sexual offense. More recently, Babchishin et al. (2025) examined a large sample of Swedish men who were convicted or suspected of committing a sexual offense and similarly found associations between several psychiatric and neurological conditions and sexual offending. This has underscored the notion that SMI may be criminogenic for some individuals, in some situations (Skeem et al., 2004). Irrespective of whether SMI is conceptualized as risk relevant for an individual or deemed important for responsivity, treatment that addresses this issue is important.
Responsivity has often been overlooked or minimized in relation to the other two primary principles, despite being an important component in the delivery of correctional programs. Indeed, cognitive-behavioral and multi-systemic paradigms (for adolescents who have offended sexually), for example, have generally shown the strongest effects when compared to other modalities (Gannon et al., 2019). In addition to evidence-based modalities, some have noted that the delivery of the service is of critical importance (Yates & Kingston, 2021). More specifically, increasing attention has been drawn toward utilizing a strength-based approach (SBA) to treatment.
SBAs are based on the fundamental principles of positive psychology via the emphasis on enhancing client strengths and maximizing human potential (Seligman, 2019). In applying these principles to the treatment of sexual offending, Marshall and colleagues (2011) have noted that interventions should emphasize the identification and facilitation of client strengths in addition to treating observed areas of deficit by way of integrating RNR with a more explicit focus on SBAs. Deficits are primarily addressed via the enhancement of prosocial skills and strategies dedicated to the pursuit of prosocial goals rather than focusing solely on things to manage or potentially avoid. This approach has also been embedded in rehabilitation theories, such as the Good Lives Model (GLM; Ward, 2002).
Therapist characteristics and the associated therapeutic alliance (Baier et al., 2020; Marshall et al., 2011) are integral parts of SBA. It is well documented that a confrontational and punitive approach to treatment results in problematic outcomes (Ackerman & Hilsenroth, 2001). In contrast, a positive therapeutic alliance has been associated with improved outcomes (Martin et al., 2000), particularly when it is measured from the client’s point of view (Duff & Bedi, 2010). Not surprisingly, therapists who exhibit characteristics such as warmth, empathy, and directiveness (among other traits) tend to be involved in programs that have greater therapeutic alliance, and such traits have been specifically associated with positive outcomes found in substance use (Meier et al., 2005), as well as violent and sexual offending interventions (Beech & Hamilton-Giachritsis, 2005).
As noted earlier, several studies and meta-analytic reviews have suggested that the right type of treatment can have a meaningful impact on reductions in sexual recidivism. Moreover, studies have examined important process and responsivity issues, such as the utility of an SBA, several of which have focused on the GLM. For example, Harkins and colleagues (2012) compared attrition rates, treatment change, and overall program impressions between a GLM and a traditional relapse prevention program. Results demonstrated no differences in attrition rates or actual change between the programs, although the GLM-based program was viewed as more positive and future-oriented in nature than the relapse prevention program. Unfortunately, there have been relatively few studies examining the benefit of GLM on actual reductions in recidivism rates (Yates & Kingston, 2021).
In an examination of treatment outcome, Olver, Marshall and colleagues (2020) compared two prison-based CBT sexual offending programs operated by the Correctional Service of Canada. Both programs were RNR-based but differed in the extent to which they were informed by a risk management perspective versus an SBA. A comparison group of untreated men sentenced for sexual offenses to federal institutions across Canada was also included. Results indicated that both treatment programs exhibited lower rates of sexual and violent recidivism compared to the no-treatment control group, but that the SBA program generated the lowest recidivism rates of all three groups. In the 8-year fixed follow-up, 20% of participants in the no-treatment condition committed a sexual re-offense compared to 10% of those who participated in the CBT-based prison program. Only 4% of participants in the SBA re-offended with a sexual offense within 8 years. Relative to the no-treatment control condition, the SBA program showed a 75%–81% reduction in sexual recidivism depending on risk level.
Unfortunately, fewer treatment outcome studies have been conducted among men with an SMI. Programs have largely been divided between those that focus on pharmacological interventions and CBT approaches for mental health and those that incorporate effective correctional interventions and criminogenic needs. Studies point toward a somewhat unidirectional treatment effect, such that targeting mental health needs impacts mental health outcomes, whereas targeting risk-relevant needs has greater associations with reduced recidivism (Morgan et al., 2012; Parisi et al., 2022). Interestingly, some programs have been developed that address both mental health and criminogenic needs with some promising results (see Morgan et al., 2014).
In this study, we add to the treatment outcome literature by examining the utility of an SBA program that adheres to core correctional practices among a sample of individuals in a secure treatment setting who present with an SMI and who have committed a sexual offense.
Method
Participants and Treatment Program
The 207 participants comprising the present study sample were admitted to the Secure Treatment Unit (STU), Royal Ottawa Health Care Group, a 100-bed hybrid provincial correctional and mental health treatment facility for men. Full ethical approval was obtained from the Royal Ottawa Health Care Group and the Ministry of Community Safety and Correctional Services. Individuals were referred from 25 correctional facilities across Ontario, Canada after being held criminally culpable for their offenses, but were admitted for assessment and treatment services due to the presence of SMI or symptomatology that is consistent with a diagnosis. Table 1 presents more detailed demographic, criminal history, offense profile, and diagnostic information for both the treatment (n = 116) and control (n = 91) group samples (described below).
Table 1. Group Comparisons on Study Demographic, Historical, Clinical, Risk, and Outcome Measures
Note. Total N = 207, while remaining ns vary by condition and measure and are provided in the table above; SRG = Self-Regulation Group Treatment; t/d statistics are reported for continuous measures, χ2/φ statistics for binary measures; LSI-OR = Level of Service Inventory-Ontario Revision; BIPOC = Black, Indigenous, and Other Persons of Color.
***p ≤ .001, **p ≤ .01, *p ≤ .05.
Although designed as a hybrid correctional and mental health facility, the STU predominantly focuses on mental health recovery, given that this is the primary reason for which individuals are referred. There has been a shift, however, toward providing more correctional-based programming such that individuals can receive treatment for anger management, domestic violence, antisocial attitudes, substance abuse, and sexual offending.
Individuals with a history of sexual offending are referred to the Self-Regulation Program (SRG) which was based on the Rockwood Psychological Services (RPS) program (Marshall et al., 2011). These programs have been described in detail elsewhere (Marshall et al., 2011; Olver, Marshall, et al., 2020). Briefly, SRG consists of three phases; that is, (a) engagement, (b) addressing criminogenic needs, and (c) life-enhancement/self-management. Emphasis is placed on the SBA philosophy in that the focus is on client strengths and procedural features of treatment, such as the therapeutic alliance and effective therapist characteristics.
The duration of SRG is approximately 4 months and involves some didactic instruction with an embedded skills-based approach to cognitive, affective, and behavioral change. SRG is primarily a process-oriented program based on a series of client assignments and associated group discussions; primary modules include autobiography, offense disclosure, offense analyses (background and immediate factors), risk awareness and management, and release planning. Within these assignments, various discussion topics and specific interventions are employed that address clients’ strengths and areas of need, such as mood management and coping skills, attitudes and cognitions that support or condone sexually abusive behavior, sexual and behavioral self-regulation, attachment, and intimacy/loneliness.
Reasonable efforts were made to address threats to program fidelity. That is, the program adheres to RNR (e.g., excludes individuals who were found to be very low or below average risk for sexual recidivism). The program is manualized and run by a facilitator who was trained by the program developers. However, RPS developers were not involved in either the direct service delivery or this evaluation. A licensed psychologist provided ongoing, weekly supervision to further support program fidelity.
Control Group
Individuals are referred to the STU so that SMI can be assessed and treated. As such, the length of stay at this institution is contingent upon when the referral was received and processed. As a result, some individuals were appropriate for participation in SRG due to a history of sexual offending and their risk level but did not have the requisite amount of time (i.e., 4 months) to participate. These individuals served as our primary comparison or control group. Although they did not participate in SRG, these participants would likely have received other treatment focused on mental health. As such, the characteristics of this sample and the presence of SMI are notable study strengths.
Risk Assessment Measures
Level of Service Inventory-Ontario Revision
The Level of Service Inventory-Ontario Revision (LSI-OR) (Andrews et al., 1995) includes 43 items organized around the central eight risk factors; total scores fall into one of five risk bands: Very Low (0–4), Low (5–10), Medium (11–19), High (20–29), and Very High (30–43). Olver et al. (2024) demonstrated the LSI tools produced moderate effect sizes in Indigenous (d = 0.53) and non-Indigenous samples (d = 0.69) with similar predictive accuracy among those with an SMI (Olver & Kingston, 2019; area under the curve [AUC] = .69). LSI-OR ratings were available for n = 109 participants.
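For readers who wish to reproduce the banding, the cutoffs listed above can be expressed as a simple lookup. The following is a minimal sketch only; the function and variable names are illustrative assumptions and are not part of the LSI-OR scoring materials.

```python
# Risk bands for LSI-OR total scores, using the cutoffs stated in the text.
# Names (LSI_OR_BANDS, lsi_or_band) are illustrative, not from the instrument.
LSI_OR_BANDS = [
    (0, 4, "Very Low"),
    (5, 10, "Low"),
    (11, 19, "Medium"),
    (20, 29, "High"),
    (30, 43, "Very High"),
]


def lsi_or_band(total_score: int) -> str:
    """Return the risk band label for an LSI-OR total score (0-43)."""
    for low, high, label in LSI_OR_BANDS:
        if low <= total_score <= high:
            return label
    raise ValueError(f"LSI-OR total score out of range: {total_score}")


if __name__ == "__main__":
    for score in (3, 15, 24, 36):
        print(score, lsi_or_band(score))
```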
Static-99R
Static-99R (L. Helmus et al., 2012) is a 10-item static actuarial sexual offense risk tool. The item content comprises information regarding sexual and nonsexual criminal history as well as offense and victim demographics. Possible scores range from −3 to 12 with increasing scores representing a higher risk to sexually reoffend. A recent meta-analysis found the Static-99R to be a moderate predictor of sexual recidivism (AUC = .69; L. M. Helmus et al., 2022). Static-99R ratings were available for n = 154 participants.
Violence Risk Scale-Sexual Offense Version
The Violence Risk Scale-Sexual Offense version (VRS-SO) (Wong et al., 2003/2017) is a 24-item risk assessment and treatment planning tool designed to assess sexual violence risk and need, and to assess changes in risk from treatment or other change agents. The VRS-SO consists of 7 static and 17 dynamic items grouped into five risk bands of Very Low (Level I), Below Average (Level II), Average (Level III), Above Average (Level IVa), and Well Above Average (Level IVb) risk. Change is operationalized through a modified application of the Stages of Change (SOC) model (Prochaska et al., 1992), which posits that individuals move along a continuum of five stages involving cognitive, experiential, and behavioral changes as they attempt to remediate problem areas. Change ratings are summed across items to yield a change score, which is subtracted from the pretreatment score to yield a posttreatment score. The VRS-SO has correlated significantly with other risk-relevant measures (Olver et al., 2018), and a recent meta-analysis has shown moderate to large effect sizes in the prediction of sexual and violent recidivism (Olver et al., 2024). In a sample of individuals housed in a psychiatric facility, Eher et al. (2020) showed that the VRS-SO predicted both sexual recidivism (AUC = .71) and reimprisonment (AUC = .65). VRS-SO ratings were available for up to n = 115 participants (i.e., given that not all individuals with pretreatment ratings had posttreatment ratings).
Treatment Need Measures
Aggression Questionnaire
The Aggression Questionnaire (AQ; Buss & Perry, 1992) is a 29-item self-report measure of the interpersonal and affective components of aggression. Results of factor analysis organized the items into four subscales: Physical Aggression, Verbal Aggression, Anger, and Hostility. All items can be summed to yield a total score ranging from 29 to 145. Higher scores are indicative of greater levels of anger/hostility. In a sample of 1,253 men and women, the AQ demonstrated good internal consistency (subscale α = .72 to .85, total score α = .89) and test–retest reliability over a 9-week interval (subscale r = .72 to .80, total score r = .80) (Buss & Perry, 1992). Evidence for the construct validity of the AQ has also been obtained among correctional samples (Diamond & Magaletta, 2006). AQ scores were available for up to n = 100 participants.
Difficulties in Emotion Regulation Scale
The Difficulties in Emotion Regulation Scale (DERS; Gratz & Roemer, 2004) is a 36-item self-report scale that measures emotional regulation problems in adults. All items are scored on a five-point Likert-type scale which indicates the frequency of the behavior described in each item. The DERS includes six subscales including nonacceptance of emotional responses, difficulty engaging in goal-directed behavior, impulse control difficulties, lack of emotional awareness, limited access to emotion regulation strategies, and lack of emotional clarity. The DERS has shown excellent internal consistency (α = .92) and good test–retest reliability across a 9-week interval (Dan-Glauser & Scherer, 2012). DERS scores were available for up to n = 98 participants.
Hypersexual Behavior Inventory
The Hypersexual Behavior Inventory (HBI; Reid et al., 2011) is a 19-item, self-report measure that assesses features of hypersexuality and sexual preoccupation. Scores can range from 19 to 95; a score of 53 is considered clinically significant. The HBI demonstrated convergent validity with other measures of hypersexuality and related constructs. Internal consistency was high in the initial validation sample (α = .89–.95) and in a subsequent field trial (α = .96; Reid et al., 2011). HBI scores were available for up to n = 99 participants.
Rape and Molest Scales
The Rape and Molest Scales (Bumby, 1996) are self-report measures of maladaptive cognitions that support sexual offending against adults and children, respectively. Items for each are scored on a 4-point Likert-type scale ranging from 1 (strongly disagree) to 4 (strongly agree). Higher scores are indicative of greater endorsement of offense-supportive attitudes. The Rape Scale consists of 36 items (possible scores 36–144), and the Molest Scale has 38 items (possible scores 38–152). Results have supported the convergent validity of the tool as well as excellent internal consistency (α > .95; Arkowitz & Vess, 2003). RAPE scores were available for up to n = 96 participants, and MOLEST scores, n = 97 participants.
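The possible score ranges reported for the summed self-report measures follow directly from the number of items and the rating scale. The sketch below checks that arithmetic. The 1 to 4 rating scale for the Rape and Molest Scales is stated in the text; the 1 to 5 rating scale for the AQ and HBI is an assumption inferred from their reported ranges, and the function name is illustrative.

```python
def score_range(n_items: int, min_rating: int, max_rating: int) -> tuple[int, int]:
    """Lowest and highest possible total for a scale whose items are summed."""
    return n_items * min_rating, n_items * max_rating


# (items, lowest rating, highest rating); the AQ and HBI rating bounds are
# inferred from the reported ranges, not stated in the text.
SCALES = {
    "AQ": (29, 1, 5),
    "HBI": (19, 1, 5),
    "Rape Scale": (36, 1, 4),
    "Molest Scale": (38, 1, 4),
}

if __name__ == "__main__":
    for name, spec in SCALES.items():
        low, high = score_range(*spec)
        print(f"{name}: {low}-{high}")
```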
Working Alliance Inventory-Short Form
The Working Alliance Inventory-Short Form Revised (WAI-SF; Hatcher & Gillaspy, 2006) is a 12-item measure of the therapeutic alliance that assesses (a) agreement on the tasks of therapy, (b) agreement on the goals of therapy, and (c) development of an affective bond. The WAI-SF has two versions, one for clients and one for therapists. Clients and therapists rate items on a 5-point Likert-type scale anchored at each end with “rarely or never” (1) and “always” (5). Higher scores are indicative of a stronger therapeutic alliance. Reported internal consistency for the total scale score has been excellent (α > .90; Munder et al., 2010; Paap et al., 2019). Client and therapist WAI-SFs are administered in SRG on two occasions, initially around the time of their third or fourth session, then again when they have completed the program. WAI-SF therapist ratings (WAI-SF-T) were available for up to n = 65 participants, and client ratings (WAI-SF-C) for n = 62 participants.
Diagnosis
All possible diagnoses were included in the analyses and were coded as present or not present based on current presentation. Diagnoses were based on criteria outlined in the Diagnostic and Statistical Manual of Mental Disorders (American Psychiatric Association, 2000) and were assigned by the intake psychiatrist at this facility. It was not possible to evaluate inter-rater reliability. Diagnoses were collapsed into relevant binary categories including the presence of Schizophrenia or other Psychotic Disorder, Mood Disorder, Anxiety Disorder, Substance Use Disorder (SUD), Personality Disorder, and Paraphilic Disorder. The categories were not mutually exclusive, and participants could have multiple diagnoses.
Recidivism
Outcome data were retrieved on February 16, 2023, from the Offender Tracking and Information System (OTIS) used by the province’s Ministry of Community Safety and Correctional Services, the same tracking system used for the Level of Service/Case Management Inventory normative sample. In this study, recidivism was defined as a return to provincial correctional supervision on a new charge or conviction. Owing to the exceptionally small number of sexual recidivists, sexual recidivism, which included any offense classified as sexual in nature from the criminal records (e.g., sexual assault, sexual interference), was collapsed into a broader violent recidivism variable for most outcome analyses. As such, primary recidivism variables examined included violent recidivism, which included any new charge or conviction for crimes against the person (e.g., assault, homicide, threats, sexual offenses, and so on), and general recidivism, which included any new criminal charge or conviction.
Planned Analyses
Analyses were conducted with the purpose of interrogating treatment and comparing group differences to inform treatment outcome analyses as well as to identify critical risk-relevant covariates that could influence outcome. All analyses were conducted using SPSS for Windows v. 28. Missing data are endemic to treatment outcome field research, and as such, case-wise deletion was used to handle missing data on study variables, resulting in some fluctuation in the cell n for treatment and comparison groups across analyses.
First, we conducted a series of SRG treatment group and control group comparisons on demographic, criminal history, clinical/diagnostic, baseline risk (continuous Static-99R and LSI-OR scores), and outcome measures via t-test (for continuous measures) or chi-square (for dichotomous measures); measures of effect size in the form of standardized mean difference (Cohen’s d) for continuous variables, or phi (φ) for binary variables, were also computed. The d magnitudes were interpreted per Cohen’s (1992) conventions of .20 small, .50 medium, and .80 large, while φ magnitudes were interpreted per Rice and Harris’s (2005) guidelines for point biserial correlations of .10 small, .24 medium, and .37 large. Given that the treatment condition employed an intent-to-treat design, and thus included both completers and noncompleters, a set of follow-up analyses was conducted comparing the SRG treatment completers to those who dropped out or failed to complete the group.
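The two effect size measures and their interpretive thresholds can be illustrated as follows. This is a minimal sketch, not the SPSS procedure used in the study: it assumes the pooled-standard-deviation form of Cohen's d and the standard 2 × 2 formula for φ, and all function names and example values are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table of counts laid out as [[a, b], [c, d]]."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator


def label_magnitude(value, small, medium, large):
    """Label an effect size by its absolute value against three thresholds."""
    size = abs(value)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "below small"


if __name__ == "__main__":
    # Hypothetical scores for two groups (not study data).
    d = cohens_d([2, 4, 6], [1, 3, 5])
    print(d, label_magnitude(d, 0.20, 0.50, 0.80))
    # Hypothetical 2 x 2 counts (not study data).
    phi = phi_coefficient(12, 8, 6, 14)
    print(phi, label_magnitude(phi, 0.10, 0.24, 0.37))
```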
The second set of analyses compared treatment completers, noncompleters, and comparison controls on dynamically assessed study measures, specifically, continuously scored psychometric measures that were administered pre and posttreatment (RAPE and MOLEST scales, AQ, DERS, HBI) as well as therapeutically informed measures such as the WAI-SF client and therapist ratings and continuous VRS-SO static, dynamic, and total scores. Between-group comparisons were conducted via independent samples t-tests while within-group comparisons (to assess pre-post change) were conducted using a matched samples t-test; in both instances, again, Cohen’s d was computed to provide a measure of effect size documenting the magnitude of between group differences or the amount of within group change. Of note, not all measures could be completed at posttreatment or an equivalent time 2 interval, particularly for individuals who dropped out of the SRG program or for individuals who were not referred and comprised the comparison condition. Observed posttreatment ratings were utilized and compared to their respective pretreatment ratings (for change analyses) or compared across treatment completer, noncompleter, and comparison control groups. Although the VRS-SO Users’ Workbook (Olver, Kelley, et al., 2020) provides a series of instructions for imputing change scores to obtain a time 2 estimate when a direct assessment of change is not available (e.g., assigning a default change score of zero for treatment dropouts or refusers), given the purpose of the present analyses to compare subgroups on the amount of directly rated change, we did not impute change scores for group comparisons in the present study. The full psychometrics of VRS-SO scores, using change score imputations where applicable, is beyond the scope of the present study and will be presented in a separate report to follow.
Third, we examined the associations between measures of working alliance assessed via the WAI-SF client and therapist-rated versions with psychometric and other therapeutically informed measures of change. The purpose of these analyses was to examine to what extent the working alliance was associated with treatment change as well as the concordance between client and therapist ratings of the alliance on this measure. Pearson correlations were computed between the WAI-SF client and therapist scores (pre, post, and change) with change measures; the WAI-SF measures were also intercorrelated. Cohen’s (1992) conventions for interpreting correlation magnitudes between two continuous variables were employed corresponding to .10 small, .30 medium, and .50 large (Rice & Harris, 2005).
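As an illustration of the correlational analyses just described, the sketch below computes a Pearson correlation between two sets of change scores and labels its magnitude using the .10, .30, and .50 conventions. It is a minimal example under stated assumptions: change is taken as the posttreatment score minus the pretreatment score, and the names and data are hypothetical rather than drawn from the study.

```python
import math


def change_scores(pre, post):
    """Post minus pre for each participant (an assumed direction of scoring)."""
    return [after - before for before, after in zip(pre, post)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def label_correlation(r):
    """Label |r| using the .10 small, .30 medium, .50 large conventions."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below small"


if __name__ == "__main__":
    # Hypothetical alliance and symptom ratings (not study data).
    alliance_change = change_scores([40, 42, 38, 45], [48, 44, 47, 50])
    symptom_change = change_scores([90, 85, 88, 80], [80, 84, 75, 74])
    r = pearson_r(alliance_change, symptom_change)
    print(round(r, 2), label_correlation(r))
```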
The remaining analyses examined recidivism as an outcome variable. Owing to the exceptionally low base rate of sexual recidivism and the constrained sample size for outcome analyses, given that just a little more than half of the treatment and eligible controls had been released, we limited treatment outcome analyses to violent (including sexual) and general (i.e., any, all) recidivism. As the LSI-OR was completed at baseline on all participants and provides a comprehensive assessment of risk and need, this was the primary control measure considered and comparative outcome analyses were conducted for the LSI-OR total and need scores, along with other study measures. For space considerations, we reported only the substantive findings for continuous LSI-OR scores in the prediction of binary violent and general recidivism employing fixed 18-month follow-ups, the longest minimum follow-up available to retain the majority of the sample and capture most recidivism. AUC statistics were computed through ROC analyses in which corresponding values from 0 to 1.0 were generated, representing the probability that a randomly selected recidivist has a higher score than a randomly selected nonrecidivist, which can be interpreted in terms of .56 small, .64 medium, and .71+ large.
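The probabilistic interpretation of the AUC given above can be computed directly by comparing every recidivist with every nonrecidivist. The following is a minimal sketch of that pairwise definition, not the ROC routine used in the study; counting ties as one half is a common convention assumed here, and the names and scores are hypothetical.

```python
def auc(recidivist_scores, nonrecidivist_scores):
    """Probability that a random recidivist outscores a random nonrecidivist.

    Ties are counted as one half (an assumed, commonly used convention).
    """
    wins = 0.0
    for r in recidivist_scores:
        for n in nonrecidivist_scores:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(recidivist_scores) * len(nonrecidivist_scores))


def label_auc(value):
    """Label an AUC using the .56 small, .64 medium, .71+ large benchmarks."""
    if value >= 0.71:
        return "large"
    if value >= 0.64:
        return "medium"
    if value >= 0.56:
        return "small"
    return "below small"


if __name__ == "__main__":
    # Hypothetical LSI-OR totals (not study data).
    recidivists = [25, 30, 18]
    nonrecidivists = [12, 20, 25]
    value = auc(recidivists, nonrecidivists)
    print(round(value, 3), label_auc(value))
```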
The treatment outcome analyses employed the LSI-OR as the primary means of controlling for baseline risk to account for risk-relevant group differences impacting the outcome variable, as will be detailed in the results section below. To examine the association between SRG treatment participation and outcome, employing an intent-to-treat design, Cox regression survival analyses were conducted entering continuous LSI-OR total scores and binary treatment-no-treatment status as covariates, to examine their unique associations with violent or general recidivism over time. Cox regression adjusts and controls for individual differences in follow-up time and generates a hazard ratio (e^B) representing the multiplicative change in the hazard of an unwanted outcome (e.g., recidivism) per one-unit change in the predictor; values below 1.0 represent associations with decreased recidivism, while values above 1.0 represent associations with increased recidivism. A set of logistic regressions followed utilizing the same predictors but examining their unique associations with fixed 18-month violent or general recidivism; the purpose here was to examine the amount of risk reduction associated with SRG treatment group participation. Logistic regression can model rates of recidivism associated with specific scores or combinations of predictors, for instance, the rates of violent recidivism associated with increasing LSI-OR scores among those who participated in the SRG program versus those who did not; the logistic function can then be used to generate recidivism estimates for a given combination of covariates at specific values.
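To make the two interpretive steps concrete, the sketch below converts a Cox regression coefficient into a hazard ratio and a percent change in hazard, and uses the logistic function to generate a recidivism estimate from a risk score and treatment status. It is illustrative only: the coefficients are hypothetical placeholders chosen for the example, not estimates from this study, and the function names are assumptions.

```python
import math


def hazard_ratio(b):
    """Hazard ratio for a Cox regression coefficient B, i.e., exp(B)."""
    return math.exp(b)


def percent_change_in_hazard(hr):
    """Percent change in hazard implied by a hazard ratio (negative = decrease)."""
    return (hr - 1.0) * 100.0


def logistic_probability(intercept, b_risk, b_treatment, lsi_score, treated):
    """Estimated probability of recidivism from a two-predictor logistic model."""
    logit = intercept + b_risk * lsi_score + b_treatment * (1 if treated else 0)
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # A hazard ratio of one third corresponds to a two-thirds decrease in hazard.
    hr = hazard_ratio(math.log(1 / 3))
    print(round(hr, 2), round(percent_change_in_hazard(hr), 1))

    # Hypothetical logistic coefficients (NOT study estimates).
    intercept, b_risk, b_treatment = -3.0, 0.08, -1.1
    for score in (15, 25, 35):
        untreated = logistic_probability(intercept, b_risk, b_treatment, score, False)
        treated = logistic_probability(intercept, b_risk, b_treatment, score, True)
        print(score, round(untreated, 3), round(treated, 3))
```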
Finally, we unpacked the Cox regression survival analyses by conducting risk × treatment group comparisons on rates of violent and general recidivism using Kaplan-Meier survival analysis. LSI-OR scores were dichotomized into high-risk groups by collapsing those who scored in the high or very high categories (i.e., 20–43 inclusive), while low risk groups were generated by collapsing the very low, low, and medium risk categories (0–19 inclusive). This resulted in four risk × treatment condition groups: high risk treated, high risk untreated, low risk treated, and low risk untreated. Pairwise comparisons were then conducted using log rank chi square to compare survival curve trajectories for violent and general recidivism among the four groups. To ascertain the equivalence of these comparison groups, particularly between high-risk SRG participants and high-risk comparison controls, a series of group comparisons were conducted via ANOVA (with post hoc Tukey beta comparisons) and chi square.
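The grouping and the survival estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: the cutoff of 20 follows the dichotomization given in the text, the Kaplan-Meier product-limit estimator is written out by hand, and the function names and follow-up data are hypothetical.

```python
def risk_treatment_group(lsi_total, treated):
    """Assign one of the four risk x treatment groups from an LSI-OR total."""
    if not 0 <= lsi_total <= 43:
        raise ValueError(f"LSI-OR total score out of range: {lsi_total}")
    risk = "high risk" if lsi_total >= 20 else "low risk"
    status = "treated" if treated else "untreated"
    return f"{risk} {status}"


def kaplan_meier(times, events):
    """Product-limit survival estimates at each distinct event time.

    times: follow-up time for each person (e.g., months at risk).
    events: True if the person recidivated at that time, False if censored.
    Returns a list of (time, estimated proportion not yet recidivated).
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e)
        n_leaving = sum(1 for ti in times if ti == t)
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Everyone observed at time t (event or censored) leaves the risk set.
        at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    print(risk_treatment_group(27, treated=False))
    # Hypothetical follow-up data for one group (not study data).
    months = [5, 8, 12, 12, 18]
    recidivated = [True, False, True, True, False]
    for t, s in kaplan_meier(months, recidivated):
        print(t, round(s, 3))
```

In practice, one curve would be estimated per group and the curves compared pairwise with a log rank test, which this sketch does not implement.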
Results
Treatment and Control Group Description and Comparisons
Table 1 presents group comparisons and descriptive statistics on study demographic, criminal history, index, clinical, risk, and outcome variables. The lefthand columns compare the SRG treatment group to the comparison control. Treated men tended to be older at release than the untreated controls, were serving longer sentences, and had a higher base rate of paraphilia diagnoses, while men in the untreated comparison condition had denser criminal histories (i.e., violent and general prior offenses, including sentencing occasions), higher base rates of personality disorder, and higher rates of violent and general recidivism. On average, both the treatment and comparison conditions scored high risk on the LSI-OR total score (i.e., 20–29) and above average risk on Static-99R (i.e., 4–5), although the comparison control still scored more than half a standard deviation higher on the LSI-OR, and approximately one third of a standard deviation higher on the Static-99R. There were no significant differences on most LSI-OR need domains, with the exception of higher scores by the comparison controls on criminal history, family/marital, and antisocial pattern.
The righthand columns disaggregate the SRG treatment condition into treatment completers and noncompleters; approximately one-third of individuals referred to the treatment program failed to complete it. Consistent with the literature, treatment noncompleters tended to score higher on risk-relevant variables: they were younger, were more frequently single, had denser general criminal histories, and had higher rates of violent and general recidivism. Both groups scored above average risk on the Static-99R and did not differ significantly; however, noncompleters scored high risk on average, and significantly higher than the treatment completers, on the LSI-OR and most of its risk-need domains (the completers, in turn, scored medium risk on the LSI-OR overall).
Treatment Subgroup and Comparison Controls on Dynamically Assessed Measures
Table 2 presents treatment subgroup and control group comparisons on the dynamically assessed measures that served as treatment targets for the program. Within-group comparisons are also reported. As would be anticipated, posttreatment measures were obtained considerably less frequently for men who failed to complete the group or who were not referred. First, treatment completers had small pre-post effects, most of which were nonsignificant, for the psychometric measures (RAPE, MOLEST, AQ, DERS), and the amount of change was often not appreciably different from that of either the noncompleters or the comparison controls. On the more explicitly therapeutically informed measures, however (specifically the WAI-SF-T and WAI-SF-C variants and the VRS-SO dynamic score), treatment completers evidenced upward of one half to a full standard deviation of pre-post change, and considerably more change than the noncompleters, with between-group differences of a similar magnitude. Of note, the mean VRS-SO change score of 3.7 (SD = 2.9) for the treatment completers was virtually identical to that reported in the normative sample (i.e., M = 3.7, SD = 2.5; Olver et al., 2018).
Self-Regulation Group Treatment Subgroup and Control Comparisons on Scores from Dynamically Assessed Study Measures
Note. Total N = 207, while remaining ns vary by condition and measure and are provided in the table above; d between = standardized mean difference between study groups on identified measure; d within = standardized mean pre-post difference on change scores on a given study measure within a respective treatment subgroup (completer or noncompleter) or comparison control. WAI-SF = Working Alliance Inventory-Short Form, c = Client rated, T = therapist rated; AQ = Buss Perry Aggression Questionnaire; DERS = Difficulties in Emotional Regulation Scale; HBI = Hypersexual Behavior Inventory; VRS-SO = Violence Risk Scale-Sexual Offense version.
*p ≤ .05.
Second, pre-post comparisons for the noncompleters and comparison controls, when available, were highly variable both in terms of direction and magnitude on the psychometric measures; although for the WAI and VRS-SO, change was either negligible or negative, which is unsurprising given the men either dropped out of the SRG group or, if not referred, often completed no additional services during the time interval. Finally, on baseline comparisons across groups, the magnitude of between group differences varied on the psychometric measures, although on the HBI (assessed at pretreatment only) treatment completers had more pronounced concerns in the area of hypersexuality than either group. Small effects were largely present on VRS-SO scores at baseline, with each subgroup having a mean VRS-SO score corresponding to average risk (Level III); at posttreatment, however, for the small subset of noncompleters and comparison controls with VRS-SO post ratings, both groups had mean scores that corresponded to above average risk (Level IVa).
Working Alliance and Treatment Related Change
Table 3 reports the results of intercorrelations of the WAI-SF client (WAI-SF-C) and therapist (WAI-SF-T) ratings with pre-post change on the psychometric and other therapeutically informed measures. First, client and therapist ratings of the alliance showed congruence, particularly at posttreatment (r = .45), and changes in the alliance were particularly strongly associated between the two sets of measures (r = .68). The results support the veracity of client and therapist ratings of the alliance in a correctional treatment sample. Moreover, pretreatment WAI-SF-T and WAI-SF-C scores were strongly associated with change, demonstrating that clients with more room for growth in the alliance registered greater change. Second, the psychometric measures of change had most associations (19/24 or 79%) in the expected direction with the WAI-SF measures, and any associations that were significant or about medium and higher tended to be with WAI-SF-T and WAI-SF-C post and change ratings; this was particularly notable for decreases in rape myth acceptance (RAPE scale) and improved emotional regulation (DERS). Finally, VRS-SO dynamic change scores had significant, and medium to large correlations, with both WAI-SF-T and WAI-SF-C ratings at posttreatment, as well as with improvements in the alliance; that is, positive changes in the alliance were associated with greater amounts of change in sexual violence risk reduction.
Bivariate Correlations: WAI-SF Client and Therapist Ratings Associations with Scores on Change Measures
Note. ns in parentheses; WAI-SF = Working Alliance Inventory-Short Form, c = Client rated, T = therapist rated; AQ = Buss Perry Aggression Questionnaire; DERS = Difficulties in Emotional Regulation Scale; VRS-SO = Violence Risk Scale-Sexual Offense version.
***p ≤ .001, **p ≤ .01, *p ≤ .05.
LSI-OR Associations with Violent and General Recidivism
In total, 109 men included in the current treatment outcome evaluation were released and followed up in the community. They had an average of 1,166.4 days (SD = 375.0) of community follow-up post-release, during which 4.6% (5/109) of men were charged or convicted for a new sexual offense, 23.9% (26/109) for any violent (including sexual) offense, and 44.0% (48/109) for any new offense. The longest fixed follow-up that could be obtained while retaining the greatest proportion of the sample was about 18 months, for which 108 individuals could be included and any new charge or conviction for an offense post-release could be registered. The 18-month fixed follow-up rate of sexual recidivism was 3.7% (4/108), violent recidivism was 20.4% (22/108), and general recidivism was 39.8% (43/108).
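The base rates above follow directly from the reported counts. As a simple arithmetic check, the sketch below recomputes each percentage from its numerator and denominator; the helper name is hypothetical, and the counts are those given in the text.

```python
def rate_percent(events: int, n: int) -> float:
    """Return events / n as a percentage rounded to one decimal place."""
    return round(100 * events / n, 1)


# Counts reported in the text: (recidivists, men followed).
full_follow_up = {"sexual": (5, 109), "violent": (26, 109), "general": (48, 109)}
fixed_18_month = {"sexual": (4, 108), "violent": (22, 108), "general": (43, 108)}

for label, counts in (("full", full_follow_up), ("18-month", fixed_18_month)):
    for outcome, (events, n) in counts.items():
        print(f"{label} follow-up, {outcome}: {rate_percent(events, n)}%")
```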
Because the continuous LSI-OR total score, a comprehensive risk-need measure routinely rated on the sample, was the primary covariate employed to control for baseline risk, Table 4 reports predictive accuracy findings for LSI-OR total and criminogenic need scores for fixed 18-month violent and general recidivism. As previously noted, because the small number of sexual recidivists would yield underpowered analyses and unstable, potentially misleading results, the present findings focus specifically on violent (which includes sexual) and general (i.e., any) recidivism. As seen in Table 4, LSI-OR total scores had large effects in the prediction of violent and general recidivism, with similarly large effects demonstrated by antisocial personality pattern and criminal history scores, medium to large effects for substance abuse, procriminal attitudes, and companions need scores, and small, frequently nonsignificant, effects for leisure/recreation and employment/education scores; family/marital scores had a nonsignificant small effect for future violence but a robust medium effect for general reoffending. Of note, Static-99R scores were available on a smaller proportion of released men in the treated sample (n = 74), and the measure demonstrated a small nonsignificant effect for violent recidivism (AUC = .57, ns, 95% CI = [0.41, 0.72]) and a medium significant effect for general recidivism (AUC = .65, p = .032, 95% CI = [0.53, 0.77]). Given that the LSI-OR predicted these outcomes, which are the primary focus of the present study, with stronger effects than the Static-99R, and that it was scored on all released men at baseline, these findings reinforced the decision to employ it as the primary covariate for the treatment outcome analyses to follow.
ROC Analyses: Predictive Accuracy of LSI-OR Need and Total Scores with Violent and General Recidivism (Fixed 18-Month Follow-up)
Note. N = 108.
***p ≤ .001, *p ≤ .05.
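An AUC from a ROC analysis can be read as the probability that a randomly selected recidivist scored higher on the measure than a randomly selected nonrecidivist, with ties counted as one half. The sketch below computes an AUC this way for illustration; the scores are hypothetical and are not study data, and the function name is an assumption of this example.

```python
def auc(recidivist_scores, nonrecidivist_scores):
    """AUC as the proportion of recidivist/nonrecidivist pairs in which the
    recidivist has the higher score (ties count as one half)."""
    wins = 0.0
    for r in recidivist_scores:
        for n in nonrecidivist_scores:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(recidivist_scores) * len(nonrecidivist_scores))


# Hypothetical risk scores, for illustration only.
recidivists = [25, 30, 18]
nonrecidivists = [12, 18, 22, 9]
print(auc(recidivists, nonrecidivists))  # 10.5 of 12 pairs -> 0.875
```

An AUC of .50 corresponds to chance-level discrimination, which is why a confidence interval spanning .50 indicates a nonsignificant effect.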
SRG Treatment Outcome Analyses: Cox and Logistic Regression
Table 5 reports the first set of treatment outcome analyses, entering LSI-OR total score followed by binary SRG treatment status (i.e., intent-to-treat vs. no SRG treatment), and examining their unique relations to outcome. Cox regression survival analysis was employed initially to examine the association of SRG treatment participation with violent and general recidivism over time, controlling for baseline risk. As seen in Table 5 (top panel), SRG treatment participation was significantly associated with decreased violent recidivism (e^B = 0.327, 95% CI = [0.143, 0.752]), specifically, about a two-thirds decrease (67.3%) in the hazard of future violence (including sexual violence) after controlling for baseline risk (LSI-OR score), which, in turn, also uniquely predicted increased violence. For general recidivism, risk and SRG treatment status effects were in the expected direction; although LSI-OR scores demonstrated an even stronger association with outcome, the SRG treatment effect fell short of significance at p = .061 (e^B = 0.578, 95% CI = [0.325, 1.026]) and represented a notable but smaller decrease (42.2%) in the hazard of general recidivism associated with SRG participation.
Cox and Logistic Regression Analyses: Associations between Self-Regulation Group (SRG) Treatment Participation and Violent and General Recidivism Controlling for LSI-OR Score
Note. N = 108, significant p-values in bold font.
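The percent decreases in hazard discussed for the Cox models are simple transformations of the hazard ratios. The following sketch, with hypothetical helper names, converts a hazard ratio to a percent change and checks whether its 95% confidence interval excludes 1.0, which is the usual criterion for significance at the .05 level. The input values are those reported in the text.

```python
def percent_change_in_hazard(hazard_ratio: float) -> float:
    """Convert a hazard ratio to a percent change (negative = decrease)."""
    return (hazard_ratio - 1.0) * 100.0


def ci_excludes_one(lower: float, upper: float) -> bool:
    """A 95% CI that excludes 1.0 indicates significance at the .05 level."""
    return lower > 1.0 or upper < 1.0


# Hazard ratios and 95% CIs for SRG treatment status, as reported in the text.
reported = {
    "violent": (0.327, 0.143, 0.752),
    "general": (0.578, 0.325, 1.026),
}

for outcome, (hr, lower, upper) in reported.items():
    change = percent_change_in_hazard(hr)
    print(f"{outcome}: {change:.1f}% change in hazard, "
          f"CI excludes 1.0: {ci_excludes_one(lower, upper)}")
```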
The logistic regression analyses were conducted utilizing the same covariates but employing the fixed 18-month follow-ups to permit modeling rates of recidivism reduction associated with SRG treatment participation if results warranted this. As seen in Table 5, although SRG treatment participation was not significantly associated with decreases in 18-month general recidivism after controlling for baseline risk, it was again significantly associated with decreased odds of future violence, at a magnitude similar to that observed in the Cox regression survival analyses (i.e., 66.9% decrease, e^B = 0.331, 95% CI = [0.118, 0.933]). These results permitted modeling 18-month violent recidivism rates employing the values in Table 5 for SRG treatment versus the comparison controls at all possible LSI-OR scores (Figure 1). As seen in this figure, a considerably shallower trajectory of violent recidivism associated with increasing LSI-OR scores is modeled for the SRG treatment group versus the comparison controls, with the gap between trajectories widening as baseline risk increases, per the risk principle (i.e., higher risk men demonstrate greater violence risk reduction).

Logistic Regression Modeled Rates of 18-month Violent Recidivism as a Function of LSI-OR Score and Treatment Group Participation
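Modeled rates of this kind are obtained by passing a linear combination of the covariates through the logistic function. The sketch below shows the general mechanics only. The treatment coefficient is the natural log of the odds ratio reported in the text (0.331), but the intercept and LSI-OR coefficient are hypothetical placeholders, because the Table 5 values are not reproduced here; the resulting rates are therefore illustrative and are not the study's estimates.

```python
import math

# Treatment coefficient: ln of the odds ratio reported in the text.
B_TREATMENT = math.log(0.331)
# HYPOTHETICAL placeholders, not values from Table 5.
INTERCEPT = -4.0
B_LSI_OR = 0.12


def modeled_rate(lsi_or_total: int, treated: bool) -> float:
    """Predicted probability of 18-month violent recidivism from the
    logistic function applied to the linear predictor."""
    logit = INTERCEPT + B_LSI_OR * lsi_or_total + B_TREATMENT * (1 if treated else 0)
    return 1.0 / (1.0 + math.exp(-logit))


# Modeled rates across the full LSI-OR range (0-43) for both conditions.
for score in range(0, 44, 10):
    untreated = modeled_rate(score, treated=False)
    treated = modeled_rate(score, treated=True)
    print(f"LSI-OR {score:2d}: untreated {untreated:.3f}, treated {treated:.3f}")
```

Because the treatment effect is constant on the log-odds scale, the absolute gap between the two curves on the probability scale depends on baseline risk, which is what produces diverging trajectories in a plot of this kind.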
Baseline Risk by SRG Treatment Status Direct Comparisons: Kaplan Meier Survival Analyses
The final set of analyses created LSI-OR risk × SRG treatment groups as illustrated in Table 6, with the observed trajectories of violent and general recidivism over time graphed in Figure 2. Creating these four risk × treatment groups provided a parsimonious means of illustrating the Cox regression findings by dichotomizing LSI-OR scores (i.e., high vs. low risk) and crossing them with SRG treatment status. As seen in Table 6, there were very few differences on risk-relevant study measures between the two high-risk groups (treatment vs. control) or between the two low-risk groups, suggesting this remained an effective control for risk. Although high-risk treated men had fewer total prior offenses and sentencing dates than high-risk untreated men, the two groups were virtually identical in their LSI-OR scores and need profiles, base rates of criminogenically relevant clinical diagnoses, VRS-SO, and Static-99R scores. The high-risk treated men also had significantly lower rates of 18-month violence than the high-risk untreated men, consistent with the Cox and logistic regression analyses.
LSI-OR Risk × SRG Treatment vs. Control Subgroup Comparisons on Key Historical, Clinical, Risk, and Outcome Variables
Note. VRS-SO = Violence Risk Scale-Sexual Offense version; LSI-OR = Level of Service: Ontario Revision. VRS-SO dynamic and total scores are based on the most recent assessment.
a = significantly lower than high-risk control group; b = significantly lower than high-risk treatment group.
***p ≤ .001, **p ≤ .01, *p ≤ .05.

Kaplan-Meier Survival Analysis: Trajectories of Violent and General Recidivism for LSI-OR Risk Level and Self-Regulation Group (SRG) Participation
The Kaplan-Meier survival analyses in Figure 2, in turn, illustrated that high-risk untreated men had significantly faster and higher rates of violent recidivism over time than all three remaining groups, including the high-risk SRG treatment group (log rank χ² = 8.96, p = .003); the high-risk treatment group, in turn, had significantly higher violent recidivism rates than the low-risk treated men, but not the low-risk untreated group. For the general recidivism outcome variable, the high-risk untreated group also had significantly higher and faster rates of any reoffending than all three other groups, although by a smaller margin with respect to the high-risk SRG treated group (log rank χ² = 4.60, p = .032); both high-risk groups, irrespective of treatment status, had higher rates of general recidivism than both low-risk groups.
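The p-values attached to the pairwise log rank statistics can be recovered from the chi-square values. The sketch below assumes one degree of freedom, which is the usual case for a log rank comparison of two groups; for df = 1 the upper-tail chi-square probability reduces to a complementary error function. The function name is hypothetical.

```python
import math


def chi_square_p_value_df1(chi_square: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    With df = 1, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2.0))


# Log rank chi-square values reported in the text (df = 1 is an assumption).
for outcome, chi_square in (("violent", 8.96), ("general", 4.60)):
    print(f"{outcome}: p = {chi_square_p_value_df1(chi_square):.3f}")
```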
Discussion
The present study was an evaluation of a CBT/RNR treatment program for individuals who present with an SMI and have committed a sexual offense. SRG has been described in more detail previously (see Marshall et al., 2011), but it is important to note that the program adheres to established best practices (i.e., RNR) in addition to emphasizing an SBA approach to treatment, both regarding the content of the program and the delivery of service. Several measures were included in the overall assessment battery to examine potential within-treatment change on pertinent needs addressed in treatment (e.g., antisocial attitudes) and a proportion of individuals were followed up in the community for an average of 3 years with a subset being examined for a fixed 18-month follow-up.
Treatment completers generally showed small pre-post effects for the psychometric measures, which were not appreciably different from those of the comparison group or of those who failed to complete treatment. However, treatment completers evidenced a significant and positive change on dynamically assessed risk to re-offend (VRS-SO). Notably, the mean positive change in sexual violence risk reduction was similar to what was reported in the normative sample (Olver et al., 2018) and was more pronounced among treatment completers than among noncompleters.
Positive treatment change was also evident on perceptions of working alliance. It is noteworthy that client and therapist ratings of working alliance were generally congruent, particularly as measured at posttreatment which contrasts with some other investigations that have shown client ratings to be more salient predictors of outcome (Horvath, 2000). Impressions of the alliance by both the clinician and the client tended to be associated with greater psychometric change, notably for decreases in rape myth acceptance, improved emotional regulation, and sexual violence risk reduction. Several meta-analyses have shown an effect between alliance and outcome (Martin et al., 2000) and that this effect can be even more important than specific techniques employed in treatment (Norcross & Hill, 2002).
With regard to outcome, after controlling for baseline risk (LSI-OR score), men who participated in SRG treatment, analyzed on an intent-to-treat basis, exhibited significantly lower rates of violent (including sexual) recidivism and nonsignificantly lower rates of general recidivism than did the no-treatment group over the 18-month fixed follow-up period. Overall, 3.7% re-offended sexually, 20.4% re-offended violently, and 39.8% re-offended generally. Participation in SRG treatment was associated with an approximate two-thirds decrease in the hazard of future violent, including sexual, recidivism.
Strengths and Limitations
There are several strengths to the present study. First, we examined a sample of individuals who are unique in the forensic and criminal justice systems. Indeed, individuals presented with an SMI but were held criminally culpable for their offenses and were not adjudicated as Not Criminally Responsible on account of Mental Disorder. The unique characteristics of this sample allow for the examination of correctional-based programs in an SMI population and add to the literature on the utility of providing correctional-based interventions to those with an SMI (Morgan et al., 2012; Parisi et al., 2022).
We attempted to address potential threats to validity with as little bias as possible. SRG is a manualized and well-established program delivered by a mental health professional who was trained by the program developers and was supervised by a licensed psychologist. Recidivism variables were collected by independent raters who were not directly involved in this study and were blind to treatment status. We also had a unique opportunity to analyze our results using a comparison group that could not attend treatment for purely administrative reasons (i.e., a short stay at this specific institution). We utilized an intent-to-treat design; attrition rates were relatively high, although on par with the institutional attrition rates of around 30% reported for sexual offense programs in the meta-analytic literature (Olver et al., 2011). Of note, attrition had several causes: in addition to dropping out of the program, it could reflect being transferred from the institution owing to parole, serious psychiatric concerns, or other behavioral problems.
With regard to possible limitations, our follow-up time was relatively short, with a low base rate of sexual recidivism; however, this is offset by the higher base rates of violent and general recidivism, which yielded sufficient power for our analyses. Moreover, although there were some missing data across study variables, the sample size across analyses, coupled with the magnitude of key effects, was sufficient to power the analyses that generated the current findings and conclusions. We also acknowledge that our sample of persons with SMI is unique and, despite being a notable strength in featuring an understudied population in this literature, it is also a weakness in terms of generalizability. Indeed, our sample was predominantly White men who received relatively short sentences. We also noted that individuals in both the treatment and comparison groups received some intervention, whether it was structured programming for other needs, medication to manage mental health symptoms, or psychoeducation. Although we cannot say definitively that SRG was the effective change agent in this sample, prior studies have shown that interventions targeting untreated SMI have generally failed to show improved correctional-based outcomes, at least when compared to programs targeting traditional criminogenic needs, such as SRG (Morgan et al., 2012).
Conclusion and Future Directions
The present study provides ongoing support for the treatment of sexual offending, particularly among those individuals with an SMI. Of note, our interventions were employed in a secure treatment setting, which is also promising, given past studies showing that interventions in such settings can have less impact when compared to community-based programs (Gannon et al., 2019).
The role of SMI as a causal factor in sexual and nonsexual aggression, and how best to treat it, remains a somewhat contentious issue. Most studies have shown SMI to be a non-criminogenic need (Bonta et al., 2014; Kingston et al., 2015; cf. Babchishin et al., 2025) but there are clearly circumstances in which mental health symptoms contribute to aggression. Nevertheless, SMI has invariably been considered an important responsivity factor while delivering evidence-based interventions. Although there has been a shift toward recognizing the importance of RNR among those with SMI (Skeem et al., 2015), adopting such principles has been relatively slow, particularly within forensic mental health settings (Bewley & Morgan, 2011; Morgan et al., 2012).
A foundational element of RNR is the administration of assessment instruments that can both determine risk to re-offend and identify criminogenic needs. There are many established instruments to assess risk to re-offend. The LSI tools (Andrews et al., 1995) include the central eight risk factors embedded within the General Personality and Cognitive Social Learning Model and are widely used to assess risk to re-offend. Several studies have shown that the LSI is both reliable and valid among individuals with an SMI (e.g., Olver & Kingston, 2019).
Interventions should also be heavily weighted toward addressing criminogenic needs as compared to non-criminogenic needs (Bonta & Andrews, 2007). Research has shown that individuals with an SMI present with similar risk-relevant needs when compared to those without SMI (Skeem et al., 2014). Programs that predominantly target mental health outcomes trend toward poorer results regarding reduced recidivism, whereas programs that target well-established criminogenic needs such as offense-supportive attitudes and pro-criminal associates tend to evidence better correctional based outcomes (Morgan et al., 2012).
From a responsivity perspective, programs should adhere to evidence-based modalities, such as cognitive-behavioral approaches, but remain flexible in the approach to service delivery such that factors that can impede treatment change, such as SMI, are addressed. Psychiatric rehabilitation encompasses several services and approaches (e.g., psychopharmacology, CBT) to promote independent living and symptom reduction. Finally, the importance of an SBA approach to treatment cannot be minimized. SBA incorporates therapist characteristics in the delivery of service, the associated therapeutic alliance, and how these are involved in the identification and enhancement of client strengths to both reduce risk to re-offend and live a better life.
Overall, programs that effectively adhere to RNR will have a better chance at achieving favorable mental health and correctional outcomes. Of note, some programs are integrating domains from both the psychopathological and GPCSL models noted earlier. There are structured manuals, for example, that effectively incorporate both needs with some promising results (e.g., Morgan et al., 2014). The STU approaches offender rehabilitation in a similar way via the delivery of correctional based programs that adhere to RNR but provide such service within a broader system that attends to SMI goals. Of course, additional evaluations that employ methodologically rigorous designs (i.e., RCTs) but also maintain the important elements of good treatment will be important to further advance our knowledge of how best to promote the lives of our clients and overall public safety.
