Abstract
The Correctional Service of Canada (CSC) uses the gender-neutral Custody Rating Scale (CRS) for initial security classification. Gender-informed scholars contend that gender-neutral assessments are problematic for use with justice-impacted women, as they exclude factors (e.g., victimization) deemed more relevant for women. Using an archival database with 1,555 federally sentenced women in Canada, we examined the extent to which gender-informed indicators could yield incremental predictive validity (predicting institutional misconduct) beyond the CRS. Specifically, gender-informed variables from these domains were tested: mental health, substance misuse, relationship dysfunction, personal/emotional difficulties, parental/family issues, and victimization. Results revealed that at least one gender-informed variable from each domain significantly predicted institutional misconducts. Composite gender-informed scales were created from the set of significant gender-informed predictors. Area under the curve (AUC) and hierarchical Cox regression analyses revealed that the composite gender-informed scales contributed incremental predictive validity above and beyond the CRS. Although the CRS was predictive, it can be improved by including gender-informed variables.
According to the Correctional Service of Canada (CSC), the governmental agency that administers sentences of 2 years or more, women represent 6% of federally sentenced individuals in Canada (CSC, 2019b). When individuals are brought into CSC custody, they must undergo initial security placement (CSC, 2021). The Custody Rating Scale (CRS; Solicitor General of Canada, 1987) is used to assist in making initial security classification decisions by designating federally sentenced men and women as minimum, medium, or maximum security. Although the CRS was developed with federally sentenced men (Auditor General of Canada, 2017), it is considered gender-neutral, meaning it is hypothesized to perform equally well in both genders in a nonbiased fashion. Gender-informed proponents, scholars who primarily study justice-impacted girls and women, argue that the assessment and classification of women in a correctional setting should be based on research that has used women-only samples (Van Voorhis, 2022). Gender-informed proponents further argue that, given that most assessment and classification research has focused on men, we cannot conclude that women have similar risk and need factors (Chesney-Lind & Pasko, 2013).
It should be noted that CSC uses a gender-informed assessment for the reclassification of federally sentenced women, the Security Reclassification Scale for Women (SRSW; Blanchette, 2005). Reclassification is a formal review that occurs after initial classification that must be completed within 2 years of an individual’s sentence for those initially classified as medium or maximum security; it is essential for community reintegration (CSC, 2018). It allows incarcerated individuals to move into lower security-level designations, which in turn accelerates community reintegration (Blanchette, 2005). However, CSC’s continued use of the gender-neutral CRS for initial classification decisions for women remains a concern (Senate Canada, 2019); the scale was not only developed using an all-male sample, but factors hypothesized to be more relevant for women were not tested during the original development and validation phase.
Using the gender-neutral CRS for initial classification of justice-impacted women may present additional concerns. CSC policy dictates that reclassification need only occur, at a minimum, every 2 years (CSC, 2018), potentially leaving women in an improper security placement for up to 2 years. This is concerning because prior research has found that institutional factors such as conditions of confinement, violence exposure in prison, and prison routine predict institutional misconduct (Dâmboeanu & Nieuwbeerta, 2016). Relatedly, Leigey (2019) found in a sample of 1,821 justice-impacted women that greater custody levels predicted engaging in any form of misconduct. Thus, prison environments can drive the occurrence of misconducts.
One important objective of security classification is to reduce institutional misconducts. When a person violates the rules, they may undergo a formal disciplinary process that may result in a charge (CSC, 2015). Although some discretion in determining a charge exists, CSC has policies in place to help ensure consistency in applying misconduct charges. Minor offenses include negative acts that violate institutional rules. In contrast, serious offenses include committing, attempting, or inciting acts considered serious security breaches such as engaging in violence, repetitively breaking the rules, or causing harm to others (CSC, 2015). One avenue to assess the validity of a classification tool is to examine its ability to predict institutional misconducts. Thus, the main objective of the study is to explore whether risk factors hypothesized to be more relevant for incarcerated women than their male counterparts can incrementally enhance the ability of the CRS to predict prison misconducts among federally sentenced women in Canada.
It has been well established that justice-impacted women are different from their male counterparts in important ways. Not only do they engage in less serious forms of crime, but they also evidence higher levels of need in terms of substance use, interpersonal victimization and trauma histories, internalizing mental health needs, and parental responsibilities (Chesney-Lind & Pasko, 2013). Consequently, gender-informed proponents argue that assessment, classification, and intervention approaches for justice-impacted women must not only be trauma-responsive but must also address the unique needs of women without inadvertently elevating risk and classification placements (Wright et al., 2012). Gender-informed scholars further advocate for integrated models that recognize the importance of both gender-informed and gender-neutral variables to enhance classification efforts (Derkzen et al., 2019).
The main goal of classification is to determine level of risk while designating the least restrictive security level that allows for safety of correctional staff (Farr, 2000). In accordance with Canada’s Corrections and Conditional Release Act (1992), classification must reflect institutional adjustment (IA), escape risk, and the risk to the public if an escape occurs. Ensuring the CRS appropriately classifies women in Canadian federal institutions is essential as improper classification creates difficulties for justice-impacted women and for the institution. Thus, appropriate classification can reduce institutional misconducts, decrease escape risk, and ensure proper designation of resources (Blanchette & Taylor, 2007). Importantly, classification methods must not overclassify women—that is, placing women in higher levels of security than are in fact warranted (Van Voorhis, 2012).
Concern also exists for classifying Indigenous justice-impacted women (Struthers Montford & Hannah-Moffat, 2021). In Canada, Indigenous peoples and their descendants are the original peoples of North America; there are three groups: First Nations, Inuit, and Métis (Government of Canada, 2022). Indigenous people have their own social history that includes the destructive and intergenerational impacts of colonialism, including residential schools, loss of culture, racism/stereotypes, family disruption due to forced removal of Indigenous children, and loss of identity (Cesaroni et al., 2018; Gutierrez & Wanamaker, 2022). It is argued that due to their distinct cultural and social backgrounds, Indigenous women are disadvantaged by the CRS; thus, Indigenous women are at greater risk of overclassification (Struthers Montford & Hannah-Moffat, 2021).
Currently, there is mixed evidence regarding the predictive accuracy of the CRS with federally sentenced women in Canada. Importantly, the evidence varies as a function of Indigeneity. Blanchette et al. (2002) conducted a CRS study on 334 federally sentenced women and examined nonviolent (causing a disturbance, substance use, escape, etc.) and violent (committing one or more acts of murder, hostage taking, or assault) misconduct. They found the predictive accuracy of the CRS was mixed. The IA scale of the CRS was a strong predictor of both violent and nonviolent misconducts for Indigenous and non-Indigenous women. However, the Security Risk (SR) scale of the CRS only predicted both types of misconduct among the non-Indigenous women.
Barnum and Gobeil (2012) explored the predictive validity of the CRS in a sample of 628 federally incarcerated Canadian women (n = 157 were Indigenous) over a 3-month follow-up. They examined institutional charges: Minor charges included substance use, property damage, and disciplinary problems, whereas major charges included murder, sexual assault, and possession of contraband. In terms of predicting major misconducts, the CRS had good predictive validity for Indigenous women and modest predictive validity for non-Indigenous women. In contrast, the CRS evidenced poor predictive validity for minor misconducts for Indigenous women and modest predictive validity for non-Indigenous women.
Rubenfeld (2014) investigated whether reweighting the CRS using the Burgess method (Nuffield, 1982) would improve the predictive validity of the CRS in a sample of 1,083 federally incarcerated women; the sample was randomly divided into a construction and validation sample. The Burgess method (Nuffield, 1982) employs a dichotomous weighting system. For example, values of one are assigned to item responses associated with increases in recidivism, whereas values of zero are assigned to item responses associated with decreases in recidivism; individual item scores (0 or 1) are then summed to arrive at a total score; in this case, higher scores would be indicative of greater likelihood of recidivism. The authors examined minor (e.g., disobeying rules, disrespecting staff, or having unauthorized item) and serious institutional charges (e.g., fights/assaults). The original CRS consistently predicted institutional charges for non-Indigenous women in both the construction and validation samples. However, the corresponding results for Indigenous women were mixed. Specifically, while the original CRS predicted institutional charges in the Indigenous construction sample, it did not predict institutional charges in the Indigenous validation sample.
Similarly, while the reweighted CRS performed consistently better than the original CRS for non-Indigenous women in both the construction and validation samples, the results were again mixed for Indigenous women. Although the reweighted CRS performed slightly worse for Indigenous women in the construction sample, it performed better for Indigenous women in the validation sample. However, the area under the curve (AUC) for the CRS did not exceed a .60 threshold in either sample. As a next step, Rubenfeld (2014) suggested creating a gender-informed scale developed specifically for women that explores the added benefit of including gender-informed predictors to the CRS.
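The Burgess scoring rule described above can be illustrated with a minimal sketch. It is not Rubenfeld's (2014) item set or weighting scheme: the item names and the `risk_responses` mapping below are hypothetical and serve only to show how dichotomous item scores are assigned and summed.

```python
def burgess_score(responses, risk_responses):
    """Return the Burgess total: one point per item whose response is flagged
    as associated with an increase in the outcome, zero otherwise."""
    total = 0
    for item, risky_values in risk_responses.items():
        total += 1 if responses.get(item) in risky_values else 0
    return total


# Hypothetical items and risk-increasing responses (illustration only).
risk_responses = {
    "prior_escape": {"yes"},
    "age_band": {"under_25"},
    "prior_institutional_incident": {"yes"},
}

person = {
    "prior_escape": "no",
    "age_band": "under_25",
    "prior_institutional_incident": "yes",
}

score = burgess_score(person, risk_responses)  # two of three items flagged
```

Because every item contributes either 0 or 1, the total is simply a count of risk-increasing responses, which is what makes the method easy to reweight and to validate in a hold-out sample.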
Gender-Informed Predictors of Institutional Misconducts Among Justice-Impacted Women
To date, three published studies have explored adding gender-informed variables to gender-neutral classification assessments for justice-impacted women. Collectively, these studies identified the following gender-informed variables as promising predictors of institutional misconducts: poor self-efficacy, traumatic stress, current depression/anxiety, current psychosis, anger/hostility, low relationship support, history of child abuse, alcohol/drug problems, borderline features, and poor family support (Davidson et al., 2016; Van Voorhis et al., 2010; Wright et al., 2007).
Wright et al. (2007) examined serious prison misconducts (which excluded minor rule violations) in a sample of 272 institutionalized women. They found combining a gender-responsive needs scale and a gender-neutral needs scale evidenced the strongest relationship with misconducts (correlations ranged from r = .28 to r = .38). Similarly, Van Voorhis et al. (2010) examined serious misconducts (e.g., assault, escapes, contraband, fighting, and sexual behavior) that were committed 6 to 12 months after intake in a sample of 628 justice-impacted women. They found a gender-responsive scale evidenced greater AUC values (AUCs ranged from .62 to .70) than a gender-neutral scale (AUCs ranged from .58 to .68) in the context of predicting prison misconducts.
Finally, Davidson et al. (2016) explored adding gender-responsive factors from the Personality Assessment Inventory (PAI; Morey, 2007) to a gender-neutral baseline model that included static criminal history variables (e.g., age at first arrest, violent arrests) in a sample of 2,181 female inmates. The PAI assesses psychopathology and assists in classification of justice-impacted individuals. They examined general disciplinary infractions and assaultive infractions (assault on staff or inmates). Paranoia, antisociality, traumatic stress, drug problems, and mental health contributed incremental predictive validity to the baseline model. Collectively, the findings indicate that adding gender-informed factors to gender-neutral classification assessments can yield incremental predictive accuracy. Thus, both gender-neutral and gender-informed variables are imperative for justice-impacted women.
Despite these studies supporting the inclusion of gender-informed items to gender-neutral classification assessments, limitations should be noted. Van Voorhis et al. (2010) and Wright et al. (2007) relied on partial correlation analysis and correlations to determine incremental predictive validity. Furthermore, due to data limitations, Davidson et al. (2016) were unable to fully explore the gender-responsiveness of the PAI. In addition, there is no study that has examined the inclusion of gender-informed variables to the CRS in Canada.
Proposed Gender-Informed Predictor Domains
To date, an agreed-upon central set of global predictor domains has not been established for women. Based on past theorizing and research (and the data available to us), we have organized our study into six global areas: victimization, mental health challenges, substance misuse, relational dysfunction, parental/family challenges, and personal/emotional challenges. These domains align somewhat, albeit imperfectly, with past studies that have previously explored what factors predict institutional misconducts among women (Davidson et al., 2016; Van Voorhis et al., 2010; Wright et al., 2007).
Victimization
Daly (1992) identified five pathways into crime for women: Street Woman (experienced trauma at home and runs away to escape, thereby placed at increased risk for criminalized street survival strategies such as prostitution, drug trade and robbery), Harmed and Harming (trauma leads to having addictions and mental health issues), Battered (an abusive intimate relationship results in criminalized self-defense), Drug-connected (there is no experience of trauma, but sells or is addicted to drugs from family connections), and Economically Motivated (no victimization but commits crime for self-involved purposes such as greed). Thus, the role of childhood and adult victimization figured prominently in Daly’s pathways theory (PT). Subsequent scholars have underscored the role of childhood trauma and adult victimization in propelling women into the justice system (Holtfreter et al., 2022; ten Bensel et al., 2019; Van Voorhis, 2022). Notably, traumatic stress, abuse history, and childhood abuse have predicted institutional misconduct in justice-impacted women (Davidson et al., 2016).
Mental Health Challenges
Using a PT approach, Gehring (2018) found that childhood abuse, mental health issues (such as diagnosis of a mental health disorder), and current substance use lead to crime in justice-impacted women but not men. Similarly, in comparison with justice-impacted women not diagnosed with a mental health disorder, justice-impacted women with a mental health disorder are 1.75 to 2.2 times more likely to perpetrate prison misconducts (Houser et al., 2012; Houser & Welsh, 2014). Other researchers have found that mental health challenges are a strong predictor of violent misconducts (Reidy & Sorensen, 2018).
Substance Misuse
Substance misuse plays a significant role in Daly’s (1992) Harmed and Harming and Drug-connected pathways. Salisbury et al. (2018) conducted a mixed-methods study in which a sample of 246 incarcerated women completed a survey and 12 randomly selected participants then completed semi-structured interviews. They found that the majority of their sample displayed Daly’s (1992) Drug-connected and Economically Motivated pathways. However, the Drug-connected group had primarily been involved in drug trafficking with male partners or male strangers (Salisbury et al., 2018). Finally, Davidson et al. (2016) reported that substance misuse predicted institutional misconducts.
Relationship Dysfunction
Relational cultural theory (RCT) is a women-centered theory positing that women gravitate toward and experience growth in relationships characterized by mutual, authentic, and empathic connections (Miller, 1976). Intimate partner antisociality has been shown to increase criminal behaviors for females, but not for males (Burgess-Proctor et al., 2016; Van Voorhis, 2022). DeHart (2018) found, in a sample of 60 incarcerated women in maximum security, a distinct group characterized by defensive or retaliatory violence directed at their intimate partner. Relatedly, there is evidence that relationship dysfunction predicts institutional misconducts in justice-impacted women (Van Voorhis et al., 2010; Wright et al., 2007).
Parental/Family Challenges
Roughly 75% of incarcerated women have children (Barrett et al., 2010). Notably, separation from family, particularly children, has been described as one of the most difficult experiences for incarcerated women (Collica, 2010). The impact of familial visits on incarcerated women has generally yielded positive outcomes. For example, prosocial family contact and number of family visits have predicted reductions in officially recorded institutional misconducts among justice-impacted women (Celinska & Sung, 2014). Similarly, Wilton and Stewart (2015) found that women who had at least one prison visit during incarceration were less likely to be returned to custody than women with no prison visits. This is particularly impressive given that the effect of prison visits was observed while controlling for seven other variables that were also linked to return to custody. In contrast, Benning and Lahm (2016) reported that prison visits with children among incarcerated mothers were correlated with prison rule infractions. However, this study should be interpreted cautiously. It was based on cross-sectional survey data that relied on women’s retrospective recall regarding not only prison visits but also if they had been written up for rule violations. More importantly, the nature of the study design precludes the ability to determine whether or not rule infractions preceded or followed prison visits with children.
Family challenges are a contributor to women’s pathways into crime. Smith (2017) explored pathways into crime in 1,209 justice-impacted women. She found that one prevalent pathway included childhood victimization, family drug abuse, being in foster care, and parental incarceration. Chaotic family upbringing, ongoing dysfunctional relationships with caregivers, and exposure to antisocial friends influence women’s pathways into crime (Belknap et al., 2016; Maghsoudi et al., 2018).
Personal/Emotional Challenges
Anger (Van Voorhis et al., 2010), poor self-efficacy (Van Voorhis et al., 2010), personal distress (Bonta & Andrews, 2017), and low self-esteem (Wright et al., 2007) encompass this multifaceted domain. In interviewing 37 jail staff members, Belknap et al. (2016) found staff perceptions of women’s pathways to crime included low self-esteem, self-confidence, and self-worth that contribute to increased risk of poor choices in partners and friends, thus increasing risk of offending. Barnum and Gobeil (2012) found the personal/emotional domain of the Dynamic Factor Identification and Analysis–Revised (DFIA-R; CSC, 2019a) was moderately correlated with the CRS. Relatedly, Van Voorhis et al. (2010) found poor self-efficacy predicted institutional misconducts among justice-impacted women.
The Current Study
The study explored whether (a) hypothesized gender-informed risk factors would predict institutional misconducts among federally sentenced women in Canada and (b) whether the addition of gender-informed risk factors would yield incremental predictive validity to the CRS. The aim of the study was to create one consolidated gender-informed tool. Based on past research, it was hypothesized that gender-informed risk factors would predict misconducts and they would add incremental predictive validity above and beyond the CRS. To test the hypotheses, we used a series of univariate Cox regression survival analyses, hierarchical Cox regression survival analyses, AUC statistics, and formal AUC comparisons.
Method
Participants
The initial sampling frame comprised all federally sentenced women in Canada (thus, only those serving 2 years or more as per Canadian law) admitted to custody on a new warrant of committal between September 28, 2009, and January 8, 2017. Only women with a valid and complete DFIA-R (CSC, 2019a) assessment were included (N = 1,555). The average age at admission was 34.73 (SD = 10.72, range = 17–78) years. The sample included 577 (37.1%) Indigenous women, 971 (62.4%) non-Indigenous women, and seven (0.5%) women whose racial heritage was unknown. The majority (83.3%, n = 1,296) were serving their first federal sentence, while 16.7% (n = 259) were serving a subsequent federal sentence. Over half of the women had been convicted of a violent index offense(s) (56.1%); the remaining 43.9% had been convicted of a nonviolent index offense(s). The average sentence length (excluding life sentences) was 3.32 (SD = 1.82) years.
Measures
All data used for this study were gathered from the Offender Management System (OMS). The OMS is used by CSC to manage information for all federally sentenced individuals for the duration of their sentences. OMS data retrieved for this study included information from the Dynamic Factor Identification and Analysis–Revised (DFIA-R; CSC, 2019a), the Computerized Mental Health Intake Screening System (CoMHISS; CSC, 2007), the Women’s Computerized Assessment of Substance Abuse (W-CASA; MacDonald et al., 2015), the CRS (Solicitor General of Canada, 1987), and institutional misconducts.
The Dynamic Factor Identification and Analysis–Revised
The DFIA-R (CSC, 2019a) was originally implemented in September 2009 and comprises 100 dichotomous items. These 100 items are organized within seven need domains: Employment/Education (12 dichotomous items), Attitude (11 dichotomous items), Family/Marital (16 dichotomous items), Substance Abuse (18 dichotomous items), Community Functioning (seven dichotomous items), Associates (11 dichotomous items), and Personal/Emotional (25 dichotomous items). Parole officers rate each item dichotomously, as either present (scored 1), or absent (scored 0). Parole officers use the individually scored dichotomous items within each domain to assign an overall need domain rating; domain ratings range from asset, no need, some need, to considerable need for improvement. Stewart et al. (2017) validated the DFIA-R in a sample of 4,798 men and 1,368 women. The DFIA-R need ratings predicted general revocations as well as revocations with a new offense (Harrell’s C ranged from .54 to .61 for revocations, and .55 to .60 for revocations with an offense). Guided by gender-informed theory and past research, we extracted 37 hypothesized gender-informed items from the following domains to be included in the initial analyses: Substance Abuse, Marital/Family, Associates, and Personal/Emotional (see Table 1 in the “Results” section for a detailed list of the gender-informed DFIA-R indicators).
Cox Regression Survival Analyses: Gender-Informed Variables Predicting Any Misconduct
Note. CI = confidence interval; DHS = Depression, Hopelessness and Suicide Screening Form; CoMHISS = Computerized Mental Health Intake Screening System; DFIA-R = Dynamic Factor Identification and Analysis–Revised.
a Wald’s statistic. b Predictors from CoMHISS; see the Method section. c Severity of Distress = Global Severity Index subscale of the Brief Symptom Inventory. d Predictors from DFIA-R; see the Method section. e LOG Alcohol Dependence Scale = log-transformed Alcohol Dependence Scale.
*p < .05. **p < .01. ***p < .001.
CoMHISS
Indicators from the Computerized Mental Health Intake Screening System (CoMHISS; CSC, 2007) were also included to capture hypothesized female-specific predictors. CoMHISS is offered to all federally sentenced individuals upon admission. It is a computerized psychometric test battery that screens for mental health concerns that need to be addressed by mental health professionals, and it is considered a valid screener for mental health concerns among federally sentenced women (Archambault et al., 2010). It comprises four measures (132 items in total) that collectively assess historical and current psychological functioning, current suicide risk, depression, psychological symptoms, and cognitive impairment. The measures used specifically for this study are as follows:
(a) The Depression, Hopelessness and Suicide Screening Form (DHS; Mills & Kroner, 2004); the DHS is comprised of 39 true/false items (scored 0 = false, 1 = true) that measure depression, hopelessness, current suicidal ideation, cognitive suicide, and historical suicide ideation/attempts; higher scores indicate a greater likelihood of the presence of the measured constructs.
(b) The Brief Symptom Inventory (BSI; Derogatis, 1993); the BSI comprises 53 items that assess nine symptom categories: Somatization, Obsession-Compulsion, Interpersonal Sensitivity, Depression, Anxiety, Hostility, Phobic Anxiety, Paranoid Ideation, and Psychoticism; it also includes three global indices of distress. These indices are the Global Severity Index (GSI), Positive Symptom Distress Index, and Positive Symptom Total. The items within the nine symptom categories are rated on a Likert-type scale from 0 (not at all) to 4 (extremely). The study used the GSI of the BSI. The GSI is calculated by summing the item ratings across the nine symptom categories (plus four items of the BSI that are not included in the symptom categories but are clinically important) and dividing by the total number of items to which the individual responded. The GSI is considered the most sensitive indicator of the individual’s distress level and combines information about the number of symptoms and the intensity of the distress.
W-CASA
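The Global Severity Index calculation described above can be sketched in a few lines. This is an illustrative reading of the scoring rule as stated in the text, not CoMHISS's actual scoring code; the function name and the convention of marking unanswered items with `None` are assumptions.

```python
def global_severity_index(ratings):
    """Mean rating across answered BSI items.

    ratings: iterable of item ratings (0 to 4); None marks an unanswered item.
    Returns None when no item was answered.
    """
    answered = [r for r in ratings if r is not None]
    if not answered:
        return None
    if any(r not in (0, 1, 2, 3, 4) for r in answered):
        raise ValueError("BSI items are rated on a 0 to 4 scale")
    # Sum of ratings divided by the number of items responded to.
    return sum(answered) / len(answered)


# Hypothetical respondent: 53 items, 3 left unanswered.
ratings = [2] * 10 + [0] * 40 + [None] * 3
gsi = global_severity_index(ratings)  # 20 points over 50 answered items
```

Dividing by the number of answered items rather than by 53 keeps the index on the 0 to 4 metric even when a respondent skips items.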
The W-CASA (MacDonald et al., 2015) is a 261-item computerized assessment that measures substance abuse issues in justice-impacted women on admission into federal custody. It includes various subscales that capture the history of alcohol/drug abuse, current drug/alcohol use, polysubstance abuse, severity of dependence, and prior programming.
The W-CASA includes two standardized measures that were used in conjunction with the DFIA-R substance abuse indicators to represent the gender-informed substance abuse domain in this study: the Alcohol Dependence Scale (ADS; Skinner & Horn, 1984) and the Drug Abuse Screening Test (DAST; Skinner, 1982). The ADS is a 25-item assessment tool that determines the extent of physiological dependence on alcohol. Dichotomous items are scored 0, 1; three-choice items are scored 0, 1, 2; and four-choice items are scored 0, 1, 2, 3. Total scores can range from 0 to 47, with higher scores indicating greater alcohol dependence. The ADS has acceptable psychometric properties, with a reported Cronbach’s alpha of .91 (Skinner & Horn, 1984).
The DAST is a 20-item assessment tool that assesses use of drugs other than alcohol. Items are scored dichotomously: a “Yes” response is given a score of “1,” except for Items 4, 5, and 7, where a “No” response is given a score of “1.” The total score is computed by summing all the items, and higher scores are indicative of increased drug problems. The total score is interpreted as follows: a score of 0 corresponds to “None,” scores from 1 to 5 correspond to “Low,” 6 to 10 correspond to “Intermediate” (and likely meet Diagnostic and Statistical Manual of Mental Disorders [4th ed.; DSM-IV; American Psychiatric Association, 1994] criteria of substance abuse disorder), 11 to 15 correspond to “Substantial,” and 16 to 20 correspond to “Severe.” It has acceptable psychometric properties, with a reported Cronbach’s alpha of .74 (Skinner, 1982).
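The DAST scoring and interpretation rules stated above can be expressed as a short sketch. It follows the rules exactly as given in this section (including the reverse-keyed items named here); the function names and the dictionary input format are assumptions made for illustration.

```python
# Items for which a "No" response scores 1, as stated in the text.
REVERSE_KEYED = {4, 5, 7}


def dast_total(answers):
    """answers: dict mapping item number (1 to 20) to True ("Yes") or False ("No")."""
    if set(answers) != set(range(1, 21)):
        raise ValueError("expected answers for items 1 through 20")
    total = 0
    for item, yes in answers.items():
        if item in REVERSE_KEYED:
            total += 0 if yes else 1
        else:
            total += 1 if yes else 0
    return total


def dast_band(total):
    """Map a DAST total score to the interpretive band described in the text."""
    if not 0 <= total <= 20:
        raise ValueError("DAST totals range from 0 to 20")
    if total == 0:
        return "None"
    if total <= 5:
        return "Low"
    if total <= 10:
        return "Intermediate"
    if total <= 15:
        return "Substantial"
    return "Severe"


# Hypothetical respondent who answers "No" to every item.
all_no = {item: False for item in range(1, 21)}
band = dast_band(dast_total(all_no))  # only the three reverse-keyed items score
```

Note that a respondent answering "No" throughout still receives points on the reverse-keyed items, which is why the keying direction matters when totals are computed from raw responses.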
Custody Rating Scale
The CRS (Solicitor General of Canada, 1987) is used for initial security classification of federally sentenced individuals to assist in determining their security level. The CRS comprises two subscales: the five-item IA scale (scores range from 0 to 186 points), and the seven-item Security Risk (SR) scale (scores range from 17 to 190 points). The scale’s recommended security classification is based on the total score of the subscales, and for each institutional incident that occurs item scores will increase. As scores increase for each subscale, higher security classifications are warranted. The cut-off scores for the CRS designation of minimum, medium, and maximum security are as follows: (a) minimum security: 0 to 85 on the IA scale, and 0 to 63 on the SR scale; (b) medium security: between 86 and 94 on the IA scale, and between 0 and 133 on the SR scale; or between 0 and 85 on the IA scale and 64 and 133 on the SR scale; and (c) maximum security: 95 or greater on the IA scale or 134 or greater on the SR scale. As reviewed in the introduction, the CRS has demonstrated mixed validity (Barnum & Gobeil, 2012; Blanchette et al., 2002; Rubenfeld, 2014).
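The cut-off rules above reduce to a small decision function. The sketch below encodes only the thresholds stated in this section and assumes integer subscale scores; the function name is hypothetical, and it is not CSC's implementation.

```python
def crs_security_level(ia, sr):
    """Return the CRS-recommended security level from the Institutional
    Adjustment (IA) and Security Risk (SR) subscale scores."""
    if ia < 0 or sr < 0:
        raise ValueError("subscale scores cannot be negative")
    # Maximum: either subscale at or above its upper cut-off.
    if ia >= 95 or sr >= 134:
        return "maximum"
    # Minimum: both subscales at or below their lower cut-offs.
    if ia <= 85 and sr <= 63:
        return "minimum"
    # Everything else: IA 86-94 with SR 0-133, or IA 0-85 with SR 64-133.
    return "medium"


levels = [crs_security_level(ia, sr) for ia, sr in [(40, 50), (90, 50), (40, 100), (95, 20)]]
```

Checking the maximum rule first and the minimum rule second leaves exactly the two medium-security combinations listed in the text, so no separate medium test is needed.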
Institutional Misconducts
Minor and serious institutional charges perpetrated by the women were extracted from OMS. Minor misconducts included events such as failure to attend institutional count, improper dress, and noncompliance with directions. Major misconducts included events such as instigating a riot, assault, refusing urinalysis, or uttering threats (Kroner & Mills, 2001). In the event of multiple misconducts, the date of first occurrence was used. Misconduct data were collected between September 28, 2009, and January 8, 2017. The mean follow-up time was 579.18 (SD = 402.53; range = 1–3,735) days. More women had incurred minor misconducts (53.1%; n = 812, M = 2.73, SD = 6.01) than major misconducts (30.5%; n = 466, M = 0.99, SD = 3.27). In sum, 58.8% of the women incurred “any” misconduct (includes both minor and major; n = 899, M = 3.72, SD = 8.04) during the follow-up period. In our study, we examined two outcome variables: any misconduct (minor or major combined) and serious misconducts (i.e., major misconducts only).
Procedure and Data Analytic Approach
Based on prior theory and research, we extracted 46 potential gender-informed predictors from three measures (DFIA-R, CoMHISS, and W-CASA). To create the mental health domain, items from CoMHISS were used; to create the substance use domain, items from the DFIA-R Substance Abuse Domain and W-CASA were used; to create the relationship dysfunction domain, items from the DFIA-R Marital/Family domain and marital status were used; to create the parental/family issues domain, items from DFIA-R Marital/Family and Associates domain were used; to create the victimization domain, items from the DFIA-R Marital/Family were used; and to create the personal/emotional domain, items from the DFIA-R Personal/Emotional were used (see Table 1).
Next, gender-informed scales were created to test whether they would add incremental predictive validity to the CRS. This was accomplished as follows. First, the gender-informed predictors were tested using a series of survival analyses to determine which gender-informed predictors would emerge as predictors of serious and any misconduct charges. For example, all gender-informed personal/emotional domain indicators were entered simultaneously into a Cox regression survival analysis predicting time to any misconduct, and then again, but using serious misconduct as the outcome; this was repeated for each of the remaining five global predictor domains. Significant indicators were retained and summed to create two distinct gender-informed scales—one predicting any misconduct, and another predicting serious misconducts. Then, AUC comparisons and hierarchical survival analyses were used to test incremental predictive validity. AUC comparisons were computed in MedCalc Version 18.9 using the method of Hanley and McNeil (1983), and hierarchical Cox regression analyses were computed using SPSS version 26. This approach is similar to prior research that has examined adding gender-informed predictors to gender-neutral classification assessments (e.g., Davidson et al., 2016; Wright et al., 2007).
Results
All variables were examined for missing data and violations of statistical assumptions (e.g., heteroscedasticity, normality, linearity, outliers) prior to analyses. Missing data ranged from 0% to 28.0%. The following variables had more than 10% missing data: the CoMHISS measures Global Severity Index of the Brief Symptom Inventory (27.9%); the DHS Depression (28.0%); DHS Hopelessness (27.8%); DHS Cognitive Suicide (27.7%); DHS Current Ideation (27.6%); DHS Historical Suicide (27.7%); and the DFIA-R item “gives up easily when challenged” (10.8%). The items with over 10% of missing data were nonetheless retained due to theoretical importance, the absence of evidence that the pattern of missingness was related to other variables in the data set, and the large size of the sample. As the missing value analysis suggested that the data were most likely missing at random (MAR), pairwise deletion and listwise deletion were used to deal with missing data (Tabachnick & Fidell, 2013). In addition, the ADS was log-transformed (Tabachnick & Fidell, 2013) due to severe skewness. No dichotomous items evidenced splits greater than 90/10; hence, all were retained. There were 27 multivariate outliers removed. This data cleaning resulted in a final sample size of 1,528.
Which Gender-Informed Variables Predict Institutional Misconduct?
To examine Research Question 1 (which gender-informed variables predict institutional misconduct), the 46 potential predictors from the DFIA-R, CoMHISS, and W-CASA were assigned to one of six conceptually appropriate gender-informed domains (outlined in Table 1). Next, to address different lengths of time for incarceration, survival analyses were run for each of the six gender-informed domains for both any and serious misconducts. Thus, a total of 12 survival analyses were conducted to determine which items from each of the six gender-informed domains would emerge as significant predictors of any and serious misconducts (see Tables 1 and 2).
Cox Regression Survival Analyses: Gender-Informed Variables Predicting Serious Misconduct
Note. CI = confidence interval; DHS = Depression, Hopelessness, and Suicide Screening Form; CoMHISS = Computerized Mental Health Intake Screening System; DFIA-R = Dynamic Factor Identification and Analysis–Revised.
a Wald’s statistic. b Predictors from CoMHISS, please see “Methods” section. c Severity of Distress = Global Severity Index subscale of the Brief Symptom Inventory. d Predictors from DFIA-R, please see “Methods” section. e LOG Alcohol Dependence Scale = Log-transformed Alcohol Dependence Scale.
*p < .05. **p < .01. ***p < .001.
For any misconduct, of the 46 predictors, 16 gender-informed variables emerged as significant predictors (i.e., 16 items were individually significant at p < .05 and the hazard ratio confidence intervals did not contain one). Notably, each gender-informed domain had at least one item that significantly predicted any institutional misconduct. This included one predictor in the mental health domain (Historical Suicide), two predictors in the substance abuse domain, three predictors in the relationship dysfunction domain, three predictors in the parental/family issues domain, two predictors in the victimization domain, and five predictors in the personal/emotional domain. Unexpectedly, the item “Assertiveness Skills are Limited” was negatively predictive of any misconducts (i.e., being more assertive predicted misconducts).
For serious misconduct, of the 46 predictors there were only 13 gender-informed variables that emerged as significant predictors (i.e., 13 items were individually significant at p < .05 and the hazard ratio confidence intervals did not contain one). Again, each gender-informed domain had at least one item that significantly predicted serious misconduct. This included one predictor in the mental health domain (Depression), three predictors in the substance abuse domain, two predictors in the relationship dysfunction domain, two predictors in the parental/family issues domain, one predictor in the victimization domain, and four predictors in the personal/emotional domain. Notably, once again, Assertiveness Skills are Limited was negatively predictive of the outcome (i.e., being more assertive predicted serious misconducts).
Can the Gender-Informed Scales Yield Incremental Predictive Validity to the CRS?
To test whether the addition of gender-informed variables can yield incremental predictive validity to the CRS, four gender-informed scales were created. First, a scale labeled Gender-Informed Measure Continuous Version for Any Misconducts (GIM-ContANY) was created by summing the significant predictors of any misconduct. However, two significant items were excluded: (a) Historical Suicide was excluded for ethical reasons¹ and (b) Assertiveness Skills are Limited was excluded due to the demonstrated counter-intuitive relationship with outcome. The remaining 14 gender-informed dichotomous predictors were summed to create the GIM-ContANY scale. Scores ranged from 0.00 (no gender-informed factors present) to 14.00 (all gender-informed factors present; M = 6.84, SD = 3.59).
For serious misconducts, three of 13 significant items were dropped prior to creating the Gender-Informed Measure Continuous Version for Serious Misconducts (GIM-ContSERIOUS) scale. Depression was excluded from the gender-informed scale for ethical reasons.² Both Assertiveness Skills are Limited and Has Combined the Use of Alcohol and Drugs were dropped given the observed counter-intuitive relationship with outcome. Scores on the resultant 10-item GIM-ContSERIOUS scale ranged from 0.00 to 10.00 (M = 4.47, SD = 2.64).
As both the GIM-ContANY and GIM-ContSERIOUS scales are continuous, it was determined that it would be advantageous to create a trichotomized version of each scale to allow for similar comparisons with the CRS, which designates incarcerated women into minimum, medium, and maximum security levels. The CRS designated 38.0% of the total sample (N = 1,528) to minimum security, 46.9% to medium security, and 8.9% to maximum security. The GIM-ContANY and GIM-ContSERIOUS were trichotomized to mirror, as closely as possible, the CRS security-level distribution.
Each of the GIM scales was trichotomized into “low,” “medium,” and “high” groups. Justice-impacted women designated to the high group had the greatest number of gender-informed factors, and were subsequently more likely to perpetrate misconducts. For the GIM-ContANY scale, the high group comprised 11.9% (N = 182) of the total sample. The medium group comprised 43.5% (N = 665) of the total sample, and the low group comprised 44.6% (N = 681) of the total sample. For the GIM-ContSERIOUS scale, the high group consisted of 14.9% (N = 228) of the total sample, the medium group comprised 46.9% (N = 716) of the total sample, and the low group comprised 38.2% (N = 584) of the total sample.
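The two construction steps described above, summing dichotomous indicators into a continuous score and then collapsing that score into three groups, can be sketched as follows. The cut points used here (5 and 11) are hypothetical placeholders, since the text does not report the values actually chosen; the function names are likewise illustrative.

```python
from typing import Dict, Sequence


def gim_score(indicators: Sequence[int]) -> int:
    """Sum dichotomous (0/1) indicators into a continuous scale score."""
    if any(v not in (0, 1) for v in indicators):
        raise ValueError("indicators must be coded 0 or 1")
    return sum(indicators)


def trichotomize(score: int, low_max: int, medium_max: int) -> str:
    """Assign 'low', 'medium', or 'high' from two cut points.

    low_max and medium_max are hypothetical cut points, not values
    reported in the study.
    """
    if score <= low_max:
        return "low"
    if score <= medium_max:
        return "medium"
    return "high"


def group_shares(scores: Sequence[int], low_max: int, medium_max: int) -> Dict[str, float]:
    """Proportion of cases per group, for comparison with a target distribution."""
    counts = {"low": 0, "medium": 0, "high": 0}
    for s in scores:
        counts[trichotomize(s, low_max, medium_max)] += 1
    n = len(scores)
    return {group: count / n for group, count in counts.items()}


if __name__ == "__main__":
    sample_scores = [1, 6, 12, 3]  # made-up scores on a 0-14 scale
    print(group_shares(sample_scores, low_max=5, medium_max=11))
```

In practice one would try candidate cut points and keep the pair whose `group_shares` output comes closest to the target security-level distribution.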
Receiver operating characteristic (ROC) curve analyses were used to examine potential differences in the predictive validity of the CRS and gender-informed scales; these analyses were conducted for the entire sample, Indigenous women, and non-Indigenous women. As Table 3 illustrates, the gender-informed scales (AUCs ranged from .63 to .73) outperformed the CRS (AUCs ranged from .63 to .67) in all analyses except those for Indigenous women, for whom the gender-informed AUCs were either lower than or essentially equivalent to those of the CRS. Formal statistical AUC comparisons confirmed the Table 3 results. As Table 4 illustrates, the ability of the gender-informed scales to outperform the CRS was driven by the non-Indigenous women-specific results. For Indigenous women, the gender-informed scales did not outperform the CRS in the prediction of any or serious misconduct.
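For readers unfamiliar with how two AUCs from the same cases are compared, the sketch below shows the general form of the Hanley and McNeil approach: a rank-based AUC, its standard error (Hanley & McNeil, 1982), and a z statistic that adjusts for the correlation between the two areas (Hanley & McNeil, 1983). It is not the MedCalc routine used in the study. The correlation `r` is taken as an input here because, in the original method, it is read from a published table; all numeric values in the example are made up.

```python
import math
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the proportion of (positive, negative) pairs in which the
    positive case scores higher; ties count one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def hanley_mcneil_se(a: float, n_pos: int, n_neg: int) -> float:
    """Standard error of an AUC (Hanley & McNeil, 1982)."""
    q1 = a / (2.0 - a)
    q2 = 2.0 * a * a / (1.0 + a)
    variance = (
        a * (1.0 - a)
        + (n_pos - 1) * (q1 - a * a)
        + (n_neg - 1) * (q2 - a * a)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def compare_correlated_aucs(a1: float, a2: float, se1: float, se2: float, r: float) -> float:
    """z statistic for the difference between two AUCs from the same cases.

    r is the correlation between the two areas; it is supplied by the
    caller (an assumption of this sketch) rather than looked up.
    """
    return (a1 - a2) / math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)


if __name__ == "__main__":
    # Hypothetical inputs, not results from the study.
    se_a = hanley_mcneil_se(0.70, n_pos=400, n_neg=600)
    se_b = hanley_mcneil_se(0.65, n_pos=400, n_neg=600)
    print(compare_correlated_aucs(0.70, 0.65, se_a, se_b, r=0.30))
```

A larger positive correlation between the two scales shrinks the denominator, so the same AUC difference yields a larger z than it would for two independent samples.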
Predicting Any and Serious Institutional Misconducts as a Function of Indigeneity
Note. Gender-informed Measure Continuous for Serious Misconducts (GIM-ContSERIOUS) has the remaining 10 gender-informed predictors of serious misconducts. Gender-informed Measure Trichotomized for Serious Misconducts (GIM-TriSERIOUS) is the trichotomized GIM-ContSERIOUS scale. The Gender-informed Measure Continuous for Any Misconducts (GIM-ContANY) contains the remaining 14 gender-informed predictors of any misconduct. Gender-informed Measure Trichotomized for Any Misconducts (GIM-TriANY) is the trichotomized GIM-ContANY scale. We reran the analyses to include the mental health indicators (historical suicide for any misconduct and depression for serious misconduct) and found there was no change in the results. See Table S6 of Supplemental Material (available in the online version of this article). AUC = area under the curve; CI = confidence interval; GIM-ContSER = GIM-ContSERIOUS; GIM-TriSER = GIM-TriSERIOUS.
Formal Comparison of ROC Curves With the Total Sample, Indigenous Women, and Non-Indigenous Women
Note. The Gender-informed Measure Continuous for Any Misconducts (GIM-ContANY) contains the remaining 14 gender-informed predictors of any misconduct. Gender-informed Measure Trichotomized for Any Misconducts (GIM-TriANY) is the trichotomized GIM-ContANY scale. We reran the analyses to include the mental health indicators (historical suicide for any misconduct and depression for serious misconduct) and found there was no change in the results. See Table S7 of Supplemental Material (available in the online version of this article). AUC = area under the curve; CI = confidence interval; GIM-ContSER = GIM-ContSERIOUS; GIM-TriSER = GIM-TriSERIOUS; ROC = receiver operating characteristic; CRS = Custody Rating Scale.
*p < .05. **p < .01. ***p < .001.
Hierarchical Cox regression analyses were also conducted to determine whether the gender-informed scales could add incremental predictive validity to the CRS. As Table 5 illustrates, the trichotomized and continuous versions of the gender-informed scales do contribute incrementally to the prediction of survival time. As expected, the CRS remained a significant predictor in all models. Thus, both the CRS and the gender-informed scales are important predictors of any misconducts and serious misconducts.³
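One common way to quantify what a second block adds in a hierarchical Cox regression is the change in −2 log-likelihood between blocks, referred to a chi-square distribution. The sketch below assumes that statistic (the text does not state which test was reported), assumes a single added term (df = 1), and uses hypothetical −2LL values; `block_change_test_df1` is an illustrative name, not an SPSS function.

```python
import math
from typing import Tuple


def block_change_test_df1(neg2ll_block1: float, neg2ll_block2: float) -> Tuple[float, float]:
    """Chi-square change and p-value when one term is added to a model.

    neg2ll_block1: -2 log-likelihood with the first block only (e.g., CRS).
    neg2ll_block2: -2 log-likelihood after adding one further predictor.
    Valid only for df = 1, where P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    chi_sq = neg2ll_block1 - neg2ll_block2
    if chi_sq < 0:
        raise ValueError("adding a term should not increase -2LL")
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, p_value


if __name__ == "__main__":
    # Hypothetical -2LL values for illustration only.
    chi_sq, p = block_change_test_df1(1000.0, 990.0)
    print(f"chi-square change = {chi_sq:.2f}, p = {p:.4f}")
```

A trichotomized scale entered as a categorical predictor would contribute more than one degree of freedom, in which case a general chi-square survival function would be needed instead of the df = 1 shortcut used here.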
Hierarchical Cox Regression Analyses for the Custody Rating Scale and Gender-Informed Scales for Any and Serious Misconducts
Note. Gender-informed Measure Continuous for Serious Misconducts (GIM-ContSERIOUS) has the remaining 10 gender-informed predictors of serious misconducts. Gender-informed Measure Trichotomized for Serious Misconducts (GIM-TriSERIOUS) is the trichotomized GIM-ContSERIOUS scale. The Gender-informed Measure Continuous for Any Misconducts (GIM-ContANY) contains the remaining 14 gender-informed predictors of any misconduct. Gender-informed Measure Trichotomized for Any Misconducts (GIM-TriANY) is the trichotomized GIM-ContANY scale. We reran the analyses to include the mental health indicators (historical suicide for any misconduct and depression for serious misconduct) and found there was no change in the results. See Table S8 of Supplemental Material (available in the online version of this article). CI = confidence interval; CRS = Custody Rating Scale.
*p < .05. **p < .01. ***p < .001.
Discussion
The CRS is used to assist in initial security classification for federally sentenced persons in Canada. However, although the CRS was created and validated with an all-male sample (Auditor General of Canada, 2017), it has been considered gender-neutral; that is, it is assumed to work equivalently for both men and women. Although initial security classification for federally sentenced women in Canada is informed using the CRS, reclassification for federally sentenced women is informed using the gender-informed SRSW. Furthermore, the predictive accuracy of the CRS, with the exclusion of major misconducts for Indigenous women, has been mixed (Barnum & Gobeil, 2012; Blanchette et al., 2002; Rubenfeld, 2014). As prior literature has provided support that gender-informed factors contribute to the incremental predictive validity of gender-neutral assessments (Davidson et al., 2016; Van Voorhis et al., 2010; Wright et al., 2007), there is a need for more gender-informed work for assessment measures involving justice-impacted women. Thus, investigating whether gender-informed factors can yield incremental predictive validity to the CRS in a sample of justice-impacted women is warranted.
The study explored whether gender-informed factors could predict serious and any institutional misconduct for federally incarcerated women in Canada. Furthermore, the study examined whether the addition of gender-informed factors could yield incremental predictive validity over and above the CRS. Overall, the study illustrated that theoretically selected gender-informed items can predict both general and serious misconducts among federally sentenced women. More importantly, the summation of gender-informed risk factors can also yield incremental predictive validity above and beyond the CRS. These findings are aligned with prior gender-informed research (e.g., Davidson et al., 2016; Van Voorhis et al., 2010).
A subset of items from each gender-informed domain (relationship dysfunction, mental health, substance use, parental/family issues, victimization, and personal/emotional functioning) emerged as significant predictors of both serious and any institutional misconduct. For any misconduct, 16 significant predictors emerged: one mental health, two substance use, three relationship dysfunction, three parental/family issues, two victimization, and five personal/emotional. Serious misconducts had fewer significant predictors, with a total of 13: one mental health, three substance use, two relationship dysfunction, two parental/family issues, one victimization, and four personal/emotional.
These results are consistent with prior research, as mental health (Reidy & Sorensen, 2018), parental/family issues (Van Voorhis et al., 2010), relationship dysfunction (Wright et al., 2007), substance use (Davidson et al., 2016), and victimization (Davidson et al., 2016) have all been found to be significant predictors of institutional misconduct for justice-impacted women. Specifically for the mental health domain, depression and historical suicide were found to be significant predictors in this study, which is consistent with prior studies that have found depression to be a significant predictor of general and violent misconducts (Davidson et al., 2016; Van Voorhis et al., 2010). Also, this study found going on drug-taking binges, becoming violent when drinking or using drugs, combining the use of alcohol and drugs, and combining the use of different drugs were significant predictors. This is again consistent with Davidson et al. (2016) who reported that alcohol and drug problems predicted general and violent misconducts.
Our study also found that various indices of relationship dysfunction (i.e., an inability to maintain an enduring intimate relationship, problematic intimate relationships, and perpetrating spousal violence) were all significant predictors of institutional misconducts. These findings are consistent with prior research, as high relationship dysfunction (Van Voorhis et al., 2010; Wright et al., 2007) and low relationship support (Wright et al., 2007) have been found to be positively associated with misconducts. For parental/family issues, negative relationships with parental figures during childhood, family members criminally active in childhood, and having limited prosocial support from friends were significant predictors of misconduct. This finding is also consistent with prior research, as parental stress (Wright et al., 2007) has been found to be positively associated with misconducts, while having family support has been found to be negatively associated with misconducts (Van Voorhis et al., 2010; Wright et al., 2007). For victimization, limited attachment to family during childhood was a significant predictor of misconduct. This is also consistent with prior research, as childhood abuse (Van Voorhis et al., 2010; Wright et al., 2007) has been found to be significantly associated with misconducts.
For personal/emotional, limited assertiveness, difficulty solving interpersonal problems, feeling intense anger, suppressing anger, and having low frustration tolerance were found to be significant predictors of misconduct. Overall, these findings are consistent with prior research, as anger/hostility (Van Voorhis et al., 2010) and aggression (Davidson et al., 2016) have been found to predict misconducts. Davidson et al. (2016) also found that the Violence Potential Index of the PAI, which combines factors reflective of anger, impulsivity, hostile suspiciousness, and mood swings, was a significant predictor of misconducts for justice-impacted women. Experiencing intense anger and hostility, as well as interpersonal difficulties that impede the ability to mitigate anger (e.g., low frustration tolerance), are gender-informed factors that contribute to the likelihood of perpetrating misconducts. However, the finding that limited assertiveness skills were associated with a lesser likelihood of engaging in misconducts (including serious misconducts) was contradictory to our expectations. One potential explanation may be that justice-impacted women who are assertive may be more likely to receive institutional charges. While speculative, women who are confident in defending their own self-interests may be more likely to be perceived as combative, argumentative, and defiant, which in turn may provoke prison authorities or other incarcerated individuals, resulting in more misconducts, both general and more serious.
The CRS evidenced acceptable predictive validity for both any and serious misconducts for the total sample and Indigenous women, albeit the predictive indices were appreciably lower for non-Indigenous women. It should also be noted that the survival analyses underscored the importance of including both the CRS and gender supplements such as our gender-informed scales. This is an important finding given that the study was composed of all 1,555 women admitted between September 28, 2009, and January 8, 2017.
However, the study also illustrated that for the total population the CRS can be improved. The gender-informed scales evidenced greater predictive accuracy than the CRS for both serious and any misconducts. With the exception of the GIM-TriSERIOUS scale for serious misconducts, formal ROC comparisons demonstrated the gender-informed scales were significantly more predictive of misconducts than the CRS (for the total sample, and non-Indigenous women only); this is in conjunction with hierarchical Cox regressions that demonstrated the gender-informed scales significantly added incrementally to the CRS for both serious and any misconducts. Notably, in a previous study, a reweighted CRS also demonstrated greater predictive accuracy than the CRS for justice-impacted women (Rubenfeld, 2014). Collectively, our study and Rubenfeld’s suggest that the predictive accuracy of the CRS can be improved for use with women. But validation on a new sample is required.
Our study was focused on determining whether the CRS could be improved for all federally sentenced women. However, the results disaggregated by Indigeneity suggest that the benefits of a gender-informed CRS accrued largely to non-Indigenous women. Researchers and policy makers alike must carefully consider the benefits and challenges associated with exploring Indigenous-specific scale development for incarcerated women in Canada. Potential benefits would include the opportunity to develop tools from the ground up for Indigenous women populations that could include Indigenous developers. Potential challenges would include (for example) creating new tools that inadvertently place Indigenous women in even higher security levels given that Indigenous women (relative to their non-Indigenous counterparts) evidence higher needs, lower reintegration potential, and are already overrepresented in segregation and maximum security within CSC (Gutierrez & Wanamaker, 2022). Alternatively, ensuring that CSC decision makers are fully trained in the application of Gladue principles while administering measures such as the CRS is an equally promising avenue to explore. Gladue principles in Canada originated from the Supreme Court of Canada (SCC) ruling of R v. Gladue (1999); the SCC ruled that courts must consider the social histories of Indigenous people in the context of sentencing.
The findings from this study not only reveal that gender-informed factors are predictive of institutional misconduct but also highlight the substantial needs experienced by justice-impacted women. This raises the issue of how to appropriately address gender-informed needs such as abuse, trauma, substance use, and mental health concerns in a carceral setting. While there are ethical considerations with including needs and victimization indicators as predictors that could yield higher security recommendations, there are also ethical considerations with excluding known predictors, which could jeopardize inmate, staff, and public safety. As such, we would propose the inclusion of important gender-informed predictors in security classification models for women, contingent upon the expedient provision of gender-informed interventions to address those needs. Importantly, the correctional environment for high-needs women should be similarly gender- and trauma-informed to mitigate any potential behavioral impacts of inappropriate/overly austere custody.
In conjunction with potentially augmenting the CRS with gender-informed predictors, CSC should also consider creating an initial security classification built from the ground up for women. The CRS has not changed in 35 years. Arguably, this in and of itself provides sufficient cause, at a minimum, to re-tool the CRS or, preferably, to build an entirely new CRS tool from the ground up for women. Furthermore, the profile of federally incarcerated women (and men) has changed significantly since the CRS was originally created in 1987. Most notable is the vast increase in the proportion of incarcerated individuals who are Indigenous; 40.7% of incarcerated women are now Indigenous (Public Safety Canada, 2022). Given the substantial overrepresentation of Indigenous peoples within the CSC, as stated, Indigenous experts should play a strong role in any re-development of the CRS.
A reflection on Webster and Doob’s (2004) critique of the Blanchette et al. (2002) CRS women-focused study is also warranted. Webster and Doob vehemently argued against the use of the CRS, particularly with Indigenous women. However, counter to Webster and Doob’s conclusions, the CRS performed quite well for Indigenous women in the original Blanchette et al. study. Extrapolating effect sizes (from Table 9 in Blanchette et al.), it can be inferred that the CRS predicted misconducts very well for Indigenous women (estimated AUC = .78) but not as well for non-Indigenous women (estimated AUC = .59). Whether one chooses to interpret the Blanchette et al. study in a positive or negative light is perhaps inconsequential given sample size limitations. Blanchette et al. used a relatively small sample size: 61 Indigenous women, 230 non-Indigenous women.
Our study is not without limitations. First, archival data were used, which limited the ability to match gender-informed predictors with prior studies (e.g., Wright et al., 2007). Prior studies included poverty and self-efficacy as gender-informed domains; however, our study did not include these variables. It would also be advantageous to conduct a similar study with males. Perhaps the CRS could be improved for males by considering more dynamic variables. Derkzen and colleagues (2019) investigated developing and validating a gender-informed risk assessment tool in a sample of federally sentenced women. The authors had a development group, a validation group, and a male comparison group. The gender-informed risk assessment consisted of nine components that were associated with returns to federal custody: Criminal History, Drug Misuse and Unstable Accommodation, Antisocial Personality, Employment, Alcohol Misuse, Negative Childhood Experiences, Violence and Weapons, Support and Resources, and Incidents and Charges. They found that not only did the gender-informed risk assessment tool demonstrate incremental predictive validity over established CSC risk assessments, but the gender-informed assessment had comparable predictive accuracy with men. It should be noted that the Derkzen et al. (2019) gender-informed risk assessment included few variables hypothesized to be more relevant for women. Thus, it is not surprising that it demonstrated predictive accuracy equally well for both men and women.
Finally, our results should be interpreted with caution. We used the same sample to both construct and validate the gender-informed scales. Consequently, the observed relationships between the gender-informed scales and misconducts were most likely inflated. A true test of the incremental validity of our gender-informed scales over and above the CRS is needed using an entirely new validation sample. It is possible that a subsequent validation study would not favor our gender-informed scales over the CRS, but this remains an empirical question.
The study used a women-centric approach. In doing so, it allowed us to test factors that otherwise would not have been included in the original pool of CRS tested items (Solicitor General of Canada, 1987). Importantly, however, a gender-informed approach does not imply that gender-neutral approaches are irrelevant for women. Rather, a gender-informed approach is holistic; it considers gender differences as well as similarities. In alignment with Derkzen et al. (2019), our study supports using a gender-informed approach that incorporates both gender-responsive and gender-neutral perspectives to enhance the validity of classification procedures for women in federal custody.
Supplemental Material
Supplemental material, sj-docx-1-cjb-10.1177_00938548231202799 for The Classification of Federally Sentenced Women in Canada: Addition of Gender-Informed Variables to the Custody Rating Scale Contributes Incremental Predictive Validity by Theresia E. M. Bedard, Kelley Blanchette and Shelley Brown in Criminal Justice and Behavior
Footnotes
Authors’ Note:
We have no known conflict of interest to disclose. We acknowledge and thank Correctional Service of Canada for providing the data used in this study. The views presented in this study are our own and are not those of Correctional Service of Canada.