Abstract
The National Institutes of Health (NIH) data harmonization project on existing measures (www.phenx.org) has recommended the Global Appraisal of Individual Needs (GAIN)–Short Screener (GSS) as one of the most reliable, valid, efficient, and inexpensive general behavioral health screeners to quickly identify people with internalizing and externalizing mental health disorders, substance use disorders, and crime/violence problems. The present study examined how well the four GSS screeners and their sum predict future arrest or incarceration among individuals entering treatment for a substance use disorder. Using a cross-validation design, a diverse sample of 6,815 youth with substance use disorders was split into a development sample and a validation sample. Overall, the GSS's crime and violence screener (CVScr) and substance disorder screener (SDScr) were the two best predictors of arrest/incarceration within the 12 months following treatment intake. Additionally, these screeners could be used to categorize individuals into three groups (low risk, moderate risk, high risk), and this simplified classification had good predictive validity (Area Under the Curve = 0.601). In sum, the GSS's predictive validity was similar to that of other instruments that have been developed to predict risk for recidivism; however, the GSS takes only a fraction of the time to collect (ie, approximately 2–3 minutes for just these two screeners).
Introduction
In the United States, there exists a high degree of overlap between substance use and criminal behaviors such that among youth in treatment for substance use disorders (SUDs), 50% are also involved in the juvenile justice system (JJS), 1 and among youth in community supervision, 50% require treatment for SUDs.2,3 The JJS is the largest source of referral to adolescent SUD treatment, yet less than one-third of the youth in the JJS with an SUD receive any treatment, with even fewer receiving the kind of evidence-based treatments associated with better outcomes.4–7 In most of the SUD treatment system for adolescents, there is currently no widespread practice of assessing for risk of recidivism or using this assessment of risk to assign youth to interventions that might help reduce their risk.
Calls for an integrated public safety (eg, supervision, judicial hearings, consequences) and public health (eg, community treatment) approach for those having co-occurring drug and criminal activity problems have been made, 8 with the realization that the limitations of pursuing either approach singularly translate into serious consequences for this population, such as increased JJS involvement, higher rates of the human immunodeficiency virus (HIV) and other sexually transmitted infections, victimization, mental health problems, family and environmental problems, health problems, and death.3,9–15 On the positive side, it has been found that when integrated treatment approaches that emphasize both public safety and public health are provided in a tailored fashion to individuals based on their level of risk (an idea known as the Risk Principle), they lead to even more effective outcomes. 16 Per the Risk Principle, 17 those at high risk are provided more intensive treatment than those at low risk.
Born out of a need to assess risk, interest in developing reliable and valid assessment instruments to identify individuals’ levels of risk has flourished. The justification for developing these instruments is further based on a considerable body of research that has demonstrated the superiority of actuarial methods (ie, those that rely on standardized assessment tools) over “professional judgments” in limiting biases and disparate outcomes18–22 and perhaps a recognition that assessing risk and tailoring treatments accordingly is better than the null behavior of not conducting risk assessment at all. Within the current behavioral health landscape, given the high rate of co-occurrence of SUD and crime, multiple expert groups have recommended assessing and treating a wide range of behavioral health needs to reduce both substance use and recidivism15,21,23–29; however, a recognition that the need for services often exceeds available resources and that limited resources must be used efficiently has led to a focus on developing simpler screening tools with the goal of helping make efficient decisions regarding the need for further assessment and/or treatment placement.
The National Institutes of Health (NIH) data harmonization project on existing measures (www.phenx.org) has recommended the Global Appraisal of Individual Needs (GAIN)–Short Screener (GSS) 30 as one of the most reliable, valid, efficient, and inexpensive general behavioral health screeners to quickly identify people with internalizing disorders, externalizing disorders, substance disorders, and crime/violence problems. Using a Rasch measurement model, each five-item GSS screener has been shown to have over 90% sensitivity, 90% specificity, and 90% area under the curve (AUC) relative to the 16–40 item versions in the full GAIN. 30 Moreover, the GSS 30 requires minimal staff training and no certification or licensure, takes only three to five minutes to administer, and costs only $100 for five years of unlimited use. The prior study by Dennis and colleagues, 30 however, did not examine the ability of the GSS to predict subsequent arrest or incarceration.
The most recent meta-analysis of juvenile justice risk assessments 31 identified 28 risk assessment instruments, with two of the most commonly studied assessment instruments being the Youth Level of Service/Case Management Inventory (YLS/CMI)32–34 and Psychopathy Checklist–Youth Version (PCL-YV). 35 The weighted average AUC of all 28 assessment instruments was 0.640 and ranged from 0.532 to 0.780, indicating that no current risk instrument offers excellent prediction of future behavior. Additionally, even within the same measure, the AUC varied significantly (eg, the 30- to 40-minute YLS/CMI varied from 0.571 to 0.750 across 11 studies and the 90- to 120-minute PCL-YV varied from 0.644 to 0.780 across three studies). In addition to their time to administer, these measures take considerable effort to train and conduct quality assurance, and in some cases require 60-minute or longer collateral interviews (eg, for PCL-YV). Thus, there appears to be a need for brief screeners that can assist in determining whether or not to invest additional resources for more detailed assessment, more intensive monitoring, or more secure level of service placements.
Although there are “screening versions” of these existing instruments available, the Level of Service Inventory–Revised: Screening Version (LSI-R:SV) 36 still requires 10–15 minutes and the Psychopathy Checklist: Screening Version (PCL:SV) 37 requires approximately 45 minutes (plus a 35-minute collateral interview). Additionally, each of these screeners has an administration fee per assessment and lacks published studies of its predictive validity. As such, they may not meet the pressing need for reliable, valid, efficient, and low-cost screeners to predict risk for future offending.
In summary, selection of a screening and/or assessment instrument is an important decision that requires carefully balancing several important factors (eg, reliability, validity, staff training/administration time, and cost per administration). While previous research has demonstrated the GSS excels in these areas relative to the longer GAIN at intake, 30 it did not look at the GSS’ predictive validity in terms of subsequent criminal activity. Thus, the primary goals of this paper were to: 1) examine the extent to which each of the GSS screeners and their sum predict arrest or incarceration within the 12 months following SUD treatment intake, 2) examine which of the GSS screeners or their sum is the most predictive, and 3) develop and validate a relatively simple classification system for the GSS that would facilitate its implementation as a practical risk measure that could assist in efficiently making appropriate decisions about the level of additional assessment and/or treatment that individuals need.
Method
Data Source
Data for the current study came from the 2011 GAIN Summary Analytic dataset, which is currently one of the largest SUD treatment datasets, with longitudinal outcomes (intake and 3-, 6-, 9-, and 12-month follow-up) on 29,782 individuals with substance use disorders between 2002 and 2011. Over two-thirds of the assessments were conducted by independent investigators, funded by a wide range of organizations (eg, Substance Abuse and Mental Health Services Administration's Center for Substance Abuse Treatment, National Institute on Alcohol Abuse and Alcoholism, National Institute on Drug Abuse, Robert Wood Johnson Foundation), and conducted in a variety of adolescent settings and levels of care. All data were collected as part of general clinical practice or specific research studies under their respective voluntary consent procedures. Data were subsequently deidentified and made available for secondary analysis under the supervision of Chestnut Health Systems Institutional Review Board.
Sample
Using the 2011 GAIN Summary Analytic dataset as a starting point, we only included cases that met the following inclusion criteria, which maximize the generalizability and validity of our final results. These inclusion criteria were: (a) adolescents between the ages of 12 and 17 years of age (n = 22,976), (b) who were from SUD treatment sites having a 70% or greater 12-month follow-up interview rate (calculated as the number who completed a 12-month follow-up assessment interview divided by the number due for a 12-month follow-up assessment interview) (n = 9,261), and (c) who either had a 12-month assessment or who indicated they had been arrested/incarcerated in the past 90 days at their 3-, 6-, or 9-month assessment interview (n = 6,925). Additionally, we further subset to those having valid predictors and dependent variables necessary for our analysis in order to obtain a final sample of 6,815 youth from 55 treatment sites across the United States. A 70% follow-up rate falls within an acceptable range of follow-up rates (65–80%) that have been demonstrated with empirical support as not compromising the credibility of study findings. 38 Figure 1 illustrates the subset criteria and participant flow.

Figure 1. Participant flow diagram.
Measures
For this study, we utilized data from GSS items that are embedded in the full-length GAIN-I assessment. Validity for the GAIN instrument has been documented in many prior studies using multiple methods (eg, urine tests, collateral reports, Rasch measurement models, time-line follow-back).39–49 Additionally, as noted in the introduction, validity for the GSS has been established through sensitivity, specificity, and area under the curve analyses. 30 Finally, the GSS measures recency of having problems, with response choices including: past-month (score of 3), 2–12 months ago (score of 2), over a year ago (score of 1), and never (score of 0). To obtain a score for each screener, the number of items endorsed within the time period of interest is counted. In this case, we were interested in past-year scores on the GSS and thus counted responses of 2s and 3s to obtain a final score between 0 and 5 for each five-item screener.
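The past-year scoring rule described above can be sketched in a few lines. This is a minimal illustration only, not the GAIN scoring software: the function name and the example responses are hypothetical, and the only rule taken from the text is that responses coded 2 or 3 are counted across the five items of a screener.

```python
from typing import Sequence


def past_year_screener_score(recency_codes: Sequence[int]) -> int:
    """Count the items on one five-item screener endorsed in the past year.

    Codes follow the response scale given in the text:
    3 = past month, 2 = 2-12 months ago, 1 = over a year ago, 0 = never.
    """
    if len(recency_codes) != 5:
        raise ValueError("each GSS screener has five items")
    if any(code not in (0, 1, 2, 3) for code in recency_codes):
        raise ValueError("recency codes must be 0, 1, 2, or 3")
    # Only responses of 2 or 3 fall within the past year.
    return sum(1 for code in recency_codes if code >= 2)


# Hypothetical responses to one screener (not data from the study).
example_responses = [3, 2, 1, 0, 2]
example_score = past_year_screener_score(example_responses)
```

Under the same assumption, a total score across the four screeners would simply be the sum of the four screener scores.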
Independent Measures
The primary independent measures were continuous and grouped versions of each of the four 5-item GSS screeners and their sum (20 items total). The Internalizing Disorder Screener (IDScr; alpha = 0.74) contains items related to depression, anxiety, trauma, and suicide. The Externalizing Disorder Screener (EDScr; alpha = 0.76) contains items on inattentiveness, hyperactivity, conduct, gambling disorders, and other impulse control problems. The Substance Disorder Screener (SDScr; alpha = 0.76) contains items about frequent use, abuse, dependence, and induced disorders and symptoms. Finally, the Crime and Violence Screener (CVScr; alpha = 0.72) contains items related to domestic violence, as well as property, drug, and violent crimes. The Total Disorder Screener (TDScr; alpha = 0.87) is the sum of the 20 items from these 4 screeners. Copies of the instrument, manual, psychometrics and publications related to the GAIN SS are available at www.gaincc.org/gainss.
Dependent Measure
The primary dependent variable for the current study was 1-year arrest/incarceration based on self report. This operationalization is consistent with part 1 of the Council of Juvenile Correctional Administrators’ (CJCA) two-part standard for measuring recidivism, 50 which includes 1) the commission of an offense (in our case, operationalized as arrest or incarceration), and 2) by an individual already known to have committed at least one other offense. Our dichotomous (yes/no) outcome variable for 1-year arrest/incarceration was calculated using items from four quarterly follow-up assessments during the year post-intake (3-, 6-, 9-, and 12-month follow-up). The following GAIN follow-up items were used in the operationalization of arrest/incarceration: 1) When was the last time you were arrested and charged with a crime?; 2) During the past 90 days, how many times have you been arrested and charged with breaking a law?; 3) During the past 90 days, how many days have you been in juvenile detention, jail, or prison?; 4) Are you currently in jail, prison, or detention?; 5) Are you currently involved with the criminal justice system in any of the following ways: In jail or prison?; and 6) Are you currently involved with the criminal justice system in any of the following ways: In detention? Any positive response (Yes or 1+ days) was sufficient to indicate 1-year arrest/incarceration.
Procedures
As noted previously, data collection for this study followed guidelines set forth by each site's voluntary consent procedures and were either part of specific research studies or general clinical practice. Data pooled for secondary analysis purposes are under data sharing agreements from Chestnut Health Systems’ Institutional Review Board. All treatment sites received standardized training and quality assurance checks of their data collection to facilitate comparisons with other sites utilizing the GAIN instrument.
Analytic Procedures
The initial sample of 6,815 youth was split into two random split-half samples stratified by site in order to run cross-validation analyses. The first was used for initial model development and the second for validation. There were no significant differences in these samples on key variables, such as gender, race, level-of-care, treatment type, co-occurring disorders, substance use severity, victimization, and intensity of juvenile justice involvement. Because ours was an exploratory study of the GSS in predicting recidivism, the main analyses consisted of a) running bivariate logistic regressions separately using each of the four GSS screeners and their sum at baseline to predict the dichotomous 12-month arrest/incarceration, then b) running multivariate analyses in a stepwise fashion. The stepwise regression started with all five baseline predictor variables in the model, then removed one non-significant predictor at a time and tested the change in model fit. At each step, we also ran AUC analyses to determine the accuracy of the prediction, where 0.5 is no better than chance and 1.0 is perfect prediction. Unlike cross-sectional comparisons of two measures (where AUC should be high), AUCs are often much lower when predicting 12 months into the future, so it is important to compare how well a new measure works relative to what other existing measures can do. To interpret our results we compared them to Schwalbe's 31 meta-analysis of juvenile justice recidivism studies, in which the bottom 25% of the studies had AUCs in the range of 0.532 to 0.594 (poor), the middle 50% of the studies had AUCs between 0.595 and 0.718 (good), and the top 25% of the studies had AUCs between 0.719 and 0.780 (excellent).
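As an illustration of two steps in this procedure, the sketch below shows one way a random split half stratified by site could be drawn, and how an AUC could be labeled against the interpretive bands taken from Schwalbe's meta-analysis. It is a sketch under stated assumptions rather than the software used in the study: the function names, the seed, and the handling of sites with an odd number of cases are all hypothetical choices.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def split_half_by_site(
    site_of_case: Sequence[Hashable], seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (development, validation) case indices, split within each site."""
    rng = random.Random(seed)
    by_site: Dict[Hashable, List[int]] = defaultdict(list)
    for index, site in enumerate(site_of_case):
        by_site[site].append(index)

    development: List[int] = []
    validation: List[int] = []
    for site in sorted(by_site, key=str):
        members = by_site[site]
        rng.shuffle(members)
        half = len(members) // 2
        # Assumption: for odd-sized sites the extra case goes to a random half.
        if len(members) % 2 == 1 and rng.random() < 0.5:
            half += 1
        development.extend(members[:half])
        validation.extend(members[half:])
    return sorted(development), sorted(validation)


def schwalbe_band(auc: float) -> str:
    """Label an AUC using the ranges quoted in the text (0.532 to 0.780)."""
    if auc < 0.532:
        return "below reported range"
    if auc < 0.595:
        return "poor"
    if auc < 0.719:
        return "good"
    if auc <= 0.780:
        return "excellent"
    return "above reported range"


# Hypothetical site labels for ten cases (not study data).
example_sites = ["A"] * 4 + ["B"] * 6
dev_idx, val_idx = split_half_by_site(example_sites, seed=1)
```

Splitting within each site, rather than across the pooled sample, keeps every site represented in roughly equal numbers in both halves.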
Results
Sample Characteristics
Table 1 displays the characteristics of the development and validation samples. Note the samples do not differ on any of the demographic and/or pretreatment characteristics at a 0.05 alpha level. The development sample was mainly male (76%) with a mean age of 15.1 years. The largest racial/ethnic group was White (41%), followed by Hispanic (28%), African American (16%), Mixed Race (11%), and Other race (5%). The majority of adolescents (54%) had used substances for three or more years, and most met full criteria for past-year SUD (87%). Only 28% perceived they had a substance problem, but 75% reported recognizing a need for some kind of treatment. The majority (69%) had a co-occurring disorder, with 55% meeting criteria for Conduct Disorder, 72% reporting past-year physical violence, and 73% reporting past-year illegal activity. Additionally, nearly half of the adolescents (48%) reported a high degree of victimization in their lifetime.
Table 1. Sample characteristics.
Bivariate and multivariate analyses
Results of the bivariate analysis (see Table 2) indicated that, with the exception of the EDScr, moderate and high scores on each of the screeners, relative to low scores, were predictive of arrest/incarceration in the year after intake in the development sample (P < 0.05). Values for the AUC suggest that any of the sub-screeners would individually predict arrest/incarceration more accurately than by chance (ie, each of the four AUC statistics was significant at P < 0.05). Still, because the AUC for each of the screeners individually indicated only slightly better prediction than by chance, we wanted to examine whether the screeners together would perform better. Table 3 shows the results of the multivariate analysis, in which all four predictors were initially included and a stepwise procedure was applied in which one predictor that did not significantly contribute to the solution was removed from the model at each step. The four screeners and their sum could not be in the model at the same time due to multicollinearity. Since the AUC of the sum was lower than that of several shorter screeners, we chose to drop the sum based on the parsimony principle. Results indicated that EDScr and IDScr did not significantly contribute to a multivariate model in which the GSS screeners predict arrest/incarceration in the year following intake (Model 1 and Model 2); however, CVScr and SDScr together were significant predictors of arrest/incarceration (Model 3; P < 0.001). Values of the AUC in the multivariate analysis phase indicate that each of the models predicted arrest/incarceration more accurately than by chance (ie, each of the three AUC statistics was significant at P < 0.001).
Table 2. Bivariate results using development sample (n = 3,420).
Table 3. Multivariate results using development sample (n = 3,420).
Interaction Analyses
Because multivariate analyses indicated that the CVScr and the SDScr were the two best predictors of arrest/incarceration in the year after intake, we examined their cross-classification more closely. To this end, each of the three levels of the CVScr (low, moderate, and high) was crossed with each of the three levels of the SDScr (low, moderate, and high) to form a 3 x 3 = 9 level variable. Figure 2 presents the nine-level variable resulting from the CVScr x SDScr cross on the x-axis. The y-axis on this figure is the percent of adolescents who were arrested/incarcerated in the year after intake. The positive slope of the line suggests that adolescents with low scores on the CVScr and SDScr at baseline (far left) were less likely to be arrested/incarcerated and that those with high scores on the CVScr and SDScr at baseline (far right) were at greatest risk of being arrested/incarcerated. The AUC was 0.614 (P < 0.001), indicating that prediction using this nine-level variable was significantly more accurate than chance, and thus this nine-level variable could be used to triage adolescents by relative risk for arrest/incarceration in the year after intake based on scores on the GSS at baseline.

Figure 2. Arrest/Incarceration for 9-Level (CVScr x SDScr) grouping using development sample (n = 3,420).
Due to the large number of levels (ie, nine) in this solution, we thought its real-world application may not be optimal. Thus, based on visual inspection of Figure 2, we found that rates of arrest/incarceration appeared to cluster in the 40% range, the 50% range, and the 60% range (these clusters are separated by large parentheses in Fig. 2). The cluster to the left includes any adolescents who score ‘low’ on the CVScr (regardless of their scores on the SDScr) or those who score ‘moderate’ on the CVScr and ‘low’ on the SDScr. The middle cluster includes adolescents who score ‘moderate’ on the CVScr and either ‘moderate’ or ‘high’ on the SDScr, or ‘high’ on the CVScr and ‘low’ on the SDScr. Finally, the cluster on the right includes adolescents who score ‘high’ on the CVScr and either ‘moderate’ or ‘high’ on the SDScr. These clusters correspond to three distinct groups (low, moderate, and high risk for arrest/incarceration), which we believe offer a more practical solution to help GSS users interpret baseline scores on the CVScr and SDScr with respect to predicting future arrest/incarceration. Analyses using this three-group solution (see Fig. 3) indicated it was a significant predictor of arrest/incarceration during the year after intake (AUC = 0.605, P < 0.001) and that its AUC (0.605) was not significantly different from that of the nine-group solution (0.614), based on their overlapping 95% confidence intervals as well as an effect size (ES) analysis of the difference between the two AUCs, which yielded a non-significant ES of 0.02.

Figure 3. Arrest/Incarceration for 3-group simplified solution using development sample (n = 3,420).
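The three-group rule described above is simple enough to express as a small lookup. The sketch below is illustrative only: the function name is hypothetical, and it takes the low/moderate/high level of each screener as input because the score cut points that define those levels are not stated in this section.

```python
LEVELS = ("low", "moderate", "high")


def risk_group(cvscr_level: str, sdscr_level: str) -> str:
    """Map the CVScr and SDScr levels to the simplified three-group risk class."""
    if cvscr_level not in LEVELS or sdscr_level not in LEVELS:
        raise ValueError("levels must be 'low', 'moderate', or 'high'")
    if cvscr_level == "low":
        # Low on crime/violence: low risk regardless of the substance screener.
        return "low"
    if cvscr_level == "moderate":
        return "low" if sdscr_level == "low" else "moderate"
    # Remaining case: high on the crime/violence screener.
    return "moderate" if sdscr_level == "low" else "high"


# Enumerate all nine cells of the 3 x 3 cross and their assigned group.
nine_cell_table = {
    (cvs, sds): risk_group(cvs, sds) for cvs in LEVELS for sds in LEVELS
}
```

Four of the nine cells map to low risk, three to moderate risk, and two to high risk, matching the left, middle, and right clusters described in the text.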
Cross-validation analysis
Using the validation sample, we examined rates of arrest/incarceration and the AUC for both the nine- and three-group solutions. Figure 4 presents the results of the nine-group solution with the validation sample. Similar to results with the development sample, we found rates of arrest/incarceration to cluster in the 40% range, the 50% range, and the 60% range (these clusters are separated by large parentheses in Fig. 4). The AUC for the nine-group solution was 0.603 and was significant at P < 0.001.

Figure 4. Arrest/Incarceration for 9-Level (CVScr x SDScr) grouping using validation sample (n = 3,395).
Figure 5 presents the results of the simplified three-group solution using the validation sample. Again, similar to the findings from the development sample, the rate of arrest/incarceration increased from 42% for adolescents in the low risk group to 52% (moderate risk group) and then to 65% (high risk group). The AUC for the three-group solution was 0.601 and was significant at P < 0.001.

Figure 5. Arrest/Incarceration for 3-group simplified solution using validation sample (n = 3,395).
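For readers less familiar with the AUC, the sketch below computes it for an ordinal predictor such as the three-group risk class, using the standard pairwise (Mann-Whitney) formulation: the proportion of arrested/non-arrested pairs in which the arrested youth has the higher risk code, counting ties as one half. The function name and the toy data are hypothetical and are not drawn from the study; the statistical software actually used by the authors is not stated in the text.

```python
from typing import Sequence


def auc_from_scores(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """AUC as the probability that a positive case outranks a negative case."""
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                # Ties are common with a three-level predictor; count as half.
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data only: risk group coded 0 = low, 1 = moderate, 2 = high;
# outcome coded 1 = arrested/incarcerated within the year, 0 = not.
toy_groups = [0, 0, 1, 1, 2, 2]
toy_outcomes = [0, 0, 0, 1, 1, 1]
toy_auc = auc_from_scores(toy_groups, toy_outcomes)
```

A predictor that assigns every case the same group yields 0.5 under this formulation, and one that perfectly separates the two outcomes yields 1.0, which matches the interpretation of the AUC given in the Analytic Procedures.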
Discussion
Given the high rate of overlap in youth who have SUDs and involvement in the juvenile justice system, expert panels have consistently recommended evidence-based screening and assessment for a range of behavioral health issues. The high costs and need to target the limited assessment resources suggest the need for a multistage process, starting with an efficient screener to do the initial triage. This study built on earlier research showing the value of the GSS as a triage tool for clinical assessment and demonstrated its value as a reliable, valid, efficient, and inexpensive tool for triaging youth based on their risk for future arrest or incarceration and in accordance with the Risk Principle.
Schwalbe 31 noted that “The hallmark of the actuarial approach is an empirical development strategy that separates instrument development from validation,” as we have done here. Instrument development here was conducted on an ‘estimation sample,’ which was used to predict recidivism and create the three risk groups. Instrument validation took place on a separate sample in which the overall predictive validity of the index was tested. Based on analyses with the development sample, each of the four GSS screeners (ie, IDScr, EDScr, SDScr, CVScr) was a significant predictor of arrest/incarceration during the year after intake (AUCs ranging from 0.551 to 0.601). A series of multivariate analyses using the development sample suggested that the combination of all four screeners was the best predictor of future arrest/incarceration (AUC = 0.619, P < 0.001), but that a similar level of prediction could be achieved with just the SDScr and CVScr (AUC = 0.615, P < 0.001). Thus, among the four GSS screeners, CVScr and SDScr were found to be the two best predictors of future arrest/incarceration. Additionally, results indicated that both a nine-group and a simplified three-group classification approach were significant (P < 0.001) predictors of future arrest/incarceration, with AUCs of 0.614 and 0.605, respectively. After completion of analyses using the development sample, analyses were conducted on the validation sample. These subsequent analyses focused on examination of the predictive validity of both the nine- and three-group solutions. Consistent with results from the development sample, analyses using the validation sample indicated that both the nine-group and simplified three-group classification approaches were significant (P < 0.001) predictors of future arrest/incarceration, with AUCs of 0.603 and 0.601, respectively.
Overall, the AUCs reported as part of this study fall within the good or middle 50% range (0.595 to 0.718) of AUCs reported within Schwalbe's meta-analysis of juvenile risk assessments. 31 While it must be acknowledged that other assessments, such as the YLS/CMI32,33 or PCL-YV, 35 have better predictive validity, the full and screening versions of these other instruments take three to 60 times longer to complete than the five-minute GSS or the 2–3 minutes it would take for just the CVScr and SDScr. Thus, the GSS not only appears to represent a highly efficient screener for internalizing disorders, externalizing disorders, substance disorders, and crime/violence, 30 but also a highly efficient screener for future arrest or incarceration. It could also be used in a multi-step process to decide “whether” to invest in a more extensive risk assessment.
Strengths and limitations
The study's strengths include a large, diverse, multi-site sample, a cross-validation design, and multiple follow-up assessment points. Yet, it also has some limitations that must be acknowledged. First, all data, including arrest/incarceration, were based only on self-report. Ideally these should be validated with records. Second, the follow-up time period for arrest/incarceration was limited to one year. Third, because the individuals in the current study were adolescents with SUD presenting to treatment, the study should ideally be replicated in other samples and settings.
General summary and directions for future research
Overall, the current study provides evidence that supports use of the CVScr and SDScr of the GSS as very brief, yet valid predictors of criminogenic risk among youth in SUD treatment. Thus, in addition to serving as a cost-effective front door screener to identify people with co-occurring disorders across multiple systems, 30 the GSS also can serve as a cost-effective risk screener. Importantly, although a key goal of risk assessment is to help classify individuals into different risk categories, it is equally (if not more) important, once a reliable and valid risk prediction system has been implemented, to match the level of risk with the most appropriate level of treatment service(s). 51 Thus, in addition to further improving the predictive validity of risk screeners and assessments, future research on the integration of risk assessment and treatment planning is needed. 52 It also would be beneficial for future research to focus on future illegal activity, which is more than twice as common as future re-arrest or incarceration. While the two mental health screeners did not necessarily add to our ability to predict recidivism, the rates of mental health problems and need for associated services were high. Since most systems care about other outcomes as well (eg, suicide, change in victimization or emotional problems, family problems, HIV risk behaviors), it would still make sense to use the full screener.
Author Contributions
Conceived and designed the experiments: MLD, BRG. Analyzed the data: BRG, VKB, MLD. Wrote the first draft of the manuscript: BRG, VKB. Contributed to the writing of the manuscript: BRG, VKB, MLD. Agree with manuscript results and conclusions: BRG, VKB, MLD. Jointly developed the structure and arguments for the paper: BRG, VKB, MLD. Made critical revisions and approved final version: BRG, VKB, MLD. All authors reviewed and approved of the final manuscript.
Disclosures and Ethics
As a requirement of publication the authors have provided signed confirmation of their compliance with ethical and legal obligations including but not limited to compliance with ICMJE authorship and competing interests guidelines, that the article is neither under consideration for publication nor published elsewhere, of their compliance with legal and ethical guidelines concerning human and animal research participants (if applicable), and that permission has been obtained for reproduction of any copyrighted material. This article was subject to blind, independent, expert peer review. The reviewers reported no competing interests.
