Abstract
Substance abuse is a serious mental health concern and reoffense risk factor for justice-involved youth. The Drug Abuse Screening Test for Adolescents (DAST-A) is used to assess drug abuse in different contexts, yet its psychometric properties have not yet been thoroughly explored in youth justice samples. We examined the measurement invariance and psychometrics of the DAST-A in a diverse sample of 741 justice-involved youth (
Substance use disorders (SUDs) are characterized by co-occurring cognitive, behavioral, and physiological symptoms that result in recurring use of, or craving for, a substance despite negative substance-related sequelae (American Psychiatric Association, 2022). According to a 2010 national mental health survey in the United States, the lifetime prevalence of SUDs in youth was 11.4% (6.4% for alcohol abuse/dependence and 8.9% for drug abuse/dependence; Merikangas et al., 2010). SUDs are highly overrepresented in the youth criminal justice arena, with prevalence rates ranging from 22% to 51% in justice-involved youth; rates for cannabis abuse/dependence range from 8% to 45% (Teplin et al., 2002; Wasserman et al., 2005).
These high SUD rates in justice-involved youth are of clinical concern generally and of significant concern specifically within the criminal justice space, as substance abuse is a well-established recidivism risk factor in adults and youth (Dowden & Brown, 2002; Stoolmiller & Blechman, 2005). Recidivism risk assessment tools typically provide an overall estimate of a person’s risk for reoffense and identify specific domains of “criminogenic need,” including substance abuse, to be targeted in rehabilitative interventions. Clinical best practice (Hoge & Andrews, 2011; Vincent et al., 2012) recommends multimethod, multiinformant assessment, including the use of formal tools to assess functioning in criminogenic need domains. Within the substance use domain specifically, the Drug Abuse Screening Test for Adolescents (DAST-A; Martino et al., 2000) has been in use for decades. Given the high prevalence rates of SUDs in youth justice populations and the need for effective substance abuse screening in the context of criminogenic needs assessment, we investigated the DAST-A’s psychometrics and cut-off scores within and across subgroups of justice-involved youth that have been underresearched (young women) and are overrepresented (Black youth) in the justice system.
The DAST and DAST-A
The Drug Abuse Screening Test (DAST; Skinner, 1982a) is a brief screener for drug abuse over the past year. Developed and validated in a sample of adult treatment-seeking SUD patients (
Beyond its initial validation, the DAST’s psychometric properties have been found to be strong in other contexts, including employment (El-Bassel et al., 1997) and criminal justice settings (Saltstone et al., 1994); in different languages, including Turkish (Evren et al., 2014) and Mandarin (Y.-T. Chen et al., 2020); and in various populations, such as different mental health populations (e.g., Cassidy et al., 2008; McCann et al., 2000). The tool has also demonstrated good internal consistency (α = .74–.998; e.g., Cassidy et al., 2008; Y.-T. Chen et al., 2020) and test–retest reliability (α = .75–.85; e.g., Y.-T. Chen et al., 2020; El-Bassel et al., 1997).
With regard to the DAST’s factor structure, some studies have found the scale to be unidimensional (e.g., Y.-T. Chen et al., 2020; Skinner, 1982a), whereas others have found it to be multidimensional, although usually with one dominant factor (e.g., El-Bassel et al., 1997; Saltstone et al., 1994). In terms of convergent and concurrent validity, findings in different populations include correlations between the DAST and other measures of drug use and addiction with medium to large effect sizes (e.g., Cocco & Carey, 1998; Evren et al., 2014), and small to medium associations with mental health symptoms (e.g., anxiety, depression, thought disorder), alcohol abuse, and work performance (e.g., Cocco & Carey, 1998; El-Bassel et al., 1997; correlation coefficient effect size descriptors derived from Cohen, 1988). The tool has demonstrated robust predictive power, with areas under the curve (AUCs) ranging from .77 to .94 in the 20- to 28-item versions, predicting issues with substance use and current diagnosis/identification of drug abuse/dependence in the studied populations (e.g., Y.-T. Chen et al., 2020; Wolford et al., 1999).
Adapted for use with adolescents (Martino et al., 2000), the DAST-A consists of 27 items closely mirroring the DAST’s 28 items. It was initially validated on psychiatric inpatients (
Use of the DAST(-A) in Criminal Justice Settings
Some psychometric properties of the DAST(-A) have been investigated in the criminal justice arena. Saltstone et al. (1994) conducted a preliminary validation of the DAST in women (
The Need to Validate the DAST-A for Use With Diverse Justice-Involved Youth
When assessment tools are used with members of a population upon which the measure was not normed or validated, the results may be unreliable or invalid, or may fail to capture what the tool aims to measure; these errors can have major negative consequences (Mushquash & Bova, 2007). To address this issue, there is a growing body of research examining the validity of tools used in the criminal justice system in populations that have been obscured or omitted in earlier studies. In light of the dearth of research on justice-involved young women and the overrepresentation of racialized youth in the criminal justice system (e.g., Malakieh, 2020; Owusu-Bempah & Wortley, 2014), it is crucial to examine the DAST-A’s psychometric properties separately for justice-involved young women and racially diverse youth. For (young) women in particular, there is evidence that drug abuse is a particularly salient criminogenic risk factor/need (e.g., Andrews et al., 2012), lending further support to this psychometric study. In addition to psychometric indicators (e.g., reliability), it is key to investigate whether the DAST-A demonstrates measurement invariance, that is, whether the assessed construct has the same structure and/or meaning across different groups (Putnick & Bornstein, 2016). The DAST-A includes questions that may operate differently for different demographic groups, such as questions on accessing supports for drug use, or experiences with arrest in the context of drug use. In addition, in light of prior findings of varying optimal ranges of DAST cut-off scores in different populations (e.g., Cocco & Carey, 1998), there is a need to review the appropriateness of the currently used cut-off score in (subpopulations of) justice-involved youth. As such, the goal of the present study was to examine the measurement invariance and psychometric properties of the DAST-A in a general sample and subgroups (young men and women; White and Black youth) of justice-involved youth.
Method
Participants
The sample consisted of
Table 1 presents demographic and criminal justice data by gender for the total sample. All participants identified as young men or young women; no information was gathered on whether youth identified as cis- or transgender. The sample was ethnoracially diverse, with the largest subgroups consisting of Black youth and White youth. Overall, the majority of youth were charged with violent, but not sexual, offenses. Young men had a higher rate of sexual offense charges than young women, while young women had a somewhat higher rate of violent offense charges than young men. Black youth were less likely than White youth to be charged with a sexual offense, and more likely to be charged with a violent (nonsexual) offense, although the effect size was small, χ2(2,
Demographic and Criminal Justice Variables for Total Sample (
Variables and Measures of Interest
Substance Abuse (Drugs and Alcohol)
DAST-A
The DAST-A is a 27-item yes/no screener for drug abuse in adolescents, with a total score of ≥ 7 indicating drug abuse behaviors/symptoms that are of clinical concern and in need of follow-up (Martino et al., 2000). Because Martino et al. (2000) did not provide interpretation recommendations for different ranges of DAST-A total scores, the mental health agency where the data were collected interprets DAST-A total scores of 3 to 6 as reflecting drug abuse in the “borderline” range, drawing from Skinner’s tentative guidelines on interpreting DAST scores (Skinner, 1982b). Endorsed DAST-A items are summed to obtain the total score. In the current study, some youth did not respond to all DAST-A items. To create comparable DAST-A total scores, youths’ scores were averaged across completed items and multiplied by 27.
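The prorating rule described above (mean of completed items multiplied by 27) can be illustrated with a minimal sketch. This is not the authors' code; the function name `prorated_total`, the use of `None` for a skipped item, and the assumption that responses are already coded 1 (endorsed) or 0 (not endorsed) are all illustrative choices.

```python
from typing import Optional, Sequence

N_ITEMS = 27  # number of DAST-A items, as stated in the text


def prorated_total(responses: Sequence[Optional[int]],
                   n_items: int = N_ITEMS) -> Optional[float]:
    """Mean of completed items multiplied by the number of scale items.

    Assumption: each response is already scored 1 (endorsed), 0 (not
    endorsed), or None (item skipped). Returns None if nothing was completed.
    """
    completed = [r for r in responses if r is not None]
    if not completed:
        return None
    return sum(completed) / len(completed) * n_items


# Hypothetical respondent: 25 items completed, 5 endorsed, 2 skipped.
example = [1] * 5 + [0] * 20 + [None] * 2
score = prorated_total(example)  # 5 / 25 * 27
```

A respondent who completes every item receives the ordinary sum score, so prorating only changes totals for youth with skipped items.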
Youth Level of Service/Case Management Inventory (YLS/CMI) 2.0 Substance Abuse subscale
The YLS/CMI 2.0 assesses 12- to 18-year-old youths’ reoffense risk. Its core consists of a 42-item checklist assessing eight domains of criminogenic need, including substance abuse (Andrews et al., 1990; Hoge & Andrews, 2011). The number of endorsed items per domain yields domain scores, which are summed to calculate a total recidivism risk score. The YLS/CMI has strong psychometric properties, with medium to strong internal consistency and medium to strong predictive power for recidivism (Schmidt et al., 2005). The YLS/CMI Substance Abuse subscale consists of five items: occasional drug use, chronic drug use, chronic alcohol use, the interference of substance abuse in daily life, and whether substance abuse is linked to a youth’s offenses. Given the focus of this study, the alcohol use item was omitted from the Substance Abuse domain score, producing a score ranging from 0 to 4. As all study participants were charged with an offense prior to their 18th birthday, and in light of evidence supporting the use of youth risk assessment tools in emerging adults (Kleeven et al., 2022; Vincent et al., 2019), the YLS/CMI was also used in our sample’s 19-year-old youth (
Alcohol Use Disorders Identification Test (AUDIT)
The AUDIT is a 10-item alcohol abuse screening test, with each item scored from 0 to 4 and higher scores indicating alcohol use of greater concern (Babor et al., 2001). A total score of 8 or more is the recommended cut-off, indicating harmful alcohol use. The tool has shown good internal consistency reliability and criterion-related validity in different populations (e.g., Reinert & Allen, 2007).
Social, Emotional and Behavioral Functioning
Youth Self-Report (YSR)
The YSR, a 112-item self-report measure, assesses internalizing and externalizing problems as well as “syndrome-specific” behaviors (e.g., thought problems and attention problems) in youth aged 6 to 18 years (Achenbach & Rescorla, 2001). Each item is scored 0 (
DSM diagnoses
As an outcome of the court-ordered forensic assessments conducted at the mental health agency, youth may have been diagnosed with mental health concerns according to
Risk Factors for Criminal Behavior and Reoffense
YLS/CMI 2.0 criminogenic need domains
In addition to the Substance Abuse domain score, the other seven YLS/CMI 2.0 criminogenic need domain scores captured risk factors for reoffense: History of Criminal Conduct, Family Circumstances, Education/Employment, Peer Affiliations, Leisure/Recreation, Personality/Behavior, and Antisocial Attitudes. The YLS/CMI total score represented the overall risk for reoffense.
Recidivism
Data on recidivism were acquired from a national police criminal records database. Recidivism was defined as any reconviction within a fixed three-year follow-up period from the sentencing date for the charge which triggered the court-ordered assessment.
Data Analytic Plan
Analyses were performed for the overall sample and separately by youth gender and race (i.e., White and Black youth, as the subsample sizes for the other ethnoracial groups were too small). Due to the small subsample sizes of White and Black young women, conducting intersectional analyses was not possible. Some variables had missing data, resulting in varying sample sizes; therefore, sample sizes are specified in each analysis. Analyses were performed in SPSS v27.0 and v29.0, except for the confirmatory factor analyses (CFAs), which were conducted in Mplus v8.5 (WLSMV estimator; Muthén & Muthén, 2017). The study data are not publicly available due to their clinical nature; analysis code is available upon request to the corresponding author.
McDonald’s omegas (ω), coefficient alphas (α), mean inter-item correlations (MIC), and mean corrected item-total correlations (MCITC) were calculated to assess the internal consistency reliability of the DAST-A. McDonald’s omega (McDonald, 1970) is akin to coefficient alpha; however, it has less stringent statistical prerequisites (Kalkbrenner, 2023), is less sensitive to the number of scale items, and takes into account the proportion of shared variance across scale scores tied to common factors (Zinbarg et al., 2005). McDonald’s omega is therefore considered a more robust measure of internal consistency. Because coefficient alpha is more commonly used in the research literature, it is also reported. Coefficient alphas over .70 (Nunnally & Bernstein, 1994), MICs between .20 and .40 (Piedmont, 2014), and MCITCs over .30 (Nunnally & Bernstein, 1994) were deemed acceptable. To our knowledge, there are no guidelines regarding acceptable values for McDonald’s omega; as such, the rule of thumb for coefficient alpha was also used to interpret the McDonald’s omega findings. This approach was deemed acceptable because McDonald’s omega is a broader measure of internal consistency that should have the same value as coefficient alpha when the prerequisites for alpha are met (Kalkbrenner, 2023).
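To make three of the indices above concrete, the following sketch computes coefficient alpha, the MIC, and the MCITC from a respondents-by-items matrix. It is an illustration only: the study's analyses were run in SPSS, the function names and the toy data are hypothetical, and McDonald's omega is omitted because it requires fitting a factor model. The sketch assumes complete data and that every item has nonzero variance.

```python
from itertools import combinations
from statistics import mean, variance
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def coefficient_alpha(rows: List[List[int]]) -> float:
    """k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(rows[0])
    items = list(zip(*rows))  # one tuple per item (column)
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var / total_var)


def mean_inter_item_correlation(rows: List[List[int]]) -> float:
    """Average Pearson correlation over all distinct item pairs."""
    items = list(zip(*rows))
    return mean(pearson(a, b) for a, b in combinations(items, 2))


def mean_corrected_item_total(rows: List[List[int]]) -> float:
    """Average correlation of each item with the total of the other items."""
    items = list(zip(*rows))
    totals = [sum(r) for r in rows]
    corrected = []
    for col in items:
        rest = [t - v for t, v in zip(totals, col)]  # total excluding this item
        corrected.append(pearson(col, rest))
    return mean(corrected)


# Hypothetical yes/no responses: 6 respondents (rows) x 4 items (columns).
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(toy)
mic = mean_inter_item_correlation(toy)
mcitc = mean_corrected_item_total(toy)
```

With binary items, the Pearson correlations used here are phi coefficients; a tetrachoric approach would be an alternative design choice and is not shown.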
Due to the skewed and nonnormal distribution of scores, the DAST-A’s convergent and concurrent validity were investigated via partial Spearman’s rho correlations with the revised YLS/CMI Substance Abuse subscale score (convergent validity) and with measures of emotional, behavioral, and criminal behavior constructs hypothesized to be related to drug abuse (concurrent validity), controlling for the effects of age. Using logistic regression (variables entered in a single block) and the area under the curve (AUC) of receiver operating characteristic (ROC) curves, we examined the predictive validity of the DAST-A in relation to (a) a diagnosis of an SUD, and (b) recidivism. In addition, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated at Martino et al.’s (2000) recommended DAST-A cut-off score of ≥ 7. AUCs were interpreted according to Hosmer and Lemeshow’s (2004) guidelines (AUC = .50, no discrimination power; AUC ≥ .70, acceptable discrimination; AUC ≥ .80, excellent discrimination) and with regard to their effect sizes (AUC of .56 = small effect; .64 = medium effect; .71 = large effect; Rice & Harris, 2005).
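The screening indices named above can be sketched as follows. The code is illustrative rather than the authors' SPSS procedure: the function names and toy data are hypothetical, the AUC is computed with the standard pairwise (Mann-Whitney) formulation, and the sketch assumes that both the positive and negative reference groups, and both screen outcomes, are non-empty.

```python
from typing import Dict, Sequence


def screening_metrics(scores: Sequence[float],
                      has_condition: Sequence[bool],
                      cutoff: float = 7.0) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV for a 'score >= cutoff' rule."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, has_condition):
        screen_positive = s >= cutoff
        if screen_positive and y:
            tp += 1
        elif screen_positive and not y:
            fp += 1
        elif not screen_positive and y:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),  # true positives among those with the condition
        "specificity": tn / (tn + fp),  # true negatives among those without it
        "ppv": tp / (tp + fp),          # condition present among screen positives
        "npv": tn / (tn + fn),          # condition absent among screen negatives
    }


def auc(scores: Sequence[float], has_condition: Sequence[bool]) -> float:
    """Probability that a random case outscores a random non-case (ties = 0.5)."""
    pos = [s for s, y in zip(scores, has_condition) if y]
    neg = [s for s, y in zip(scores, has_condition) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical total scores and reference-standard diagnoses for 8 youth.
toy_scores = [2, 9, 7, 4, 12, 6, 1, 8]
toy_dx = [False, True, True, False, True, True, False, False]
metrics = screening_metrics(toy_scores, toy_dx, cutoff=7)
toy_auc = auc(toy_scores, toy_dx)
```

Re-running `screening_metrics` with a different `cutoff` value shows how lowering the threshold trades specificity for sensitivity.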
Construct validity was assessed by CFA to verify the previously established unidimensional structure of the DAST-A (Martino et al., 2000). Model fit indices were interpreted according to a combination of recommended cut-offs: > .95 for Comparative Fit Index (CFI) and the Tucker Lewis Index (TLI), < .06 for Root Mean Square Error of Approximation (RMSEA), and < .09 for Standardized Root Mean Square Residual (SRMR; Hu & Bentler, 1999). Construct validity was assessed in the overall sample and per subgroup.
Prior to starting a measurement invariance analysis, it is critical to perform CFAs for the combined subgroups (i.e., all young men and women combined in a group, and all Black and White youth combined in a group) and per subgroup, to establish baseline models and ensure there are no estimation concerns; this preliminary step was partially completed in our assessment of construct validity. Building on the initial CFAs, separate multigroup CFAs were performed for gender and race to assess whether the DAST-A was measurement invariant. Assessing a tool’s measurement invariance involves a series of steps, with each step imposing more stringent constraints (Putnick & Bornstein, 2016). The first step, configural invariance, involves comparing the structural equivalence of the tool across the groups of interest. Metric invariance is assessed next by comparing the equivalence of factor loadings (λ) across groups. Scalar invariance involves additionally assessing the equivalence of the thresholds/intercepts of the observed variables across groups (Putnick & Bornstein, 2016). At each step, the chi-square values and model fit metrics are compared with the previous step’s values. The analysis is concluded when the fit metrics differ significantly from those of the previous step or cross a cut-off, reflecting a poor fit of the final model to the data; the step at which the analysis is concluded defines the level of measurement invariance achieved. Because chi-square tests are sensitive to sample size, relying solely on differences in chi-square values (Δχ2) between models to assess invariance may lead to the rejection of adequate models (F. F. Chen, 2007).
Therefore, this study also relied on changes in fit metrics that are less sensitive to sample size, using the following CFI, RMSEA, and SRMR cut-offs: ΔCFI (> −.005 for uneven sample sizes and > −.010 for even sample sizes), ΔRMSEA (< .010 for uneven sample sizes and < .015 for even sample sizes), and ΔSRMR (< .025 for metric and < .005 for scalar for uneven sample sizes, and < .030 for metric and < .010 for scalar for even sample sizes; F. F. Chen, 2007).
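The change-in-fit cut-offs listed above can be expressed as a small decision rule. This is a sketch under stated assumptions: the names `Fit` and `invariance_step_holds` are hypothetical, the fit values in the example are invented for illustration, and requiring all three criteria to be met at once is an assumption of the sketch rather than a rule stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fit:
    """Fit indices of one model in the invariance sequence."""
    cfi: float
    rmsea: float
    srmr: float


def invariance_step_holds(previous: Fit, current: Fit,
                          step: str, even_groups: bool) -> bool:
    """Compare a more constrained model with the previous one.

    `step` is "metric" (vs. configural) or "scalar" (vs. metric).
    Thresholds follow the cut-offs quoted in the text (F. F. Chen, 2007).
    Assumption: the step is accepted only if all three criteria are met.
    """
    if step not in ("metric", "scalar"):
        raise ValueError("step must be 'metric' or 'scalar'")

    d_cfi = current.cfi - previous.cfi        # a drop in CFI is negative
    d_rmsea = current.rmsea - previous.rmsea  # a rise in RMSEA is positive
    d_srmr = current.srmr - previous.srmr     # a rise in SRMR is positive

    cfi_floor = -0.010 if even_groups else -0.005
    rmsea_cap = 0.015 if even_groups else 0.010
    if step == "metric":
        srmr_cap = 0.030 if even_groups else 0.025
    else:
        srmr_cap = 0.010 if even_groups else 0.005

    return d_cfi > cfi_floor and d_rmsea < rmsea_cap and d_srmr < srmr_cap


# Hypothetical fit values, not taken from the study's Table 5.
configural = Fit(cfi=0.950, rmsea=0.040, srmr=0.080)
metric = Fit(cfi=0.948, rmsea=0.042, srmr=0.090)
scalar = Fit(cfi=0.940, rmsea=0.043, srmr=0.092)

metric_ok = invariance_step_holds(configural, metric, "metric", even_groups=False)
scalar_ok = invariance_step_holds(metric, scalar, "scalar", even_groups=False)
```

In this invented example the metric step passes while the scalar step fails the stricter uneven-group ΔCFI criterion, which is the kind of result that would prompt inspection of modification indices for a partial invariance model.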
Results
Reliability and Validity of the DAST-A
Preliminary Analyses
Table 2 presents data on DAST-A scores and
DAST-A Scores and
Internal Consistency Reliability
For the overall sample, coefficient alpha and McDonald’s omega were .90 and .91 respectively, indicating excellent internal consistency reliability. The MIC and MCITC were both acceptable at
Convergent and Concurrent Validity
As seen in Table 3, large correlations between the DAST-A and the substance abuse measure across groups supported the DAST-A’s convergent validity in our sample. The DAST-A also demonstrated concurrent validity, as there generally was a pattern of medium to large positive relations across groups between the DAST-A and measures of alcohol abuse, externalizing and internalizing behavior problems, thought problems, inattention, and aggression. For young women, there were additional medium-sized correlations with anxiety and withdrawal/depression scales. Finally, there were broadly large correlations between the DAST-A and measures of recidivism risk (YLS/CMI Total Score) and rule-breaking behaviors in all groups. For the overall sample, young men, young women, and Black youth, there were also small to medium correlations with the YLS/CMI subdomain scores. For White youth, all correlations with the YLS/CMI subdomain scores had a medium effect size, with White youth having significantly greater correlation coefficients in some domains (i.e., antisocial attitudes, family circumstances, criminal conduct, education/employment) compared with Black youth.
DAST-A Partial Correlations for Total Sample and Subgroups, Controlling for Age
Predictive Validity
SUD diagnosis
The results from the Hosmer–Lemeshow (HL) test and regression model fit metrics are reflected in Table 4. Logistic regression analyses indicated that, with each unit increase in DAST-A score, the odds of being diagnosed with an SUD increased between 18% and 28% depending on the subgroup. The AUCs for the overall sample and all subgroups had (or neared) large effect sizes (AUCs for the overall sample, young men, and Black youth in the acceptable range [.70–.80]; AUCs for young women and White youth in the excellent range [.80–.90]; Hosmer & Lemeshow, 2004; Rice & Harris, 2005). The AUCs of young men and women did not differ (ΔAUC = −0.03,
Predictive Validity Analyses for Total Sample and Subgroups
Recidivism
In contrast to the overall sample and other subgroups, the logistic regression models for young women and Black youth did not fit the data well (reflected in significant HL Tests; Table 4). The regression models including predictors (DAST-A total score and age) were significant for all groups except for young women and Black youth. Based on these findings, the logistic regression results for the subgroups of young women and Black youth were not interpreted further. As seen in Table 4, the results were most striking for White youth: each unit increase in DAST-A score was associated with a 12% increase in recidivism odds. The AUC for White youth neared a large effect size (acceptable range), while the AUC for Black youth suggested no discrimination power (Hosmer & Lemeshow, 2004); the ΔAUC was significant (ΔAUC = −0.22,
Construct Validity and Measurement Invariance of the DAST-A
Construct Validity
The CFA assessing construct validity indicated that the generated model fit the data of the overall sample adequately; while the CFI and TLI fit metrics were slightly below the recommended cut-offs, the analysis identified no modification indices, and the RMSEA was below the cut-off (see Table 5). A similar process was followed for each subgroup, establishing models fitting each subgroup adequately (see Table 5). The SRMR was somewhat higher than desired in all groups; however, in light of the acceptability of the other fit metrics, the models were deemed to fit the data adequately. In all groups, the DAST-A’s structure was unidimensional.
CFA and Measurement Invariance Results for Total Sample and Subgroups
Measurement Invariance
As reported above, there were no model estimation concerns for the CFAs, with the analyses confirming the tool’s single factor structure.
Young men versus young women
As discussed above, the CFA results of the group of young men and women combined (i.e., for the overall sample) were favorable (referred to as the “baseline model” in Table 5). In the two-group configural model, parameters being tested for invariance were estimated freely (see model fit metrics in Table 5). Overall, the factors generally loaded similarly across groups (the significant standardized item loading ranges were λ
White versus Black youth
In a similar fashion, a baseline one-group CFA was performed for White and Black youth combined (“baseline model” in Table 5), which revealed acceptable results. Next, a two-group configural invariance model was generated, where all parameters were estimated freely. The fit metrics for this model were favorable (Table 5), and the factor loadings generally looked fairly similar across groups (Supplemental Table S2; the significant standardized item loading ranges were λ
Similar to the gender analysis above, we investigated the modification indices to identify which items were performing in the “borderline” range. At the metric level of analysis, Items #25 (self-help-seeking behaviors) and #2 (prescription drug abuse) had the highest modification indices; these items also displayed some of the greatest factor loading discrepancies between groups. Freeing up the factor loadings for these items one at a time resulted in a final partial metric model with slightly better fit metrics compared with the full metric model and a smaller, nonsignificant, Δχ2, suggesting unconstraining these items resulted in a better-fitting model. At the scalar step, a review of the modification indices identified the thresholds of Items #10 (complaints regarding drug use from romantic partners/parents) and #7 (abusing drugs more than once per week) as being of most interest. Unconstraining the thresholds for these items in a partial scalar model resulted in slightly better fit metrics and a lower, yet still significant, Δχ2.
Discussion
Psychometric analyses revealed the DAST-A was unidimensional, had excellent internal consistency reliability and good convergent (large effect-sized correlations with a substance abuse measure), concurrent (medium to large effect-sized correlations with measures of emotional and [criminal] behavioral constructs hypothesized to be related to drug abuse) and predictive validity (for SUD diagnoses), both in the overall youth justice sample and in all subgroups. These findings are consistent with previous DAST-A validation studies in justice-involved (e.g., O’Hagan et al., 2019) and clinical (e.g., Martino et al., 2000) youth samples. The sensitivity (70%) and specificity (79%) at the cut-off of ≥ 7 were adequate for the overall sample; PPV was lower at 56%. There are no universal standards for determining the appropriateness of validity values, with best practices recommending that indicators be assessed based on screening goals (Trevethan, 2017). In light of our clinical priority of maintaining a balance between high tool sensitivity and specificity, the roughly equal sensitivity and specificity values achieved at a DAST-A cut-off score of ≥ 7 were considered optimal, and adequate for the purposes of a screening tool used in the context of a forensic assessment. It should be noted that PPV represents the proportion of youth scoring ≥ 7 on the DAST-A who were subsequently diagnosed with an SUD by a clinician. While diagnosis was chosen as the reference standard, it is important to highlight that the DAST-A is used together with clinical/diagnostic interviews and other collateral information to diagnose an SUD in our forensic context. Further, drug abuse does not necessarily equate to having an SUD. A lower PPV, relative to sensitivity and specificity, is thus to be expected.
Further, as we were interested in investigating the DAST-A’s accuracy compared with a reference standard, our focus was on the tool’s sensitivity and specificity rather than its PPV and NPV (which are also affected by base rates; for a fuller discussion of the difference between sensitivity, specificity, PPV, and NPV, please refer to Trevethan, 2017).
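The dependence of PPV and NPV on base rates can be illustrated with a short sketch based on Bayes' rule. The sensitivity (.70) and specificity (.79) are the overall-sample values reported above; the prevalence values and function names are hypothetical and are not the base rate of SUD diagnoses in the study sample.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value implied by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value implied by Bayes' rule."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


SENS, SPEC = 0.70, 0.79  # overall-sample values reported in the text

# Hypothetical base rates, chosen only to show the trend.
for prevalence in (0.10, 0.30, 0.50):
    print(f"prevalence={prevalence:.2f}  "
          f"PPV={ppv(SENS, SPEC, prevalence):.2f}  "
          f"NPV={npv(SENS, SPEC, prevalence):.2f}")
```

With sensitivity and specificity held fixed, PPV rises and NPV falls as the condition becomes more common, which is why the two predictive values are not comparable across settings with different base rates.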
Subgroup Analyses
The DAST-A demonstrated robust predictive validity for an SUD diagnosis in Black and White youth, although the AUC was significantly lower for Black youth. Similarly, the odds of Black youth being diagnosed with an SUD based on unit increases in DAST-A scores were lower compared with White youth. The DAST-A’s sensitivity and PPV were below 50% for Black youth, and less robust than those of the other groups. Visual inspection of the ROC suggested a cut-off score of ≥ 4 generated a better (and closer to the other groups’) sensitivity value for Black youth; in other words, lowering the cut-off score decreased the “risk” of underidentifying Black youth in need of follow-up (i.e., decreased the occurrence of false negatives). The findings of our exploration of an alternate cut-off score support having a “borderline range” for the interpretation of DAST-A total scores. They further suggest that in-depth drug abuse assessments may be warranted when DAST-A total scores fall in this borderline range (i.e., lower than the originally recommended cut-off of ≥ 7); this may be particularly true for racialized youth, who may face bias during diagnostic assessment (e.g., Garb, 2021). Compared with the race analyses, the DAST-A’s psychometrics were more similar across young men and women. Nonetheless, the DAST-A better predicted an SUD diagnosis in young women than men, and young women had significantly larger correlations between the DAST-A and an anxiety/depression measure.
With respect to reoffending, DAST-A scores predicted three-year recidivism for White youth, suggesting that drug abuse was a salient criminogenic need for this group. In contrast, the DAST-A did not predict reoffending in Black youth or meaningfully in young men or women as separate subgroups. De Somma et al. (2021) identified different profiles among justice-involved youth who abuse substances. In a group with clinical drug use and low-to-moderate criminogenic needs, almost half the youth were Black, while less than a quarter were White. This finding suggests that, compared with White youth, substance misuse may be less of a pertinent criminogenic need for some Black youth. In all groups the correlations between the DAST-A and overall recidivism risk, rule-breaking behaviors, and some criminogenic needs had medium to large effect sizes; however, the relationships were most extensive for White youth, with medium to large-sized correlations between the DAST-A and all criminogenic domains. These findings are consistent with the idea that substance abuse may be a particularly salient criminogenic need for White youth who abuse drugs, and may be tied to their other criminogenic risk factors.
In terms of measurement invariance across gender and race, full scalar invariance was supported, indicating that the DAST-A had the same unidimensional structure, equivalent item loadings, and equivalent item thresholds across groups. Because scalar invariance was supported in both sets of analyses, mean DAST-A scores can be compared across groups of justice-involved young men and women, and across Black and White youth; differences in mean DAST-A scores across these groups can be assumed to reflect differences in the latent construct of drug abuse.
While the measurement invariance analysis results provided adequate support for the DAST-A’s invariance in screening for drug abuse across the investigated gender and racial groups, in a series of supplemental analyses we homed in on the items identified as “weak points” in the tool achieving invariance (without causing the tool to become noninvariant as a whole). For example, Item #22 (whether youth had been arrested for drug possession) had the highest modification index in the gender analysis, and had discrepant standardized factor loadings across groups (with the factor loading being nonsignificant for young women). Unconstraining this item in the partial scalar model resulted in a better model, supporting greater invariance of the DAST-A. As such, this item may not be as strongly related to the latent factor of drug abuse in young women as in young men. This finding is consistent with the fact that, in the United States, women represent only 25.4% of arrests for drug abuse violations (Federal Bureau of Investigation, 2019).
Similarly, the racial analysis identified some “weaker” DAST-A items (in the context of measurement invariance). Differences in the standardized factor loading coefficients for items on help-seeking behaviors and prescription drug use were identified, with factor loadings being smaller (yet still significant) for these items for Black youth. In addition, the thresholds for the items assessing romantic/family concerns around drug use and frequency of drug use were in the borderline range in terms of equivalence. Unconstraining the factor loadings and thresholds of these items in supplemental analyses resulted in more favorable invariance results, confirming their “borderline” status. These findings may be explained by differences in drug abuse patterns, given there is some evidence that White individuals tend to abuse prescription drugs and hard/illicit drugs more than Black individuals (Broman et al., 2015; Feldstein Ewing et al., 2011). There is also evidence that White justice-involved youth are more likely than Black youth to present with drug and alcohol problems, more extensive SUD-related psychopathology, and combinations of SUDs (with substances other than cannabis; Feldstein Ewing et al., 2011; McClelland et al., 2004). In the current study, White youth had higher rates of
The difference in performance of the help-seeking item may also reflect barriers Black youth face in accessing care. In an investigation of mental health service utilization of youth with emotional disturbances, Garland et al. (2005) reported that, compared with White youth, Black youth were half as likely to access any mental health service. Compared with White youth, racialized youth face more barriers in accessing SUD treatment in nonrestrictive therapeutic settings and are more likely to be found seeking supports in the youth justice space, where the options for effective SUD care are more limited (Aarons et al., 2004). In sum, the weaker performance of the abovementioned items across racial groups suggests these items may not be as salient to assessing drug abuse in Black youth and/or may be capturing the impact of other related constructs, such as racial inequities in accessing care. Despite the existence of at least one culture/ethnicity-specific substance abuse screener (the Indigenous Risk Impact Screen; Schlesinger et al., 2007), we have not interpreted our findings as suggesting the need for a separate drug abuse screener for Black justice-involved youth. Instead, we strongly encourage clinicians using the DAST-A (and other tools) to use clinical judgment when following suggested cut-offs, to review the populations in which the tool has been validated for use, and to ensure they are educated on the unique issues (e.g., structural racism) faced by different demographic groups and on how these issues may impact a youth’s clinical presentation and scale scoring.
Limitations, Future Directions and Conclusion
Drawing our sample from a clinical database is a strength in terms of ecological validity, but it also posed some limitations. First, analyses were limited to the youth represented in the database, and sample size limitations therefore prevented analysis of racialized groups other than Black youth. This includes Indigenous youth, who, alongside Black youth, are overrepresented in various jurisdictions (e.g., Canada; Malakieh, 2020). We were also unable to take an intersectional lens in this study due to the small number of White and Black young women. It will be important for future studies to address these limitations to ensure the tool is reliable and valid for use in these groups. In addition, as the youth in our database all received court-ordered pre-sentencing mental health/forensic assessments, they may not be representative of the broader justice-involved youth population; as such, caution should be exercised when extrapolating these findings to other justice-involved groups. Second, while the clinical database provided a breadth of data, our choice of analysis measures was constrained to what was available. We recommend that our finding of an altered cut-off score with better sensitivity for Black justice-involved youth be replicated in independent samples before firm suggestions are made regarding the adoption of an altered cut-off score in this demographic group.
To conclude, we investigated the measurement invariance and psychometric properties of the DAST-A in a diverse sample of justice-involved youth. While the tool performed well on various measures of internal consistency reliability and of convergent, concurrent and predictive validity, findings for Black youth were less robust on some predictive measures. Exploration of different cut-off scores suggested that a borderline range of DAST-A total scores is clinically useful, and that youth who fall in this range should be considered for further in-depth assessment for drug abuse, particularly racialized youth who may face bias during diagnostic assessment. Due to complex social, political and cultural factors, it is perhaps to be expected that demographic groups will respond differently to questionnaires. In light of these factors, it is key that clinicians working with diverse groups, who may not have been part of a tool’s construction sample, validate their measures for use in the populations they serve and ensure recommended cut-off scores are equally valid.
Supplemental Material
Supplemental material, sj-docx-1-cjb-10.1177_00938548241246437 for Examining the Measurement Invariance and Psychometrics of the Drug Abuse Screening Test for Adolescents (DAST-A) in Justice-Involved Youth by Alexandra Mogadam, Tracey A. Skilling, Michele Peterson-Badali and Liam Hannah in Criminal Justice and Behavior
Footnotes
Authors’ Note:
We have no known conflict of interest to disclose.
References
