Abstract
Background:
An alcohol text message intervention recently demonstrated effects in reducing heavy episodic drinking (HED) days at the three month follow-up in young adults with a history of hazardous drinking. An important next step in understanding intervention effects involves identifying baseline participant characteristics that predict who will benefit from intervention exposure to support clinical decision-making and guide further intervention development. To identify baseline characteristics that predict HED, this exploratory study used a prediction rule ensemble (PRE). Compared to more complex decision-tree methods (e.g., random forest), PREs have comparable performance, while generating simpler rules that can directly identify subgroups that do or do not respond to intervention.
Methods:
This secondary analysis examined data from 916 young adults who reported HED (68.5% female, mean age = 22.1, SD = 2.1), were enrolled in an alcohol text message randomized clinical trial, and completed the baseline assessment and the three month follow-up. A PRE with ten fold cross-validation was used to predict HED at the three month follow-up from intervention status and 21 baseline variables representing sociodemographic characteristics (e.g., sex, age, race, ethnicity, college enrollment), alcohol consumption (frequency of alcohol consumption, quantity consumed on a typical drinking day, frequency of HED), impulsivity subscales (i.e., negative urgency, positive urgency, lack of premeditation, lack of perseverance, sensation seeking), readiness to change, perceived peer drinking, and HED-related consequences.
Results:
The PRE identified 12 rules that predicted HED at three months (R² = 0.23) using seven predictors (six baseline features and intervention status). Only two cases (0.2%) were not classified by the 12 rules. The most important features for predicting three month HED included baseline alcohol consumption, negative urgency score, and perceived peer drinking.
Conclusions:
The rules provide interpretable decision-making tools that use baseline participant characteristics (prior to intervention) to predict who has higher alcohol consumption following exposure to alcohol text message interventions. The results highlight the importance of interventions related to negative urgency and peer alcohol use.
Introduction
An alcohol text message intervention recently showed positive outcomes in reducing the number of heavy episodic drinking (HED) days at three month follow-up (end of intervention) in young adults, after accounting for relevant baseline covariates (e.g., age, gender, impulsivity). 1 Notably, although there was an average reduction in HED at three month follow-up, some young adults showed greater response to the digital alcohol intervention than others. The next critical step in maximizing intervention impact involves identifying baseline participant characteristics that predict who benefits, in order to further refine and improve the text message intervention.
Like the main outcome study, we examined HED as the primary outcome. We focused on HED because HED may be associated with adverse consequences, such as interpersonal violence and impaired work performance.2,3 To identify baseline characteristics that predict HED, we used a prediction rule ensemble (PRE), a recursive partitioning or decision tree method. 4 An advantage of PREs compared to other decision tree methods (e.g., random forest) is that they use a parsimonious set of prediction rules (i.e., if-then statements) to model associations (e.g., interactions, additive effects) between predictors and outcome. PREs provide a quick and frugal method to predict outcome, which can be readily interpreted by clinicians, and may be useful tools, enabled by technology, for clinicians with limited time.
Building on prior analyses demonstrating the effectiveness of alcohol text message interventions versus control condition in reducing HED at three month follow-up, 1 this exploratory study examined baseline participant characteristics that predicted HED at three months using PRE. To our knowledge, this is the first use of PRE to identify prediction rules for a digital alcohol intervention. Identifying prediction rules can increase understanding of “for whom” the intervention has effects, and indicate features important to predicting the main intervention outcome. The PRE model included treatment assignment as a confirmatory rule known to predict outcome, and generated information on the baseline characteristics most important to predicting HED at three months to aid model interpretation. This data-driven approach maximized predictive accuracy while minimizing complex rule structure, to generate a set of rules that could be readily understood by clinicians to support young adult HED intervention.
Methods
Participants
Young adults (N = 1131; 68.5% female) aged 18 to 25 (mean age = 22.1; SD = 2.1) presenting at Emergency Departments (EDs) in Western Pennsylvania were screened for eligibility if they were in stable condition, provided permission to be approached, were not imminently being discharged, and were not already enrolled. 1 Eligible patients reported >1 HED in the past 30 days and an Alcohol Use Disorder Identification Test for Consumption (AUDIT-C) 5 score of ≥3 for women or ≥4 for men. Patients were ineligible if they reported current psychiatric or addictions treatment or did not own a phone with text messaging capability.
This report’s secondary analyses included only cases with written consent and complete data at baseline and three month follow-up (n = 916). At three months, the follow-up rate was 81.1% with no differential attrition between intervention and control groups, or by age or race (ps > 0.15). However, women were more likely to complete three month follow-up than men (84.0% vs 74.7%, respectively; χ² = 13.7, df = 1, P < 0.01).
Procedures
After consent, participants completed a baseline assessment in the EDs. Following baseline, participants started a two week run-in during which they needed to complete >50% of text message queries delivered twice per week in order to be randomized to 1 of 5 interventions to reduce HED (for details, see 1). The 12-week alcohol text message interventions included: self-monitoring of drinking behavior, feedback on drinking plans, feedback on drinking quantity, support for drinking limit goal setting, and the four interventions combined (clinicaltrials.gov: NCT02918565). Participants earned $20 for completing the baseline assessment and $30 for the three month follow-up assessment. The University of Pittsburgh Human Research Protection Office approved the protocol for the intervention study and the Rutgers University Human Research Protections Office approved secondary data analyses.
Measures
Sociodemographics included sex assigned at birth (0 = female, 1 = male), age (18-25), race (0 = White, 1 = Other race), Hispanic ethnicity (0 = no, 1 = yes), college enrollment status (0 = no, 1 = yes), current employment (0 = no, 1 = yes), and living arrangement (0 = alone, 1 = with friends).
Alcohol Use Disorder Identification Test-Consumption (AUDIT-C) 5 included three items assessing frequency of drinking (0 = never to 4 = 4 or more times per week), number of drinks consumed on a typical drinking day (0 = 0 drinks to 5 = 10 or more), and frequency of past month HED (0 = never to 4 = daily or almost daily).
Contemplation Ladder6,7 measured readiness to change drinking behavior rated 0 (eg, “No thought about quitting”) to 10 (“I have changed my drinking and will never go back to the way I was drinking before”).
National Institute on Drug Abuse Modified Alcohol, Smoking and Substance Involvement Screening Test (NM-ASSIST) 8 assessed past three month use of tobacco, cannabis, and any illicit opioid use with response options of “never,” “once or twice,” “monthly,” “weekly” or “daily or almost daily.” Response options were dichotomized (0 = no, 1 = yes) after preliminary analyses indicating that ordinal categories did not provide added information relative to dichotomous coding.
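The dichotomous coding described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `dichotomize` is hypothetical, and it assumes "yes" means any response other than "never."

```python
# Response options quoted in the text.
RESPONSES = ["never", "once or twice", "monthly", "weekly", "daily or almost daily"]

def dichotomize(response: str) -> int:
    """Return 0 for 'never' and 1 for any reported use (assumed coding)."""
    key = response.strip().lower()
    if key not in RESPONSES:
        raise ValueError(f"unexpected response: {response!r}")
    return 0 if key == "never" else 1

# Hypothetical responses from three participants.
coded = [dichotomize(r) for r in ["never", "Weekly", "once or twice"]]
print(coded)
```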
Perceived risk of HED-related harm was assessed by, “How much do people risk physical or other harm when they drink five or more drinks?” rated 0 = no risk to 3 = great risk. Perceived peer HED was assessed with, “How many of your friends would you estimate get drunk at least once a week?” rated 0 = none to 4 = all.
Short UPPS-P Impulsive Behavior Scale 9 included 5 subscales that assessed facets of impulsivity: negative urgency, positive urgency, lack of premeditation, lack of perseverance, and sensation seeking using items rated 1 = agree strongly to 4 = disagree strongly. In this sample, internal consistency of UPPS-P subscales was adequate (Cronbach’s α = .71 to .87).
The three month outcome was number of past month HED days (post-randomization; end of 12-week intervention) derived from a 30-day timeline follow-back (TLFB). 10 Participants used a web-based calendar to report past 30-day drinking. The number of days on which a woman reported four or more standard drinks or a man reported five or more standard drinks counted as a HED day.
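As a rough illustration of how the outcome could be derived from calendar data, the sketch below counts HED days using the sex-specific thresholds stated above. The function name and the 30-day calendar are hypothetical and are not taken from the study data.

```python
def count_hed_days(daily_drinks, sex):
    """Count HED days in a list of daily standard-drink totals.

    'female' uses a threshold of 4+ drinks and 'male' a threshold of
    5+ drinks, matching the definition in the text.
    """
    thresholds = {"female": 4, "male": 5}
    if sex not in thresholds:
        raise ValueError("sex must be 'female' or 'male'")
    return sum(1 for d in daily_drinks if d >= thresholds[sex])

# Hypothetical 30-day calendar: 24 abstinent days and 6 drinking days.
calendar = [0] * 24 + [2, 4, 4, 5, 6, 3]
print(count_hed_days(calendar, "female"))  # days with 4 or more drinks
print(count_hed_days(calendar, "male"))    # days with 5 or more drinks
```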
Statistical Analyses
Analyses were run in R version 4.2.2 using the pre package. 4 PREs, like other decision tree models, are a non-parametric method that can handle a large number of predictors. 4 The PRE model included 21 baseline variables (see Supplemental Materials, Table) plus treatment (vs. control) to predict the outcome of three month HED days (a count variable, modeled with Poisson regression). The PRE model included a confirmatory rule, based on prior work showing that the four intervention arms versus control were associated with a lower number of HED days at three months. 1 The treatment variable (1 = treatment, 0 = control) combined the four interventions (vs. control), since each intervention showed similar reductions in HED relative to control at the three month follow-up. 1 Because treatment was included as a confirmatory rule, no penalty was applied to its estimated coefficient, and an unbiased coefficient estimate was generated. 4
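To illustrate what leaving a confirmatory term unpenalized means, the sketch below evaluates a generic lasso-type penalized Poisson loss in which selected coefficients are excluded from the penalty. It is written in Python for illustration only and is not the internals of the pre package. The data, coefficients, and penalty weight are all hypothetical.

```python
import math

def penalized_poisson_loss(y, X, intercept, coefs, lam, penalized):
    """Mean Poisson negative log-likelihood (up to a constant) plus an L1 penalty.

    penalized[k] is False for confirmatory terms, whose coefficients
    are left out of the penalty.
    """
    nll = 0.0
    for yi, xi in zip(y, X):
        eta = intercept + sum(b * x for b, x in zip(coefs, xi))
        nll += math.exp(eta) - yi * eta  # drops the log(y!) constant
    nll /= len(y)
    penalty = lam * sum(abs(b) for b, p in zip(coefs, penalized) if p)
    return nll + penalty

# Hypothetical data: columns are [treatment, rule_1]; y is a count outcome.
X = [[1, 0], [0, 1], [1, 1], [0, 0]]
y = [2, 6, 4, 5]
coefs = [-0.3, 0.4]  # hypothetical log-scale coefficients

loss_confirmatory = penalized_poisson_loss(y, X, 1.5, coefs, 0.2, [False, True])
loss_all_penalized = penalized_poisson_loss(y, X, 1.5, coefs, 0.2, [True, True])
# Difference equals the penalty that the unpenalized treatment term avoids.
print(loss_all_penalized - loss_confirmatory)
```

Because the treatment coefficient contributes nothing to the penalty, the fitting procedure has no incentive to shrink it toward zero.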
PRE model tuning used caret 11 to identify model parameters which optimized predictive accuracy for HED. Model tuning involved a grid search with ten fold cross-validation (see Supplemental Materials for details) across parameters (default model parameter values are bolded) of maximum depth (2,
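As a minimal sketch of the ten fold partitioning that underlies this kind of tuning, the code below splits 916 cases into ten roughly equal folds. It is a generic illustration in Python, not the caret implementation. The function name and random seed are hypothetical, and the tuning grid itself is not reproduced here.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield train, test

# Each case appears in exactly one test fold.
splits = list(k_fold_indices(916, k=10))
print(len(splits), sorted({len(test) for _, test in splits}))
```

In a grid search, each candidate parameter combination would be fit on every training split and scored on the held-out fold, and the combination with the best average score would be retained.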
Results
After tuning, the final PRE model identified 12 rules to predict HED at the three month follow-up, using ten fold cross-validation. The prediction model generated the 12 rules using six of the 21 baseline variables and treatment assignment (seven total predictors; Table 1) to predict degree of response to intervention, that is, number of HED days at the three month follow-up.
Table 1. Final Prediction Model: Baseline Features Predicting 3-Month Heavy Episodic Drinking.
LEGEND: AUDIT = Alcohol Use Disorder Identification Test; HED = heavy episodic drinking
AUDIT1 = Frequency of drinking (range: 0-4): 0 = never, 1 = monthly or less, 2 = 2 to 4 times/month, 3 = 2 to 3 times/week, 4 = 4+ times a week.
AUDIT2 = Quantity consumed on a typical drinking day (range: 0-5): 0 = 0 drinks, 1 = 1 to 2 drinks, 2 = 3 or 4 drinks, 3 = 5 or 6 drinks, 4 = 7 to 9 drinks, 5 = 10+ drinks.
AUDIT3 = Frequency of HED: 0 = never, 1 = less than monthly, 2 = monthly, 3 = weekly, 4 = daily or almost daily.
Negative Urgency subscale score (range: 4-16).
Friends = Perceived number of friends who get drunk weekly (range: 0-4): 0 = none, 1 = a few, 2 = some, 3 = most, 4 = all.
HED Risk = Perceived risk of HED-related harms to health (range: 0-3): 0 = no risk, 1 = slight risk, 2 = moderate risk, 3 = great risk.
N = 2 cases (0.2%) out of 916 were not classified by any of the above 12 prediction rules (number of rules met per case: range = 0-8; mean = 5.0, SD = 1.3).
Notes: Final prediction model ensemble with cross-validation (CV) error within 1 standard error (SE) of minimum; CV error type: Poisson deviance. Lambda = 0.21, mean CV error (SE) = 3.70 (0.20).
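The selection criterion in the note above can be sketched in two parts: a Poisson deviance function for scoring predictions, and a helper that picks the most heavily penalized model whose mean CV error is within 1 SE of the minimum. This is a generic illustration. The lambda values and CV errors in the example are hypothetical and are not the study's tuning results.

```python
import math

def poisson_deviance(y, mu):
    """Mean Poisson deviance between observed counts y and predicted means mu."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += 2.0 * (term - (yi - mi))
    return total / len(y)

def one_se_choice(lambdas, cv_mean, cv_se):
    """Return the largest lambda whose mean CV error is within 1 SE of the minimum."""
    best = min(range(len(cv_mean)), key=cv_mean.__getitem__)
    cutoff = cv_mean[best] + cv_se[best]
    return max(l for l, m in zip(lambdas, cv_mean) if m <= cutoff)

# Hypothetical penalty path and cross-validated errors.
lambdas = [0.01, 0.05, 0.10, 0.50]
cv_mean = [4.0, 3.5, 3.6, 4.2]
cv_se = [0.2, 0.2, 0.2, 0.2]
print(one_se_choice(lambdas, cv_mean, cv_se))
```

Choosing the larger penalty within 1 SE trades a small amount of predictive accuracy for a sparser, more interpretable set of rules.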
To help interpret the prediction rules, Poisson regression coefficients can be exponentiated. For example, the intercept indicates that individuals who do not meet the conditions of any rule are predicted to report exp(1.56) = 4.76 HED days at three month follow-up. As another example, patients assigned to treatment (vs control) who do not meet any other rule are predicted to report exp(1.56 - 0.26) = exp(1.30) = 3.67 HED days at three month follow-up.
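The exponentiation step can be sketched as follows, assuming a log-link Poisson model in which each rule contributes its coefficient when its conditions are met. The function name and argument layout are hypothetical. Only the intercept of 1.56 is taken from the text.

```python
import math

def predicted_count(intercept, rule_coefs=(), rules_met=()):
    """Predicted count from a Poisson (log-link) rule ensemble.

    rule_coefs: log-scale coefficients, one per rule.
    rules_met:  0/1 indicators for whether the case meets each rule.
    """
    eta = intercept + sum(b * r for b, r in zip(rule_coefs, rules_met))
    return math.exp(eta)

# Intercept quoted in the text, for a case that meets no rules.
print(round(predicted_count(1.56), 2))  # 4.76
```

On this scale, each rule that applies multiplies the predicted count by exp(coefficient), so a negative coefficient lowers the predicted number of HED days and a positive coefficient raises it.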
Results are not presented as a figure (“decision tree”) because the rules represent combinations that provide parsimonious classification. The rules are not necessarily hierarchical (ie, a representation of branching subsets).
The first rule, representing the largest association with HED outcome at three months, involved intervention status (vs control) (Table 1). The next two rules with the largest coefficients used AUDIT-C items (i.e., frequency of drinking, frequency of HED) in combination with either reporting more than one friend who gets drunk at least weekly or a relatively high negative urgency score (≥12), respectively.
Predictor importance analysis (Figure 1) indicated that, among the seven predictors used to generate the rules, AUDIT-C frequency of drinking had the largest estimated importance coefficient, followed by AUDIT-C HED frequency, and negative urgency subscale score. Other important predictors of three month HED included intervention status, perceived number of friends who get drunk weekly, AUDIT-C alcohol quantity consumed per occasion, and perceived HED-related harm. Only two cases (0.2%) were not classified by the 12 rules, indicating good coverage with a relatively small number of baseline predictors and intervention status (seven total predictors).
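The idea of rule coverage can be sketched by treating each rule as an if-then condition over baseline features and counting the cases that meet none of them. The two rules below are illustrative stand-ins, not the rules in Table 1. Only the negative urgency cutoff (≥12) comes from the text. The AUDIT item thresholds, the coding of "more than one friend," and the example cases are hypothetical.

```python
# Illustrative rules; thresholds other than neg_urgency >= 12 are hypothetical.
rules = {
    "freq_and_friends": lambda c: c["audit1"] >= 3 and c["friends"] >= 2,
    "hed_and_urgency": lambda c: c["audit3"] >= 2 and c["neg_urgency"] >= 12,
}

def rules_met(case):
    """Names of the rules whose conditions the case satisfies."""
    return [name for name, cond in rules.items() if cond(case)]

def uncovered_share(cases):
    """Fraction of cases that meet none of the rules."""
    return sum(1 for c in cases if not rules_met(c)) / len(cases)

# Three hypothetical participants.
cases = [
    {"audit1": 3, "audit3": 1, "friends": 3, "neg_urgency": 8},
    {"audit1": 2, "audit3": 3, "friends": 1, "neg_urgency": 13},
    {"audit1": 1, "audit3": 1, "friends": 0, "neg_urgency": 6},
]
print([rules_met(c) for c in cases])
print(uncovered_share(cases))

# Share of unclassified cases reported in the text: 2 of 916.
print(round(100 * 2 / 916, 1))  # 0.2
```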

Figure 1. Importance of baseline features as predictors of heavy episodic drinking at 3-month follow-up.
Discussion
Clinicians have a strong need for quick assessments that can be used to efficiently identify and refer high-risk young drinkers who are unlikely to seek or receive intervention. 12 This study begins to address this critical gap by generating empirically derived rules that use baseline characteristics to predict HED frequency, which can help to inform more effective personalized intervention referral. The prediction model identified 12 rules, culled from 21 possible characteristics, in addition to intervention status. The most important baseline features for predicting HED at three months included AUDIT-C items, negative urgency score, perceived peer drinking behavior, and perceived risk of possible HED-related health consequences. Even with a small set of six baseline features and treatment status (seven total), the rules covered almost all cases, with few cases (0.2%) not classified by any rule, supporting the PRE as an efficient and parsimonious prediction method.
Rules associated with lower frequency of HED at follow-up involved lower baseline AUDIT-C frequency of consumption and greater perceived HED-related harms to health. In contrast to the importance of pattern of alcohol consumption in predicting the three month HED outcome, baseline stage of change was not a key predictor, suggesting that motivational processes during the intervention, relative to baseline motivation and readiness to change, may be more important to the three month HED outcome.13,14
Higher frequency of HED at follow-up was associated with rules that involved baseline AUDIT-C frequency of HED and frequency of drinking, number of friends who get drunk at least weekly, and negative urgency score. These baseline characteristics suggest the importance of peer behavior among young adults, along with the need to better address the interpersonal and social context of HED.15,16 Results also suggest increasing support for coping responses in reaction to upsetting events to prevent and reduce HED that might occur “on the spur of the moment.”17,18 In this regard, prediction rules could inform development of more personalized intervention components that address baseline characteristics associated with HED at three month follow-up to improve outcomes for young adults who show limited response to intervention.
Study limitations warrant comment. Generalizability of results is limited to young adults recruited from EDs who had a phone with text messaging, a majority of whom self-identified as White. Analyses included participants with complete data at baseline and three month follow-up, with females (vs. males) more likely to complete follow-up. Measures were based on self-report. Although ten fold cross-validation was used with a relatively large sample, results warrant replication with a larger, more diverse sample. Analyses focused only on HED as the outcome, which has limitations. 19 In line with contemporary recovery definitions,20,21 other outcomes, such as improved quality of life and reduction in alcohol use disorder symptoms, represent important non-abstinent outcomes that would provide a holistic view of an individual’s functioning in daily life. These example outcomes represent important future directions for prediction models of non-abstinent recovery outcomes.
Conclusions
This exploratory study identified 12 rules based on a small set of six baseline variables and intervention status (seven total features) that predict HED days at three month follow-up. Future possible clinical application might involve computerized assessment of key baseline predictors of intervention effects on HED days, which include drinking pattern, negative urgency, perceived number of friends who get drunk weekly, and perceived risk of possible HED-related consequences. The rules balance prediction performance and ease of interpretation to estimate short-term HED outcome among young adults exposed to a text message intervention.
Supplemental Material
Supplemental material, sj-docx-1-saj-10.1177_29767342231206653 for Prediction Rules Identify Which Young Adults Have Higher Rates of Heavy Episodic Drinking After Exposure to 12-Week Text Message Interventions by Tammy Chung, Brian Suffoletto, Sarah W. Feldstein Ewing, Trishnee Bhurosy, Yanping Jiang and Pamela Valera in Substance Abuse
Footnotes
Author Contributions
TC and BS originated the project, obtained funding, and secured the data. TC drafted the initial manuscript and conducted the analyses. All authors (SFE, TB, YJ, PV) participated in interpreting the results, contributed to the writing of the manuscript, provided critical feedback on the manuscript, and approved the final manuscript draft for submission.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: NIAAA R01AA023650 and R21AA030153. The sponsor had no role in the development, review, or approval of the content of this manuscript.
Compliance, Ethical Standards, and Ethical Approval
Institutional Review Board approval was not required.
References
