The Opioid Abuse Risk Screener predicts aberrant same-day urine drug tests and 1-year controlled substance database checks: A brief report

Abstract

The Opioid Abuse Risk Screener was developed to support well-informed decision-making in opioid analgesic prescribing by extending the breadth of psychiatric risk factors evaluated relative to other non–clinician-administered measures. We examined the preliminary predictive validity of the Opioid Abuse Risk Screener relative to the widely used Screener and Opioid Assessment for Patients with Pain–Revised in predicting aberrant urine drug tests and controlled substance database checks. The Opioid Abuse Risk Screener is significantly different from the Screener and Opioid Assessment for Patients with Pain–Revised in predicting aberrant same-day urine drug tests (Z = 2.912, p = 0.0036) and controlled substance database checks within 1 year of assessment (Z = 3.731, p = 0.0002). Promising preliminary analyses using machine learning methods are also discussed.

Keywords

assessment, chronic pain, controlled substance, controlled substance database, machine learning, opioid abuse, prescription drug abuse, risk stratification, scale, urine drug testing, validation

Introduction

Prescription opioid drug abuse continues to be a major public health concern globally (Manchikanti et al., 2017). The American Society of Interventional Pain Physicians (ASIPP) recently published guidelines for responsible, safe, and effective prescription of opioids for individuals with chronic non-cancer pain (Manchikanti et al., 2017). Consistent with previously published recommendations and proposals (Chou et al., 2009a, 2009b, 2009c; Furlan et al., 2010; Graziotti and Goucke, 1997; Jovey et al., 2003; Kalso et al., 2003; US Food and Drug Administration, 2012), these 2017 guidelines detail initial steps that should be taken to ensure safe and effective opioid therapy including, but not limited to, the following: comprehensive assessment and documentation; screening for opioid abuse to identify opioid abusers or those at risk of abuse; establishing appropriate physical and psychological diagnosis if available; considering appropriate imaging, physical diagnosis, and psychological status to corroborate subjective complaints; and stratifying patients based on risk (Manchikanti et al., 2017).

As noted in our original report of the Opioid Abuse Risk Screener (OARS) development (Henrie-Barrus et al., 2016), there are several currently available measures designed to assess risk of misusing opioids (e.g. Butler et al., 2008, 2009; Passik et al., 2000; Webster and Webster, 2005). Although these measures provide useful information regarding potential risk of opioid abuse, they are limited in scope and do not evaluate many psychiatric variables, which seems an oversight given what the literature tells us about these risk factors for opioid abuse (Alford and Livingston, 2013; Ballantyne and Mao, 2003; Becker et al., 2008; Braden et al., 2009; Cicero et al., 2009; Edlund et al., 2007a, 2007b, 2010a, 2010b; Richardson et al., 2012; Seal et al., 2012; Sehgal et al., 2012; Sullivan et al., 2005, 2006). Furthermore, a recent report by the Centers for Disease Control and Prevention (CDC) highlights the critical importance of evaluating psychiatric variables as these both increase risk of prescription drug misuse and interfere with the resolution of pain, thus drawing out the need for continued pharmacologic intervention (Dowell et al., 2016). This has potential to create a vicious cycle in which the psychiatric variables increase risk and/or hinder pain resolution, and the ongoing pain exacerbates psychiatric symptoms, further increasing potential of abuse or other adverse events. This CDC report also noted concern regarding the accuracy and insufficiency of currently available risk assessment tools (Dowell et al., 2016) based, in part, on reports comparing risk screening methods in predicting discharge from opioid treatment with inconsistent results across and within measures (Jones et al., 2012; Moore et al., 2009).

The OARS (Henrie-Barrus et al., 2016) was developed in response to the need for a more comprehensive assessment tool that evaluates not only substance use history and aberrant behaviors, but also, importantly, psychiatric variables known to be relevant to opioid misuse and abuse (e.g. depressive and anxiety symptoms, exposure to trauma/posttraumatic stress disorder (PTSD), history of abuse/neglect, tobacco use, impulsivity, maladaptive coping, and endorsement of self-medicating behaviors). The original OARS manuscript documented preliminary yet promising evidence for effective risk stratification using a bifactor model (one general factor for opioid abuse risk comprises five domain-specific factors including anxiety, depression, traumatic stress, medical noncompliance, and substance use history) (Henrie-Barrus et al., 2016). Data collection is ongoing in support of further investigations regarding the psychometric properties of the OARS.

Given the urgent need for effective screening and risk stratification tools, the utility of additional preventative and monitoring procedures such as urine drug testing (UDT) (Christo et al., 2011; Owen et al., 2012) and controlled substance database (CSDB) checks (Manchikanti et al., 2017; Schwarz et al., 2016), and the mounting need to demonstrate medical necessity for these measures for insurance reimbursement purposes (Owen et al., 2012), our primary aim was to conduct a preliminary evaluation of the predictive validity of the OARS. Specifically, we wanted to examine the ability of the OARS to predict two real-world behavioral indicators of opioid risk: aberrant UDTs and aberrant CSDB checks. As an exploratory aim, we also compared the predictive validity of the OARS to that of the widely used Screener and Opioid Assessment for Patients with Pain–Revised (SOAPP-R) (Butler et al., 2008) in the same sample. Given its more comprehensive assessment of both emotional and behavioral factors, beyond substance use and aberrant behaviors, we hypothesized that the OARS would be significantly better than chance in predicting these real signals/outcomes. We also hypothesized that the OARS would have comparable, or superior, predictive power in identifying aberrant UDT and CSDB checks compared to the SOAPP-R due to the SOAPP-R’s lack of comprehensive psychiatric and behavioral risk variables. Finally, machine learning methods are growing in popularity in medical and psychiatric research due to their ability to learn from a data set to yield more consistent, robust, and insightful results (Acion et al., 2017; Ahn and Vassileva, 2016; Kalyanam et al., 2017; Karstoft et al., 2015a, 2015b; Kessler et al., 2015; Pan et al., 2017; Youyou et al., 2015). However, these methods often require very large data sets, so our second exploratory aim was to test the feasibility of using a machine learning algorithm to evaluate psychometric properties of the OARS in a small-to-moderate sample size, similar to traditional psychiatric research populations. This study was reviewed by the University of Utah Institutional Review Board.

Methods

Participants

Archival data were obtained from 612 patients who completed the OARS as part of routine clinical practice. All participants were included in previous analyses (Henrie-Barrus et al., 2016). Briefly, data were collected from patients presenting to a community-based pain management clinic in the Western United States. Patients’ age ranged from 18 to 85 years with a mean age of 44.5 years. About 54 percent of the present sample self-identified as female, 45 percent male, and 0.3 percent did not identify as either male or female. Patient data were excluded if they did not complete a UDT on the same day as OARS assessment (n = 250). Because we also wanted to compare the predictive power of the OARS and SOAPP-R, and in order to limit any potential systematic bias between those patients who completed the OARS versus those who completed both the OARS and the SOAPP-R, those patients who completed only one risk assessment were excluded (n = 87). Analysis of UDT was thus conducted upon a sample of 363 patients. For CSDB analyses, we further excluded any patients for whom no CSDB check was completed within 1 year of the OARS assessment (n = 193) leaving a sample of 169 patients for CSDB analysis. Demographics for the UDT sample are gender (female = 52%, male = 47%, other = 0.3%) and age (19–82, mean = 42.6 years). Demographics for the CSDB sample are gender (female = 53%, male = 47%) and age (19–70, mean = 42.5 years).

Clinical assessments

OARS

The OARS (Henrie-Barrus et al., 2016) is a 28-item self-report scale designed to evaluate risk of opioid misuse based on a relatively comprehensive item pool grounded in empirical evidence assessing biopsychosocial factors and aberrant behaviors. Items are rated on a 0 to 3 Likert-type scale with response anchors ranging from strongly disagree to strongly agree. The possible total score ranges from 0 to 84, and a higher score indicates elevated opioid abuse risk.

SOAPP-R

The SOAPP-R (Butler et al., 2008) is a 24-item conceptually and empirically derived self-report scale developed to evaluate opioid risk and aberrant medication-related behaviors. Items are rated 0 to 4, using the response anchors “never,” “seldom,” “sometimes,” “often,” and “very often.” Possible total score ranges from 0 to 96, and a higher score indicates greater probability or risk of aberrant medication-related behavior.
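The scoring ranges described for the two scales can be illustrated with a minimal sketch. It assumes the total is the unweighted sum of the item ratings, which is consistent with the stated ranges (28 × 3 = 84 and 24 × 4 = 96) but is not spelled out in the text; the `Scale` class and its field names are hypothetical, and any reverse-scored items are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    """Hypothetical container for a self-report scale (illustration only)."""
    name: str
    n_items: int
    max_item: int  # each item is rated 0..max_item

    @property
    def max_total(self) -> int:
        # Highest possible total if every item receives the top rating.
        return self.n_items * self.max_item

    def total(self, responses):
        """Sum the item ratings after checking item count and rating range."""
        if len(responses) != self.n_items:
            raise ValueError(f"{self.name} expects {self.n_items} items")
        if any(r not in range(self.max_item + 1) for r in responses):
            raise ValueError(f"{self.name} items must be integers 0..{self.max_item}")
        return sum(responses)


# Item counts and rating ranges as described in the text.
OARS = Scale("OARS", n_items=28, max_item=3)
SOAPP_R = Scale("SOAPP-R", n_items=24, max_item=4)
```

Because the two totals sit on different ranges, raw scores from the two instruments are not directly comparable, which is one reason a rank-based index such as the AUC is a natural basis for comparing them.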

UDT

Patients were asked to provide a urine sample for toxicology screening on the day of their intake visit. More specifically, they were asked to provide approximately 30–75 mL of urine in the clinic restroom, without supervision. They were asked to disclose any prescribed or non-prescribed substances that would likely appear in the testing results. We would expect some patients to have a UDT positive for opioids or other controlled substances, or perhaps to have a prescription given in an emergency room or by another physician. Additionally, an aberrant result could include the lack of a positive result for a substance the patient reported being prescribed and noted they were taking, as this could suggest behaviors such as diversion. Since not all positive UDTs would represent a risky or aberrant behavior, we defined aberrant UDT to mean any unexpected result. An aberrant result would include anything that was inconsistent with the expectations of the clinical team at the time of testing, based on patient report, prescribed medications, and any other data available. Rating of aberrant or non-aberrant was made, with strict adherence to this definition, by the clinic staff conducting the chart reviews.
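The "any unexpected result" definition can be pictured as a comparison of two sets: what the clinical team expected to see and what the test detected. The sketch below is only a simplification for illustration. In the study the rating was made by clinic staff during chart review, not by an automated rule, and the function name and substance labels here are hypothetical.

```python
def udt_is_aberrant(expected, detected):
    """Flag a urine drug test as aberrant when it differs from expectations.

    expected: substances the team expected (prescribed or disclosed).
    detected: substances found on the test.
    Returns (flag, unexpected_present, expected_absent).
    """
    expected = {s.lower() for s in expected}
    detected = {s.lower() for s in detected}
    unexpected_present = detected - expected  # e.g. undisclosed substance use
    expected_absent = expected - detected     # e.g. possible diversion
    flag = bool(unexpected_present or expected_absent)
    return flag, unexpected_present, expected_absent
```

For example, a patient who reports taking a prescribed opioid but tests negative for it would be flagged through `expected_absent`, matching the diversion scenario described above.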

CSDB checks

The CSDB collects data on dispensation of schedule II–V drugs from all known outlets, including retail, institutional, and outpatient hospital pharmacies, and in- and out-of-state mail order pharmacies. Authorized prescribers and other individuals can access this information to identify potential cases of misuse, drug over-utilization, and over-prescribing. Given that many patients may take the OARS or SOAPP-R as part of their intake visit for chronic pain before receiving their first prescription for opioid analgesics, we examined CSDB checks up to 1 year following the date of the OARS, SOAPP-R, and UDT administration. The definition of aberrant versus non-aberrant was similar to that employed for UDT checks. Results were marked as aberrant if they were inconsistent with expectations of the clinical team based on available data and patient interviews or if they are generally considered to signify risk of “doctor shopping” or other illegal behaviors (number of doctors, number and timing of prescriptions, etc.). Rating of aberrant or non-aberrant was made, with strict adherence to this definition, by the clinic staff conducting the chart reviews.

Data analysis

Receiver operating characteristic (ROC) curve analysis was used as an index of model performance, specifically the sensitivity and specificity of the OARS and SOAPP-R in predicting aberrant UDTs and CSDB checks. ROC curve analysis is a fundamental tool for evaluating the diagnostic performance of psychometric tests and screening assessments. The area under the curve (AUC) provides a measure of the ability of the test to correctly predict the binary classification of a subject. ROC curves with either OARS or SOAPP-R as a predictor variable were conducted using the DeLong method from the pROC package in R (DeLong et al., 1988).
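For readers unfamiliar with the DeLong comparison, the following is a minimal pure-Python sketch of the idea: each AUC is computed as a Mann-Whitney statistic, and the variance of the difference between two correlated AUCs is estimated from per-subject "structural components". It is not the pROC implementation used in the study; the function names and toy scores are hypothetical, and the sketch assumes the variance estimate is nonzero.

```python
import math


def _psi(x, y):
    # Mann-Whitney kernel: 1 if the case outscores the control, 0.5 on ties.
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _components(cases, controls):
    """Return the AUC and the DeLong structural components for one predictor."""
    m, n = len(cases), len(controls)
    v10 = [sum(_psi(x, y) for y in controls) / n for x in cases]
    v01 = [sum(_psi(x, y) for x in cases) / m for y in controls]
    return sum(v10) / m, v10, v01


def _cov(a, b):
    # Unbiased sample covariance of two equal-length lists.
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(labels, score_a, score_b):
    """Compare two AUCs measured on the same subjects (1 = aberrant, 0 = not)."""
    pos = [i for i, lab in enumerate(labels) if lab == 1]
    neg = [i for i, lab in enumerate(labels) if lab == 0]
    auc_a, v10a, v01a = _components([score_a[i] for i in pos], [score_a[i] for i in neg])
    auc_b, v10b, v01b = _components([score_b[i] for i in pos], [score_b[i] for i in neg])
    # Variance of (AUC_a - AUC_b), accounting for the correlation between curves.
    var = ((_cov(v10a, v10a) + _cov(v10b, v10b) - 2 * _cov(v10a, v10b)) / len(pos)
           + (_cov(v01a, v01a) + _cov(v01b, v01b) - 2 * _cov(v01a, v01b)) / len(neg))
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p


# Toy data (hypothetical): four aberrant and four non-aberrant patients.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scale_a = [9, 8, 7, 3, 6, 4, 2, 1]
scale_b = [5, 2, 6, 1, 4, 3, 7, 0]
auc_a, auc_b, z, p = delong_test(labels, scale_a, scale_b)
```

Because both scales are scored on the same patients, the covariance terms matter: ignoring them would treat the two AUCs as independent and would generally misstate the variance of their difference.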

Support vector machines (SVMs) are a supervised machine learning algorithm that can be used to create discriminant classifications from labeled training data (i.e. providing input data such as assessment scores and classification data such as aberrant or non-aberrant UDT). We selected SVM as our exploratory machine learning method because it gave the highest accuracy for our specific data set. We used a 10-fold cross-validation approach to assess the feasibility (stability and confidence of results) of and inform future use of machine learning techniques on data sets of similar size. Finally, for each SVM, feature selection was conducted to evaluate the most stable and predictive items using sequential forward search where α = 0.01.
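Sequential forward search can be sketched as a greedy loop that adds one item at a time and stops once no remaining item improves performance enough. The sketch below makes several assumptions that are not taken from the study: the scorer is a simple AUC of the unweighted sum of selected items rather than a cross-validated SVM, the `min_gain` stopping threshold is a hypothetical stand-in (the text does not specify how α = 0.01 entered the search), and the toy responses are invented.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sum_score_auc(items, rows, labels):
    # Stand-in scorer (assumption): AUC of the unweighted sum of the chosen items.
    return auc([sum(r[j] for j in items) for r in rows], labels)


def forward_select(rows, labels, score_fn, min_gain=0.01):
    """Greedy sequential forward search over item indices.

    rows: one list of item responses per patient.
    score_fn(items, rows, labels): performance of a candidate item subset.
    min_gain: hypothetical minimum improvement needed to keep adding items.
    """
    remaining = list(range(len(rows[0])))
    selected, best = [], 0.5  # start from chance-level AUC
    while remaining:
        # Score every candidate extension of the current subset.
        top_score, top_item = max(
            (score_fn(selected + [j], rows, labels), j) for j in remaining
        )
        if top_score - best < min_gain:
            break  # no candidate improves performance enough
        selected.append(top_item)
        remaining.remove(top_item)
        best = top_score
    return selected, best


# Toy responses (hypothetical): three items, three aberrant and three non-aberrant patients.
rows = [[3, 1, 2], [2, 0, 3], [3, 2, 0], [1, 1, 3], [0, 2, 1], [1, 0, 2]]
labels = [1, 1, 1, 0, 0, 0]
selected, best = forward_select(rows, labels, sum_score_auc)
```

In a setting like the one described, `score_fn` would instead return held-out performance of the classifier averaged over the cross-validation folds, so that items are retained only if they help on data not used for fitting.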

Results

Risk assessment predicts aberrant UDT and CSDB

Results of our primary analyses indicate that the OARS (AUC = 0.727) is significantly different (Z = 2.912, p = 0.0036) from the SOAPP-R (AUC = 0.628) as a predictor of aberrant same-day UDT. Aberrant CSDB checks within 1 year of assessment date were also evaluated, again demonstrating the OARS (AUC = 0.749) to be significantly different (Z = 3.731, p = 0.0002) from the SOAPP-R (AUC = 0.552) as a predictor (Figures 1 and 2).
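As a quick arithmetic check, the reported p-values match what a two-sided test on a standard normal Z statistic would give. The sketch below assumes the p-values are two-sided, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Z statistics reported for the two OARS versus SOAPP-R comparisons.
for outcome, z in [("UDT", 2.912), ("CSDB", 3.731)]:
    print(f"{outcome}: Z = {z:.3f}, p = {two_sided_p(z):.4f}")
```

Rounded to four decimals, these come out to 0.0036 and 0.0002, in line with the values reported above.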

Figure 1. Predictive validity relative to aberrant UDTs.

Figure 2. Predictive validity of identifying aberrant CSDB checks.

Exploring application of machine learning (SVM)

Applying an SVM algorithm to test all OARS items supports the primary results, with better than chance prediction of aberrant same-day UDT (AUC = 0.626; n = 363) but not of aberrant CSDB checks within 1 year of assessment date in a smaller sample (AUC = 0.422; n = 169). Two items (stress load and self-medicating behaviors) were selected as being the most stable and predictive of aberrant same-day UDT (AUC = 0.636; n = 363), while two different items (hopelessness and traumatic nightmares) were selected as most stable and predictive of aberrant CSDB checks within 1 year of assessment date (AUC = 0.590; n = 169).

Discussion

The traditional AUCs reported above are significantly greater than random predictions and provide the first evidence of the OARS’ predictive validity. The fact that the AUCs are not higher for both the OARS and the SOAPP-R is not particularly surprising, given that neither assessment was designed to capture the same information as a UDT or a CSDB check. Our results suggest, at least in this particular pain population, that the OARS is significantly more predictive than the SOAPP-R with regard to both aberrant UDT and aberrant CSDB checks. A cross-validation study of the SOAPP-R previously reported an AUC of 0.74 relative to UDTs (Butler et al., 2009). It is likely our results diverge due to differences in patient characteristics and demographics, including geographical recruitment area, score distributions, or possible differences in the analytical intent and the definition of “aberrant” used to prepare the data for analyses. Although any interpretation of preliminary data should be treated with caution, these results suggest that the OARS may have sufficient predictive power to document medical necessity for UDT in some patients and may help clinicians identify patients for whom it is particularly critical to check CSDB reports for red flags. These data also indicate that the OARS may have superior predictive validity to the SOAPP-R with regard to these measures. It is possible the additional items focused on psychiatric risk factors may contribute to this increased predictive power.

The SVM analyses, although very limited by our small sample size (by machine learning standards), were quite informative with regard to feasibility. The stability and accuracy of the results, as well as the varying level of consistency with traditional AUCs using the ROC method, suggest that a sample size of 363 is bordering upon, but not quite adequate for, these particular scale validation methods, and that a sample of 169 is substantially underpowered to make any trustworthy classifications. When considering the feature selection conducted for both the UDT and CSDB analyses, it is interesting to take note of the item content for the most stable and predictive items. When predicting aberrant same-day UDT, a single emotional item (high stress load) and a single behavioral item (misuse of medications to alleviate emotional distress) were selected and (even if with low confidence) were nearly as predictive on their own as the SOAPP-R using the traditional ROC method. In the smaller CSDB sample, two emotional symptoms (hopelessness and traumatic nightmares) were most stable and predictive, with an AUC notably larger than that of the whole scale and, again, similar to that of the SOAPP-R using traditional methods. While the sample size was too small for the CSDB SVM results to be taken too seriously, it is again an interesting pattern that emotional items are so strongly predictive of two gold standard real-world risk factors. These findings in particular may lend additional support to the CDC’s urging for clinicians to evaluate psychological variables in a more comprehensive manner.

Informing our own future studies, and those of other investigators, our SVM experiment suggests that machine learning classification of risk may be feasible without requiring thousands of medical records to stand by the results, but that several hundred to a thousand may be required, depending on the complexity and design of the scale. Future studies of the OARS should further evaluate psychometric properties and external validity, evaluate more diverse patient populations to improve generalizability, and may benefit from using SVM or other machine learning methods to continue to optimize the assessment.

There are notable limitations for these results, including the relatively small sample size (especially with regard to the machine learning analyses) and the limited diversity in the sample. Given the urgent need for additional methods to aid in well-informed decision-making regarding opioid analgesic prescribing practices, we felt it pertinent to provide a brief update regarding these interim results while we prepare the larger data set for additional analyses.

Footnotes

Acknowledgements

The authors are very grateful to the patients whose data were used in this investigation and to the clinic staff who completed medical chart reviews and compiled the data regarding demographic information, OARS and SOAPP-R scores, and UDT and CSDB results. We are also thankful to the medical clinics and physicians who administer the OARS in their clinical practice and are willing to share these data with us to further extend our validation and reliability work for this measure. L.A.A. and C.L.A. contributed equally to this work.

Declaration of conflicting interests

The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Three authors (LAA, CLA, and PHB) are affiliated with InteraSolutions, a startup psychological assessment company that intends to market the OARS as a commercial product. IRB review was obtained and none of the authors had any contact with any study participants as all data used is archival in nature only. These authors have no additional potential conflicts and all other authors deny any potential conflicts.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Acion, Kelmansky, van der Laan, et al. (2017) Use of a machine learning framework to predict substance use disorder treatment success. PLoS ONE 12: e0175383.

Ahn, Vassileva (2016) Machine-learning identifies substance-specific behavioral markers for opiate and stimulant dependence. Drug and Alcohol Dependence 161: 247–257.

Alford, Livingston (2013) JAMA patient page. Misuse of opioid medication. Journal of the American Medical Association 309: 2055.

Ballantyne, Mao (2003) Opioid therapy for chronic pain. The New England Journal of Medicine 349: 1943–1953.

Becker, Sullivan, Tetrault, et al. (2008) Non-medical use, abuse and dependence on prescription opioids among U.S. adults: Psychiatric, medical and substance use correlates. Drug and Alcohol Dependence 94: 38–47.

Braden, Sullivan, Ray, et al. (2009) Trends in long-term opioid therapy for noncancer pain among persons with a history of depression. General Hospital Psychiatry 31: 564–570.

Butler, Budman, Fernandez, et al. (2009) Cross-validation of a screener to predict opioid misuse in chronic pain patients (SOAPP-R). Journal of Addiction Medicine 3: 66–73.

Butler, Fernandez, Benoit, et al. (2008) Validation of the revised Screener and Opioid Assessment for Patients with Pain (SOAPP-R). The Journal of Pain 9: 360–372.

Chou, Ballantyne, Fanciullo, et al. (2009a) Research gaps on use of opioids for chronic noncancer pain: Findings from a review of the evidence for an American Pain Society and American Academy of Pain Medicine clinical practice guideline. The Journal of Pain 10: 147–159.

Chou, Fanciullo, Fine, et al. (2009b) Clinical guidelines for the use of chronic opioid therapy in chronic noncancer pain. The Journal of Pain 10: 113–130.

Chou, Fanciullo, Fine, et al. (2009c) Opioids for chronic noncancer pain: Prediction and identification of aberrant drug-related behaviors: A review of the evidence for an American Pain Society and American Academy of Pain Medicine clinical practice guideline. The Journal of Pain 10: 131–146.

Christo, Manchikanti, Ruan, et al. (2011) Urine drug testing in chronic pain. Pain Physician 14: 123–143.

Cicero, Wong, Tian, et al. (2009) Co-morbidity and utilization of medical services by pain patients receiving opioid medications: Data from an insurance claims database. Pain 144: 20–27.

DeLong, DeLong, Clarke-Pearson (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics 44: 837–845.

Dowell, Haegerich, Chou (2016) CDC guideline for prescribing opioids for chronic pain—United States, 2016. MMWR Recommendations and Reports 65: 1–49.

Edlund, Martin, Devries, et al. (2010a) Trends in use of opioids for chronic noncancer pain among individuals with mental health and substance use disorders: The TROUP study. The Clinical Journal of Pain 26: 1–8.

Edlund, Martin, Fan, et al. (2010b) Risks for opioid abuse and dependence among recipients of chronic opioid therapy: Results from the TROUP study. Drug and Alcohol Dependence 112: 90–98.

Edlund, Steffick, Hudson, et al. (2007a) Risk factors for clinically recognized opioid abuse and dependence among veterans using opioids for chronic non-cancer pain. Pain 129: 355–362.

Edlund, Sullivan, Steffick, et al. (2007b) Do users of regularly prescribed opioids have higher rates of substance use problems than nonusers? Pain Medicine 8: 647–656.

Furlan, Reardon, Weppler (2010) Opioids for chronic noncancer pain: A new Canadian practice guideline. Canadian Medical Association Journal 182: 923–930.

Graziotti, Goucke (1997) The use of oral opioids in patients with chronic non-cancer pain. Management strategies. The Medical Journal of Australia 167: 30–34.

Henrie-Barrus, Averill, Sudweeks, et al. (2016) Development and preliminary validation of the Opioid Abuse Risk Screener. Health Psychology Open 3: 2055102916648995.

Jones, Moore, Levy, et al. (2012) A comparison of various risk screening methods in predicting discharge from opioid treatment. The Clinical Journal of Pain 28: 93–100.

Jovey, Ennis, Gardner-Nix, et al. (2003) Use of opioid analgesics for the treatment of chronic noncancer pain—A consensus statement and guidelines from the Canadian Pain Society, 2002. Pain Research & Management 8(Suppl. A): 3a–28a.

Kalso, Allan, Dellemijn, et al. (2003) Recommendations for using opioids in chronic non-cancer pain. European Journal of Pain 7: 381–386.

Kalyanam, Katsuki, Lanckriet GRG, et al. (2017) Exploring trends of nonmedical use of prescription drugs and polydrug abuse in the Twittersphere using unsupervised machine learning. Addictive Behaviors 65: 289–295.

Karstoft, Galatzer-Levy, Statnikov, et al. (2015a) Bridging a translational gap: Using machine learning to improve the prediction of PTSD. BMC Psychiatry 15: 30.

Karstoft, Statnikov, Andersen, et al. (2015b) Early identification of posttraumatic stress following military deployment: Application of machine learning methods to a prospective study of Danish soldiers. Journal of Affective Disorders 184: 170–175.

Kessler, Warner, Ivany, et al. (2015) Predicting suicides after psychiatric hospitalization in US army soldiers: The Army Study to Assess Risk and Resilience in Servicemembers (Army STARRS). JAMA Psychiatry 72: 49–57.

Manchikanti, Kaye, Knezevic, et al. (2017) Responsible, safe, and effective prescription of opioids for chronic non-cancer pain: American Society of Interventional Pain Physicians (ASIPP) guidelines. Pain Physician 20: S3–S92.

Moore, Jones, Browder, et al. (2009) A comparison of common screening methods for predicting aberrant drug-related behavior among patients receiving opioids for chronic pain management. Pain Medicine 10: 1426–1433.

Owen, Burton, Schade, et al. (2012) Urine drug testing: Current recommendations and best practices. Pain Physician 15: ES119–133.

Pan, Liu, Metsch, et al. (2017) Factors associated with HIV testing among participants from substance use disorder treatment programs in the US: A machine learning approach. AIDS and Behavior 21: 534–546.

Passik, Kirsh, McDonald, et al. (2000) A pilot survey of aberrant drug-taking attitudes and behaviors in samples of cancer and AIDS patients. Journal of Pain and Symptom Management 19: 274–286.

Richardson, Russo, Katon, et al. (2012) Mental health disorders and long-term opioid use among adolescents and young adults with chronic pain. The Journal of Adolescent Health 50: 553–558.

Schwarz, George, Bluth (2016) Toxicology in pain management. Clinics in Laboratory Medicine 36: 673–684.

Seal, Shi, Cohen, et al. (2012) Association of mental health disorders with prescription opioids and high-risk opioid use in US veterans of Iraq and Afghanistan. Journal of the American Medical Association 307: 940–947.

Sehgal, Manchikanti, Smith (2012) Prescription opioid abuse in chronic pain: A review of opioid abuse predictors and strategies to curb opioid abuse. Pain Physician 15: ES67–92.

Sullivan, Edlund, Steffick, et al. (2005) Regular use of prescribed opioids: Association with common psychiatric disorders. Pain 119: 95–103.

Sullivan, Edlund, Zhang, et al. (2006) Association between mental health disorders, problem drug use, and regular prescription opioid use. Archives of Internal Medicine 166: 2087–2093.

US Food and Drug Administration (2012) Extended-Release (ER) and Long-Acting (LA) Opioid Analgesics Risk Evaluation and Mitigation Strategy (REMS). Silver Spring, MD: US Food and Drug Administration.

Webster, Webster (2005) Predicting aberrant behaviors in opioid-treated patients: Preliminary validation of the opioid risk tool. Pain Medicine 6: 432–442.

Youyou, Kosinski, Stillwell (2015) Computer-based personality judgments are more accurate than those made by humans. Proceedings of the National Academy of Sciences of the United States of America 112: 1036–1040.