Abstract
Introduction:
Abuse potential assessment of drugs with central nervous system activity is a critical component of the development of new medications for regulatory filings for approval worldwide. These new drug applications typically include recommendations for scheduling status as a controlled substance, if scheduling is warranted. This commentary focuses on the approach and current issues related to scheduling in the United States (US) Controlled Substances Act (CSA); however, scheduling more globally generally involves similar approaches and issues. Although scheduling is intended to protect public health and safety by adequate restrictive control based on the abuse potential assessment, it is also intended to serve medicinal use and access and to incentivize the development of new medicines with reduced abuse-related risks. Thus, overly restrictive scheduling can present risks to public health, as does under-scheduling.
Objective:
Identify the categories of substances and new drugs that appear most challenging for reliable and valid abuse potential assessment using the widely used approaches recommended in the U.S. Food and Drug Administration (FDA) 2017 Guidance, which are also generally applied globally.
Methodology:
A panel of abuse potential assessment experts convened at a scientific meeting that included many leading abuse potential and regulatory experts, namely the International Study Group Investigating Drugs as Reinforcers. The panel focused on novel and psychedelic substances and cannabinoids that raise challenges for accurate abuse potential assessment, including drugs for which postmarketing evaluations by the FDA suggested that premarket assessments had overestimated abuse potential and resulted in overly restrictive scheduling.
Results:
There is agreement that many novel acting substances, including medications for treating disorders related to anxiety, sleep, depression, and pain, and including orexin receptor acting substances, psychedelics, and diverse cannabinoids, may require modifications of existing methods and alternative approaches to more fully and accurately characterize their abuse potential and guide CSA scheduling. This includes human abuse potential (HAP) assessment, which can raise challenges for the identification of the most appropriate placebo and positive comparators, study populations, and protocols to ensure both safety and scientific reliability.
Conclusions:
The core abuse potential methods that serve U.S. CSA scheduling and global new drug scheduling are generally reliable for many categories of drugs but need to be modified and perhaps supplemented with additional outcome measures to more fully and accurately characterize potential abuse-related risks. This may include behavioral economic assessments in preclinical and clinical studies and a broader range of outcome measures in HAP studies and adverse event assessment in all clinical studies in drug development programs.
Introduction
Abuse potential assessment during drug development is a component of the safety evaluation required by the United States (U.S.) Food and Drug Administration (FDA) for drugs with central nervous system (CNS) activity (U.S. FDA, 2017). This assessment informs the FDA’s recommendations to the Drug Enforcement Administration (DEA) for scheduling 1 in the U.S. Controlled Substances Act (U.S. CSA), as well as international scheduling (U.S. FDA, 2017; United Nations Office on Drugs and Crime, 2020). For new chemical entities, the abuse potential assessment informs the FDA’s determination as to whether CSA control is warranted and, if so, what schedule it will recommend. This commentary summarizes some of the main points and discussion from a panel at the 2024 International Study Group Investigating Drugs as Reinforcers (ISGIDAR) that addressed emerging issues and thinking about how animal and human abuse potential (HAP) assessment approaches might be improved to more reliably characterize the potential risks of recreational use and misuse and guide scheduling recommendations with a focus on novel CNS acting substances and psychedelics.
Regulatory considerations in abuse potential assessment and CSA scheduling. If a new drug is derived from opium alkaloids, it is in Schedule II during development; then, if approved by FDA, the approval must include the FDA’s recommendation for potential rescheduling or removal from control largely based on abuse potential assessment during development. 2 For new drugs containing a substance already placed in Schedule I (e.g., many cannabinoids and psychedelics), FDA approval of a drug product containing a Schedule I substance must be accompanied by a recommendation to DEA as to whether continued control is warranted, and if so, a recommendation for rescheduling (Henningfield et al., 2022a). 3 FDA’s scheduling recommendation and DEA’s final scheduling action must be guided by each organization’s evaluation of eight factors of the CSA (commonly referred to as an eight-factor analysis (8FA); U.S. Code of Federal Regulations, 2024a).
As summarized by the FDA and the CSA, the level of the restrictiveness or “control” is related to the level of abuse-related risks based on the 8FA and other factors: “Drugs or other substances with a high abuse potential, no currently accepted medical use, and a lack of accepted safety for use under medical supervision are controlled in Schedule I. Drugs or other substances with abuse potential that do have a currently accepted medical use (e.g., the drug or substance is in an FDA-approved product) are placed into Schedule II, III, IV, or V. The specific placement of a drug or other substance within Schedules II–V is determined by the relative abuse potential of the drug or substance and the relative degree to which it induces psychological or physical dependence” (21 U.S.C. 812(b); U.S. FDA, 2017: 1). Table 1 lists the eight factors determinative of CSA control.
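Under 21 U.S.C. 811(b) of the CSA, the medical and scientific analysis considers the following eight factors determinative of control of the drug under the CSA (21 U.S.C. 811(c); from U.S. FDA, 2017):
1. Its actual or relative potential for abuse.
2. Scientific evidence of its pharmacological effect, if known.
3. The state of current scientific knowledge regarding the drug or other substance.
4. Its history and current pattern of abuse.
5. The scope, duration, and significance of abuse.
6. What, if any, risk there is to the public health.
7. Its psychic or physiological dependence liability.
8. Whether the substance is an immediate precursor of a substance already controlled under the CSA.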
The most restrictive schedule for drugs approved for therapeutic use, including those designated as having a currently accepted medical use (CAMU) regardless of FDA approval 4, is Schedule II, which includes the strongest barriers to distribution, access, and prescribing as well as risk management programs, whereas drugs placed in Schedules III, IV, and V generally have fewer restrictions and are more accessible to patients (U.S. Code of Federal Regulations, 2024a, 2024b, 2024c).
A broad range of studies and data collected during drug development, and other potentially relevant data, inform the FDA’s 8FA. For example, nonclinical and clinical studies, including adverse event (AE) assessment, help understand how the overall pharmacology, safety, and abuse potential of the drug product compares with previously scheduled drugs. These findings are presented in factors 1, 2, 3, and 7 of the 8FA.
Factors 4, 5, and 6 address whether the new drug presents a known or imminent risk to public health, and if so, the nature, scope, and severity of the risk, often in the context of potential benefits (U.S. FDA, 2017; Henningfield et al., 2022a, 2023; U.S. Code of Federal Regulations, 2024a). 5 For novel acting new chemical entities, where little or no real-world experience outside of the drug development program is available, the analysis of factors 4, 5, and 6 may be quite limited; however, it is sometimes relevant to compare the expected abuse-related risks and benefits to those of previously scheduled drugs that have been approved for similar indications as the new drug. This comparison may inform the determination of how scheduling and other aspects of labeling might influence prescribing of the new drug relative to prior drugs. Thus, for drugs with similar potential benefits but expected lower risks of recreational use, misuse, and abuse, it may be considered in the interests of public safety to recommend less restrictive scheduling to increase the likelihood that prescribers will consider the new drug as an alternative to legacy drugs with greater abuse potential.
Impact of the CSA on abuse potential assessment methods. Evidence-based abuse potential assessments began early in the 20th century with a strong emphasis on animal models of physiological dependence and withdrawal in search of “less addictive” analgesics than morphine-type opioids (Jasinski, 1977; Martin, 1973, 1977; May and Jacobson, 1989).
The 1970 passage of the CSA and its requirements for evidence-guided scheduling incentivized researchers who were already developing opioid-focused abuse potential assessment methods to extend these models to the broad range of substances in need of evaluation, including stimulants, sedatives, and other CNS active substances (e.g., Jasinski et al., 1967; Thompson and Schuster, 1968; Thompson and Unna, 1977). By the 1970s, abuse potential assessments focused increasingly on self-reported subjective effects in human studies and on models for assessing reinforcing and interoceptive effects in animals and sometimes humans (e.g., intravenous self-administration (IVSA) and drug discrimination (DD); Fischmann and Mello, 1989; Jasinski, 1977; Jasinski et al., 1984). The methods have continued to evolve, as summarized in a special meeting convened by the College on Problems of Drug Dependence (CPDD) in 2002 (Expert Panel, 2003; Schuster and Henningfield, 2003), which contributed to the development of FDA’s first formal abuse potential assessment guidance, released in draft form in 2010 and finalized in 2017 (U.S. FDA, 2010, 2017). 6
The CSA incentivized the development of lower abuse potential pharmaceuticals that might be approved without scheduling recommendations or at least less restrictive scheduling than legacy drugs in the same therapeutic category. This increased the importance of abuse potential assessment methods that are reliable in differentiating the abuse potential of a test drug relative to previously scheduled drugs, analogous to comparing a potential new drug to previously approved treatments based on their relative therapeutic efficacy and overall safety.
Thus, scheduling goals are often part of the pharmaceutical developer’s target product profile, with a frequent goal being no scheduling, or scheduling that is less restrictive than already marketed products in the same therapeutic category. Indeed, since the 1970s, many new chemical entities for pain, epilepsy, sleep, wakefulness, and attention-deficit/hyperactivity disorder are lower in abuse potential than traditional drugs in these categories (Calderon et al., 2020; Caro et al., 2022; U.S. Department of Justice, DEA, 2019).
FDA abuse potential assessment requirements for drugs in development. In accordance with FDA’s 2017 guidance, sponsors of new CNS active drugs must include a summary of their abuse potential assessment studies and scheduling recommendation in their New Drug Application (NDA). 7 Additionally, the FDA has been increasingly likely to recommend abuse potential assessment studies for new drugs with brain penetrance but unknown CNS activity to determine if those new drugs produce behavioral effects that might indicate abuse potential. Initial findings may inform the need for a more extensive assessment.
FDA’s 2017 guidance takes a step-by-step approach to determine the level and nature of CNS activity in which findings from each study inform whether dedicated abuse potential studies will be recommended. Thus, chemistry studies and determination of the ability of the drug to cross the blood-brain barrier inform brain penetrance and the potential for CNS activity, and this is augmented by data from cellular models profiling binding and effects at brain receptors, transporters, and enzymes in comparison with known substances of abuse. One of the limitations of such data is that drugs with little apparent CNS activity in vitro might be pro-drugs of substances that emerge in vivo. Thus, studies of drug-elicited behaviors (e.g., Irwin test or Functional Observational Batteries) that are routinely used to assess dose-related safety can inform whether the drug has general stimulant, depressant, and intoxicating effects, head twitch effects that predict potential serotonergically mediated psychedelic effects, and/or cannabinoid-elicited reductions in core body temperature.
All of the foregoing findings taken together influence the likelihood that FDA will recommend dedicated animal abuse-related studies such as IVSA and DD, and the results of such animal studies will contribute to the determination of whether dedicated HAP studies will be recommended. Regardless of that latter determination, all clinical studies must assess potential abuse-related AEs as per FDA’s 2017 guidance, recommendations provided at the 2023 Cross Company Abuse Liability Council (CCALC) meeting (CCALC, 2023), and the most recent thinking of FDA that sponsors routinely obtain in their early clinical development meetings with FDA staff.
At least until the early 21st century, scheduling was generally predictable based largely on animal and HAP assessments addressing discriminative and reinforcing effects in animals, and subjective and reinforcing effects in humans. Because many of these drugs had generally similar mechanisms of action and effects as previously studied and scheduled opioids, stimulants, and/or sedatives, the new drugs could be compared to already approved and scheduled drugs in dedicated abuse potential assessment studies, including IVSA and DD in animals, as well as HAP studies focused on subjective effects assessments (Ator and Griffiths, 2003; Carter and Griffiths, 2009; Expert Panel, 2003; Griffiths et al., 2003; U.S. FDA, 2017).
With respect to comparator drugs, FDA states that for self-administration studies, “any drug of abuse that is scheduled under the CSA may serve as the training drug for the self-administration study” provided that it produces self-administration at rates significantly higher than placebo (U.S. FDA, 2017: 18). In practice, cocaine, amphetamine, or morphine typically serves as the comparator drug, despite the FDA indicating that “The ideal positive control drug is in the same pharmacological class as the test drug” (U.S. FDA, 2017: 18).
For DD studies, drugs with novel mechanisms of action often raise more questions about what comparators are most appropriate. This is especially true in cases where the test drugs under development are expected to be of lower abuse potential relative to the drug used for the same indication due, in part, to a novel mechanism of action, such as reduced/low activity at the neuronal targets of most prototypic drugs of abuse. As the FDA states, “DD is dependent on the mechanism of action, so only those test drugs that have pharmacological activity similar to that of the training drug will likely produce a significant degree of generalization to the training drug. Thus, if the test drug has a novel mechanism of action, sponsors may prefer to propose an alternative approach for identifying an appropriate training drug.” (U.S. FDA, 2017: 19). However, FDA’s guidance offers no further advice regarding selection of an “alternative approach,” and this is often a point of active discussion with FDA which sometimes results in more than one comparator being recommended for study.
For instance, when cannabidiol (CBD), the primary active ingredient in Epidiolex, was submitted to the FDA seeking approval for the treatment of seizures associated with Lennox–Gastaut syndrome and Dravet syndrome, cocaine was used as a training drug and cocaine and amphetamine both served as comparator drugs for the preclinical rat IVSA studies that were conducted (U.S. FDA, 2018). For the HAP studies, dronabinol (a synthetic form of delta-9-tetrahydrocannabinol (THC)) and alprazolam were used as comparators (Schoedel et al., 2018; U.S. FDA, 2018). More recently, to better understand how various cannabinoids and terpenes in cannabis modulate the discriminative effects of THC, eight cannabinoids and terpenes were evaluated in a THC DD procedure to determine their impact on THC discrimination (Moore et al., 2023).
Botanicals with CNS effects (e.g., caffeinated products, kava, and kratom) can generally be legally marketed unless explicitly banned for safety reasons or placed in Schedule I of the CSA due to unacceptably high abuse potential (Congressional Research Service, 2021; Henningfield et al., 2022a; Richardson et al., 2007; U.S. National Center for Complementary and Integrative Health, 2025). 8
Regarding kratom, a study of the discriminative effects of mitragynine, the most abundant and apparently most important contributor to kratom use, compared it to eleven substances, including opioids, cannabinoids, and alpha adrenergic antagonists, and concluded that the nonscheduled over-the-counter lofexidine and phenylephrine were most strongly discriminated by rats trained to respond to mitragynine (Reeve et al., 2020). It is important to note, however, that coffee beans, kava, and kratom, like many other botanicals, contain numerous alkaloids that might contribute to their effects, and these may have widely differing types and levels of effects, depending in part on the amounts contained in the product and absorbed. Thus, it can be challenging and may require multiple studies to understand the abuse liability of the botanical in its various forms, along with the potential contribution of metabolites emerging following consumption.
The HAP study has long been considered by the FDA and many experts to be the most predictive type of premarket abuse potential study, and HAP studies are generally the most heavily weighted studies in scheduling recommendations (Calderon et al., 2020; Caro et al., 2022; Carter and Griffiths, 2009; Expert Panel, 2003; Griffiths et al., 1980; Griffiths et al., 2003; Moline et al., 2023; U.S. FDA, 2017). These studies are often described as the “gold standard” of human abuse liability testing (Carter and Griffiths, 2009: 1). Key factors in HAP studies that can influence outcomes include the selection of the research participants, the outcome measures employed, the selection of comparator substances, and the doses of the test drug and comparator, as discussed in the FDA’s 2017 guidance and elsewhere (Carter et al., 2009; Griffiths et al., 2003; Henningfield et al., 2022a). The importance of such factors for novel acting pharmaceuticals such as the dual orexin receptor antagonists (DORAs), cannabinoids, kratom, and psychedelics has been discussed elsewhere (Cooper et al., 2018; Cooper and Abrams, 2019; Henningfield et al., 2022a, 2022b; Pabon and Cooper, 2023; Reissig et al., 2024; Schindler et al., 2020; Spindle et al., 2020).
Over the past 2 decades, however, HAP studies have increasingly emerged as a discrepant model that, according to the FDA’s Controlled Substance Staff, overestimates abuse potential and risk of post-marketing abuse, at least for some novel acting new drugs (Calderon et al., 2020; Caro et al., 2022). In the FDA’s 2020 evaluation, Calderon et al. (2020) reported that four drugs (ezogabine, brivaracetam, lacosamide, and suvorexant) were recommended for CSA scheduling by FDA to the DEA, primarily on the basis of HAP findings in which the new drug produced similar effects on the primary outcome measure (“drug liking”) as the positive comparator drug, despite the fact that there was “a lack of a positive signal indicative of abuse from animal behavioral studies.” Although Calderon et al. noted that there appeared to be little evidence of post-market abuse and recreational use, they concluded that a thorough post-marketing evaluation was needed to better understand the post-marketing rates of abuse and clinical significance of the HAP findings. This was done for the FDA’s 2022 presentation, in which the same drugs, as well as two additional DORAs, daridorexant and lemborexant, were examined.
The 2022 FDA evaluation presented by Caro et al. (2022) confirmed the results of the 2020 evaluation with a more thorough post-marketing evaluation of abuse rates and concluded, “Overall, HAP studies are highly sensitive in detecting important differences between drug effects and predicting the likelihood of drug abuse in people who recreationally use drugs. Although the current preclinical and clinical methodology for assessing abuse liability has demonstrated high predictive validity, there is a discrepancy between post-marketing indicators of abuse and data from HAP studies.” Caro et al. included the following on their poster: “Regarding the HAP study methodology itself, as new drugs with new mechanisms of action and lower abuse potential are being developed, other study approaches (e.g., behavioral economics, money vs drug choice) may be worth further investigations to determine if they may provide more predictive data determinative of actual abuse post-marketing.”
Focusing on one of these drugs, lemborexant, and evaluating a broader range of post-marketing surveillance systems, Moline et al. (2023) came to similar conclusions. In short, among all data evaluated, the HAP study data appeared “discrepant” and overestimated real-world recreational use. A limitation of animal IVSA studies was discussed by Calderon et al. (2023), who stated that they “may not necessarily have been rigorously designed to assess the safety or the abuse potential of the drug for regulatory purposes” (p. 3). 9
Special challenges of psychedelic drugs in abuse potential assessment
The definition of psychedelic drugs varies, and the FDA’s 2023 draft guidance, Psychedelic Drugs: Considerations for Clinical Investigations, provides no definition but does provide examples, including lysergic acid diethylamide (LSD) and psilocybin (U.S. FDA, 2023). For the purpose of this drug development and regulatory-focused commentary, we are therefore being consistent with the FDA’s Controlled Substance Staff’s 2023 review addressing psychedelic abuse potential assessment, in which it is stated that “classic psychedelics are defined as drugs known for their ability to alter perception, cognition, and mood that, despite having a complex pharmacology, are thought to mediate their psychedelic effects primarily via the activation of the serotonin 5HT2A receptors. . . [and] include but are not limited to LSD, psilocybin, psilocin (the active metabolite of psilocybin), mescaline, and N,N-dimethyltryptamine, and are further defined by the fact that they are all currently controlled in Schedule I of the Federal CSA.” (Calderon et al., 2023: 1). Note that from a regulatory perspective, although 3,4-methylenedioxymethamphetamine (MDMA) seems to be viewed as a classic psychedelic by the FDA, MDMA is an amphetamine analog that shares key clinical effects with both amphetamine-like stimulants and LSD-like psychedelics and is referred to by some as representative of a novel class of substances termed entactogens (Henningfield et al., 2022a; Nichols, 2022).
Although the reliability of animal abuse potential assessment models was questioned by some in the 1980s (Griffiths et al., 1980), more recent studies and reviews suggest promise for the evaluation and differentiation of the reinforcing effects of both classic psychedelics and entactogens, as discussed elsewhere (Fantegrossi et al., 2008; Goodwin, 2016; Heal et al., 2022; Henningfield et al., 2022a, 2023). As discussed in this commentary and elsewhere (e.g., Henningfield et al., 2023), however, the development of classic psychedelics with robust hallucinogenic activity for some people within the range of therapeutic doses, as captured by the FDA’s definition above, raises challenges for HAP assessment of psychedelics and novel acting substances such as the DORAs and anticonvulsants discussed in the FDA’s 2020 and 2022 evaluations (Calderon et al., 2020, 2022; Caro et al., 2022; Henningfield et al., 2022a, 2023; Moline et al., 2023).
Specifically, the HAP study approach recommended by the U.S. FDA (2017) may also be problematic for the evaluation of psychedelic substances due to both safety and scientific validity concerns. For example, protocol elements addressing the risks of testing supratherapeutic doses, participant preparation to prevent serious AEs, and other safety restrictions designed to protect participants might alter subjective response profiles and AE patterns relative to those that may occur in general clinical practice for approved drugs (Henningfield et al., 2022a). Furthermore, because nonmedical and recreational use of psychedelics may be motivated by effects in addition to drug liking, these other potential effects (for example, enhancement of artistic creativity and spirituality, connection with others, and self-exploration of consciousness), which can be assessed with a variety of instruments, may be more important motivators and predictors of such use than “liking,” “euphoria,” or “high” (Carbonaro et al., 2020; Henningfield et al., 2022a, 2023; Reissig et al., 2012; Zia et al., 2023). This possibility was also suggested in the most recent review of psychedelic abuse potential methods by FDA staff, although that article is not an official FDA guidance document, and this concept was not addressed in the FDA’s 2023 draft guidance addressing considerations for clinical investigations with psychedelic drugs (U.S. FDA, 2023).
CCALC 2023 meeting: Advances and challenges in abuse potential assessment (CCALC, 2023), involving FDA, focused on challenges of abuse potential evaluation for novel and psychedelic substances
The September 2023 CCALC meeting was convened in part to address issues raised in the Calderon et al. (2020) and Caro et al. (2022) CPDD posters, and also for the FDA to present its latest thinking on the challenges and potential solutions posed by novel acting and psychedelic substances. Several co-authors of this commentary, and other experts in abuse potential assessment, including pharmaceutical industry representatives, also participated. Key points of focus included the discrepancy between premarket animal and HAP assessments and the need to reassess current methods of abuse potential assessment more generally to address challenges posed by the rapidly increasing drug development pipeline of novel-acting and/or psychedelic substances.
As discussed at CCALC, HAP studies are designed to be sensitive to abuse-related effects by screening and recruiting potential study participants according to their histories of recreational use of substances potentially related to the new drug in development. For these studies, potential study participants must be pretested and “qualified” to ensure that administration of the comparator drug(s) produces robust elevations in drug liking scores (U.S. FDA, 2017). The possibility was discussed that the FDA’s recommended qualification phase may increase the likelihood of finding elevated liking scores for a drug with any effects similar to the positive comparator. 10
Other challenges to reliable abuse potential assessment include the comparator drugs recommended for animal and HAP studies. The FDA recommends that a comparator drug “should be an FDA-approved drug that is pharmacologically similar to the test [new] drug and scheduled under the CSA” (U.S. FDA, 2017: 25). However, for novel drugs in development, it is not always clear what potential comparator is most pharmacologically similar, especially if the mechanism of action of the novel drug is not clear and/or if the novel drug’s range of CNS effects in animals and humans has not been fully characterized.
Furthermore, if the new drug appears to be a plausible candidate for low or no CSA scheduling, there is a limited pool of potential comparators with low abuse potential (e.g., placed in Schedules IV or V, see DEA, 2024) that are reliable as euphoriants in HAP studies (e.g., most benzodiazepine-type drugs are placed in Schedule IV even though there is variability in real-world abuse and safety).
The FDA acknowledges that novel drugs present “a difficult selection for a positive control” (U.S. FDA, 2017: 25). In addition to novel mechanisms of action, the subjective-effect profiles and time course (e.g., onset and offset) of novel drugs may differ from potential comparators. For example, to evaluate the abuse potential of esmethadone, based on FDA guidance, two HAP studies were conducted in which 40 mg oral oxycodone was the positive comparator in one study, and the other study used intravenous ketamine 0.5 mg/kg infused over 40 minutes as the primary comparator as well as oral dextromethorphan 300 mg as an exploratory comparator (Shram et al., 2022). With respect to psychedelic substances, if HAP studies are recommended (e.g., for novel psychedelics), there appears to be little consensus as to the most appropriate active and placebo comparators, which have included dextromethorphan, methylphenidate, niacin, and inactive placebo (Carbonaro et al., 2020; Colagiuri and Boakes, 2010; Griffiths et al., 2006; Grob et al., 2011; Olson et al., 2020; Raison et al., 2023).
Another contributor to potential abuse-related effects is the speed of onset of effects. Some novel drugs in development may exhibit a substantially slower onset of CNS effects and/or half-lives that are substantially longer than those of potential comparators, both of which may confound any abuse-related conclusions that would traditionally be drawn based on the momentary peak liking. For example, a novel opioid under development, oxycodegol (also known as oxicodegol and NKTR-181), produced low reinforcing effects in animal IVSA studies, low abuse-related AEs during clinical trials, and drug liking effects that were reliably and substantially slower in onset and peak at all doses compared with two oxycodone comparators in two HAP studies, suggesting to the sponsor that the drug warranted scheduling less restrictive than Schedule II (Ge et al., 2020; Miyazaki et al., 2017; Webster et al., 2018). However, primarily because peak liking at the highest oxycodegol dose was not significantly lower than peak liking at the lowest oxycodone dose, FDA and its advisory committee concluded that oxycodegol was not lower than oxycodone in its abuse potential (Kaufman, 2020).
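The distinction between peak liking and the time course of liking can be made concrete with a minimal Python sketch that computes the peak effect (Emax) and the time to peak (TEmax) from a drug liking time series. The ratings and time points below are invented for illustration only and do not come from the oxycodegol studies or any other dataset; the function name `peak_and_time_to_peak` and the 0 to 100 bipolar scale with 50 as neutral are likewise assumptions made for this sketch.

```python
from typing import Sequence, Tuple


def peak_and_time_to_peak(
    times_h: Sequence[float], liking: Sequence[float]
) -> Tuple[float, float]:
    """Return (Emax, TEmax): the highest rating and the first time it occurs."""
    if len(times_h) != len(liking) or not times_h:
        raise ValueError("times_h and liking must be non-empty and equal length")
    emax = max(liking)
    # index() returns the first occurrence, so ties resolve to the earliest time.
    temax = times_h[list(liking).index(emax)]
    return emax, temax


# Hypothetical ratings on an assumed 0-100 bipolar scale (50 = neutral).
times_h = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 6.0]
fast_onset = [50, 78, 85, 74, 62, 55, 51]  # invented fast-onset profile
slow_onset = [50, 52, 58, 70, 84, 80, 66]  # invented slow-onset profile

fast_emax, fast_temax = peak_and_time_to_peak(times_h, fast_onset)
slow_emax, slow_temax = peak_and_time_to_peak(times_h, slow_onset)

print(f"fast: Emax={fast_emax}, TEmax={fast_temax} h")
print(f"slow: Emax={slow_emax}, TEmax={slow_temax} h")
```

In this invented example the two profiles have nearly identical Emax values (85 and 84) but peaks that occur two hours apart, which is the kind of difference that a comparison based on peak liking alone would not register.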
The CCALC meeting also included considerable discussion about the importance of rigorous evaluation of potential abuse-related AEs in all clinical studies, as recommended in the FDA’s 2017 guidance and in its psychedelic-focused guidance and article. Regarding AE evaluation approaches, as with other aspects of clinical study design, the FDA staff made it clear that its thinking has evolved since 2017, and new drug-development sponsors should discuss their abuse potential assessment approaches with the FDA on a case-by-case basis.
ISGIDAR 2024 panel presentations and discussion
The ISGIDAR panel that is the primary basis for this commentary included participants from the CCALC meeting and other abuse potential experts, who presented on their experience with novel substance evaluation and potential new methods development. Presentations discussed experience in assessing the abuse potential of benzodiazepines, cannabinoids, and other substances with lower abuse potential than most Schedule II and III drugs. The presentations and discussion included the potential to more actively incorporate behavioral economics procedures and other approaches to abuse potential assessments, as well as to give as much weight to preclinical evidence and AE data from clinical studies as would seem consistent with the FDA’s conclusions in its CPDD presentations (Calderon et al., 2020; Caro et al., 2022).
Discussions among the ISGIDAR panelists and attendees, as well as other researchers, after the meeting were rich in ideas to advance the science of abuse potential assessment and its application to CSA scheduling. A key point of discussion during the meeting and among these coauthors is the importance of taking into account all lines of abuse potential-related evidence in a balanced fashion when developing recommendations for CSA and international drug scheduling and not to overly weigh, for example, peak liking scores in HAP studies or positive reinforcing effects in animal self-administration studies, when other lines of evidence suggest effects that would be expected to limit attractiveness for post-marketing recreational use.
For example, solriamfetol, a selective dopamine and norepinephrine reuptake blocker, produced significantly elevated peak liking scores in humans and reinforcing effects in animals; however, it also produced direct dose-dependent “negative” and “bad” ratings, which would be expected to limit recreational use and dose escalation. In FDA’s review of the abuse potential evidence provided by the sponsor, it appears multiple lines of evidence were considered and limitations in some studies (e.g., animal self-administration) were acknowledged in its final conclusion, summary, and scheduling recommendation: “Based on the findings of the HAP study, low rate of abuse-related AEs in clinical trials, absence of withdrawal symptoms, and lack of evidence of drug abuse and diversion during the clinical trials, we concur with the Sponsor that Solriamfetol should be placed in Schedule IV of the CSA.” (U.S. FDA, 2019). FDA’s scheduling recommendation was then reviewed by the National Institute on Drug Abuse (NIDA) and subsequently submitted to the DEA by the Assistant Secretary for Health (as summarized in U.S. FDA, 2017), as described by the DEA in its publication of its final scheduling rule in the Federal Register (DEA, 2019).
Conclusions and recommendations on some of the limitations of current approaches for premarket abuse potential assessments and proposals for research on how to improve predictions of post-market recreational use and misuse are summarized in Tables 2 and 3.
Table 2. Conclusions and recommendations.
Table 3. Research needs.
Discussion
Abuse potential assessment provides an orderly and generally reliable basis for evaluating the rewarding, reinforcing, discriminative, and interoceptive drug effects that may contribute to recreational substance use and misuse. Overall, the methods recommended by the FDA in its 2017 guidance appear to have provided generally reliable premarket evidence of potential post-market abuse and misuse-related risks upon which to base CSA scheduling recommendations. However, the FDA recognizes (Calderon et al., 2023; Caro et al., 2022) that, in recent years, challenges associated with HAP studies of novel substances may have contributed to the overestimation of the abuse potential (and likely scheduling) of some new drug products. This is a serious matter: overly restrictive scheduling can be as problematic for public health and medical treatment as under-scheduling, because over-scheduling may reduce (1) the likelihood that clinicians will prescribe the novel medication and (2) the incentive for industry to invest in the development of substances with lower abuse potential.
As summarized in Table 2, it is evident that CSA scheduling recommendations for novel acting drugs may be more reliable if greater weight is given to animal studies, AEs, and other premarket data, rather than assuming that HAP studies are the most reliable evidence. Nonetheless, HAP studies can provide important information that contributes to the overall drug safety assessment, scheduling, and labeling, and may be improved by consideration of assessments in addition to drug-liking. Although the main line of evidence identified by FDA as “discrepant” was HAP study outcomes (Caro et al., 2022; see also Hawkins and Gidal, 2017), it is timely to reevaluate both animal and human approaches to abuse potential assessment. For psychedelic substances with strong hallucinogenic effects, HAP studies raise safety and reliability concerns that should be taken into account before such studies are recommended as part of clinical development.
It would seem appropriate for NIDA to lead and coordinate research to reexamine abuse potential assessment approaches because the range of potential research is broad and may include the entire spectrum of NIDA-supported intramural and extramural research. However, because abuse potential assessment provides the foundation for FDA and DEA scheduling recommendations and actions, it would seem ideal for those agencies to provide input to ensure that the methods and data will be relevant to CSA scheduling.
Table 3 provides specific recommendations for areas of needed research and development, including refinement of methods to ensure the safety and reliability of studies involving psychedelics with strong hallucinogenic effects. Furthermore, insofar as abuse potential assessment is a relative process relying upon positive and sometimes negative control comparator substances to guide recommendations for scheduling, it may be helpful to obtain more systematic input from the DEA, FDA, and NIDA on the advantages and disadvantages of various potential comparators for novel substances in development. Natural products that may contain many potential CNS-active alkaloids, as exemplified by cannabis and kratom, also pose challenges to the single-entity focused methods typical of pharmaceutical-focused research. There also seems to be an emerging consensus that various behavioral economics-based approaches to abuse potential assessment hold promise for advancing reliability.
We agree that research is needed to improve the reliability of HAP studies for predicting post-market recreational use and guiding drug scheduling. In the meantime, rather than assuming that HAP findings on a drug liking scale are substantially determinative of scheduling, it seems more appropriate to consider HAP study drug liking data in the context of other data from the same studies, as well as evidence from other studies and lines of evidence, when developing CSA scheduling recommendations.
FDA 2017 Guidance – Assessment of abuse potential of drugs. While we believe the FDA’s 2017 Guidance warrants updating, we also recognize that new guidance could take several years to draft and finalize, during which the pharmaceutical pipeline would likely continue to evolve. Thus, in the near and long term it may be more important for sponsors to interact early and continuously with FDA’s Controlled Substance Staff during drug development to discuss their data and approaches to abuse potential assessment, keeping in mind that the FDA’s 2017 guidance states its recommendations are not binding and that the Agency will consider alternative approaches (U.S. FDA, 2017: 1).
In the experience of the authors of this commentary, drug-development sponsors often create proposals for abuse potential assessment that include various alternative approaches to those recommended by FDA’s 2017 guidance to more fully understand and predict the broad range of effects of their drug product and its real-world post-marketing attractiveness for recreational use. Clearly, there is no one-size-fits-all approach for assessing the abuse potential of diverse substances, including novel and psychedelic substances, formulations, and products. Alternative approaches may include behavioral economic methods in animal self-administration studies (e.g., variations on progressive ratio protocols as discussed by Ator and Griffiths, 2003; Carter and Griffiths, 2009) and a broader range of outcome measures, including designated primary outcome measures when a sponsor concludes that peak liking should not be the determinative factor in overall abuse potential assessment (Henningfield et al., 2022a, 2023; Strickland and Lacy, 2020; Strickland et al., 2023).
FDA does not always endorse the sponsor’s proposals for abuse potential assessment; however, FDA’s recommendations are not binding. Thus, whereas some sponsors meticulously follow FDA’s recommendations instead of their own earlier proposals, other sponsors proceed with approaches that were not endorsed by the FDA in their drug development program as well as in their abuse potential assessment sections of their NDAs.
On the assumption that rapid evolution of novel acting substances will continue and perhaps accelerate, it seems timely, if not imperative, that all agencies that contribute to CSA drug scheduling work together, taking flexible and adaptive approaches to address the issues posed by novel substances. Development and validation of such alternative approaches would likely be expedited by funding from the DEA and the FDA as well as the NIH, which would provide necessary support for research that will guide abuse potential assessment into the future. The intent and hope of the authors is that this commentary will contribute to those ends.
Footnotes
Author note
This commentary is based on a panel held at the annual meeting of ISGIDAR, which is a satellite meeting of the CPDD. This panel was moderated by Jack E. Henningfield and Sandra D. Comer with Ziva D. Cooper, James K. Rowlett, and Justin C. Strickland as presenters and Sandra D. Comer as discussant. James K. Rowlett discussed the evolution of methods and findings over several decades of benzodiazepine and related drug studies and implications for abuse potential assessments with novel and relatively weak abuse-signaling substances. Justin C. Strickland focused on behavioral economic assessment methods that might be adopted to improve abuse potential assessment reliability and accuracy. Ziva D. Cooper discussed the challenges for abuse potential assessments raised by the diverse pipeline of cannabinoids contained in and derived from cannabis.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Through Pinney Associates, JEH, MAC, CJD, & RKL provide scientific and regulatory consulting on a broad range of CNS-active substances and drug products including psychedelic substances, cannabinoids, new chemical entities, and alternative formulations and routes of delivery, as well as dietary ingredient notifications. JEH has also developed reports and given depositions addressing kratom addictiveness, and risks and benefits to kratom users, and the regulatory context on behalf of several defendants in kratom litigation.
