Abstract
Objectives:
There is an increasing interest in combining psilocybin or methylenedioxymethamphetamine with psychological support in treating psychiatric disorders. Although there have been several recent systematic reviews, study and participant numbers have been limited, and the field is rapidly evolving with the publication of more studies. We therefore conducted a systematic review of PubMed, MEDLINE, PsycINFO, the Cochrane Central Register of Controlled Trials, Embase, and CINAHL for randomised controlled trials of methylenedioxymethamphetamine and psilocybin with either inactive or active controls.
Methods:
Outcomes were psychiatric symptoms measured by standardised, validated and internationally recognised instruments at least 2 weeks following drug administration. Quality was independently assessed using the Cochrane risk of bias assessment tool and the Grading of Recommendations Assessment, Development and Evaluation framework.
Results:
There were eight studies on methylenedioxymethamphetamine and six on psilocybin. Diagnoses included post-traumatic stress disorder, long-standing/treatment-resistant depression, obsessive-compulsive disorder, social anxiety in adults with autism, and anxiety or depression in life-threatening disease. The most information and the strongest association were for change scores with methylenedioxymethamphetamine compared to active controls in post-traumatic stress disorder (k = 4; standardised mean difference = −0.86; 95% confidence interval = [−1.23, −0.50]; p < 0.0001). There were also small benefits for social anxiety in adults with autism. Psilocybin was superior to wait-list but not niacin (active control) in life-threatening disease anxiety or depression. It was as effective as escitalopram in long-standing depression for the primary study outcome and superior for most of the secondary outcomes in analyses uncorrected for multiple comparisons. Both agents were well tolerated in supervised trials. Trial quality varied, with only small proportions of potential participants included in the randomised phase. Overall certainty of evidence was low or very low using the Grading of Recommendations Assessment, Development and Evaluation framework.
Conclusion:
Methylenedioxymethamphetamine and psilocybin may show promise in highly selected populations when administered in closely supervised settings and with intensive support.
Introduction
There has been recent interest in combining psychedelics and related compounds, including methylenedioxymethamphetamine (MDMA) and psilocybin, with psychotherapy for the treatment of mental, behavioural or developmental disorders (Dos Santos et al., 2018; Gill et al., 2020; Goldberg et al., 2020; Illingworth et al., 2021; Nutt et al., 2020; Vargas et al., 2020).
Psilocybin is a prodrug of the more active psilocin, which is produced by rapid dephosphorylation (Wolbach et al., 1962), although psilocybin does have some biological activity of its own (Sard et al., 2005). Both compounds are agonists of 5-HT2 receptors (Rickli et al., 2016; Sard et al., 2005), and studies in humans suggest that 5-HT2A receptor occupancy may be critical for the psychedelic experience of psilocybin (Madsen et al., 2019), although actions at other receptors could also be involved. Psilocin has a half-life of about 3 hours and its kinetics appear to be dose linear (Brown et al., 2017).
MDMA is a related agent although it is not a classical psychedelic. However, prominent actions on transporters for 5-HT, noradrenaline and dopamine, as well as those for vesicular monoamines, mean that it has similar subjective effects (Holze et al., 2020). MDMA produces an increase in extracellular levels of each of these neurotransmitters (Dunlap et al., 2018), and while it may also have some direct actions at neurotransmitter receptors, elevations of 5-HT and noradrenaline are likely to be the most important proximal cause of the conscious effects of MDMA in humans (Hysek et al., 2012; Liechti et al., 2000).
There is emerging evidence of the therapeutic potential of both agents, and this is reflected in the breakthrough designation by the Food and Drug Administration (FDA) in the United States of MDMA and psilocybin for the treatment of post-traumatic stress disorder (PTSD) and treatment-resistant depression (TRD), respectively (Dos Santos et al., 2018; Gill et al., 2020; Goldberg et al., 2020; Illingworth et al., 2021; Nutt et al., 2020; Vargas et al., 2020). Psilocybin may also be effective for treating anxiety disorders, substance use disorders and end-of-life distress (Gill et al., 2020; Goldberg et al., 2020; Johnson et al., 2019; Vargas et al., 2020).
However, despite promising results to date, no psychedelics have been approved for clinical use on the Australian Register of Therapeutic Goods. In February 2020, Australia’s Therapeutic Goods Administration (TGA) made an interim decision rejecting a proposal to reclassify MDMA and psilocybin from Prohibited Substance (Schedule 9) to Controlled Drug (Schedule 8). Both classifications refer to substances that have the potential for abuse, misuse or dependence (Moulds, 1997). Controlled drugs have legitimate therapeutic uses and may be prescribed under strict legislative controls. By contrast, prohibited substances or drugs have no established therapeutic use and are only available for medical or scientific research, or for analytical, teaching or training purposes following approval from Commonwealth and/or State or Territory Health Authorities. The decision to not down-schedule was criticised on the basis that the TGA had placed excessive weight on sources that emphasised adverse effects in uncontrolled or recreational, as opposed to clinical settings (Chiruta et al., 2021). Importantly, the TGA deferred making a final decision, pending an independent report into the risks and therapeutic benefits of the drugs (Kisely et al., 2021). Following consideration of the report and other factors, the TGA confirmed the original decision to not down-schedule either agent in 2021 (Therapeutic Goods Administration, 2021).
Although there have been several recent full-text systematic reviews, the number of studies has been limited with low participant numbers (Andersen et al., 2021; Dos Santos et al., 2018; Gill et al., 2020; Goldberg et al., 2020; Illingworth et al., 2021; Muttoni et al., 2019; Vargas et al., 2020; Varker et al., 2021). In addition, the field is rapidly evolving with the availability of more studies and data that were not included in previous reviews (Carhart-Harris et al., 2021; Davis et al., 2021; Mitchell et al., 2021; Wolfson et al., 2020). Previous reviews also generally focussed on specific diagnoses, and it was not always clear whether the literature searches were restricted by language. Only one review assessed the overall credibility of outcomes using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework as recommended by the Cochrane Collaboration, but this was restricted to PTSD and it was unclear how the guidelines were applied (Varker et al., 2021). Furthermore, for drugs with a rapid onset of action, it is important to consider studies with inactive and active controls in separate comparisons. This is because if participants have been told that there will be a brief interval before the onset of any effect, the absence of that effect will make it very obvious to them that they are taking an inactive placebo. There is therefore a risk that any response in the intervention group may be enhanced by expectancy effects, while for those in the control groups it may be decreased by disappointment on receiving a placebo (Muthukumaraswamy et al., 2021). These contrasting reactions may artificially increase the treatment effect (Muthukumaraswamy et al., 2021).
In this report, we compiled data from randomised, double-blind, placebo-controlled trials of psilocybin and MDMA for all mental health conditions. These included symptoms of anxiety, depression, post-traumatic stress and substance use disorders.
Method
The protocol was prospectively registered with the Open Science Framework (osf.io/hdt3s) and PROSPERO (CRD42021272217), an international database of prospectively registered systematic review protocols. In addition, we followed recommendations for the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement including background, search strategy, methods, results, discussion and conclusions (Moher et al., 2009). No Ethical Committee approval was required as this was a study of previously published papers.
Health outcomes
The primary outcomes of interest were psychiatric symptoms as measured by standardised, validated and internationally recognised instruments at least 2 weeks following drug administration. We only considered data from the randomised component of included studies and not any open-label extensions. As noted previously, these could include anxiety, depression and post-traumatic stress and substance use disorders. Secondary outcomes were response and remission rates, psychiatric symptoms at other times and any reported adverse effects, either immediately following administration or up to 7 days afterwards.
Inclusion and exclusion criteria
We included randomised controlled trials (RCTs) with inactive or active controls in the treatment of International Classification of Diseases, 10th Revision (ICD-10) mental, behavioural or developmental disorders that were published in a peer-reviewed paper from any of the databases in the following paragraph. We only included studies on humans and excluded studies in healthy volunteers, and pre-prints that had not been peer-reviewed. Both crossover and parallel group trials were eligible for inclusion. However, we only used results of the first phase/arm of treatment in crossover trials. This was to minimise the bias of study designs whereby participants experience both active and control conditions, and, in the context of informed consent, know that the intervention they are allocated to in the second phase/arm of a study will be the opposite of what they have already experienced in the first phase.
Search strategy
With the aid of a professional librarian, we searched the following databases up to August 2021 with no language limitations: PubMed, Embase, PsycINFO, CINAHL and the Cochrane Central Register of Controlled Trials (CENTRAL). Online Appendix 1 gives details of the searches. We searched for further publications by scrutinising the reference lists of initial studies identified and other relevant review papers. Where necessary, we made attempts to contact selected authors and experts. Pairs of reviewers (S.K., M.C. and A.A.S.) independently assessed titles, abstracts and papers, and extracted and checked data for accuracy. Where there were disagreements, consensus was reached on all occasions.
Study quality and certainty of evidence
We assessed the quality of included studies using the following criteria of the risk of bias assessment tool, developed by the Cochrane Collaboration to assess possible sources of bias in RCTs: (1) Adequate generation of allocation sequence; (2) Concealment of allocation to conditions; (3) Prevention of knowledge of the allocated intervention to participants and personnel; (4) Prevention of knowledge of the allocated intervention to assessors of outcome; (5) Dealing with incomplete outcome data; (6) Selective reporting of outcomes; and (7) Other sources of bias (Higgins and Green, 2008).
The GRADE framework was used to assess the overall credibility of evidence for each outcome (Schünemann et al., 2013). The GRADE approach uses several domains to categorise levels of certainty as very low, low, moderate and high. Evidence from RCTs is initially graded as high but can then be downgraded to lower levels depending on study limitations (risk of bias), inconsistency, indirectness, imprecision and publication bias. Two reviewers (S.K. and D.S.) undertook the ratings.
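The downgrading logic described above can be sketched in a few lines. This is only an illustration of the arithmetic: it assumes each domain is scored 0 (no serious concern), 1 (serious) or 2 (very serious), and the function and variable names are hypothetical. Actual GRADE ratings rest on reviewer judgement rather than a mechanical sum.

```python
# Certainty levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# The five downgrading domains named in the text.
DOMAINS = (
    "risk_of_bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
)


def grade_certainty(downgrades):
    """Return the certainty level for one outcome from RCT evidence.

    `downgrades` maps a domain name to 0, 1 or 2 levels of downgrading
    (an assumed scoring scheme). RCT evidence starts at "high" and
    cannot fall below "very low".
    """
    total = sum(downgrades.get(domain, 0) for domain in DOMAINS)
    start = len(LEVELS) - 1  # index of "high"
    return LEVELS[max(0, start - total)]


# Hypothetical outcome with serious indirectness and serious imprecision.
example = grade_certainty({"indirectness": 1, "imprecision": 1})
```

Under this scheme, two serious concerns move an outcome from high to low, and any further downgrade moves it to very low.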
Statistical analysis
We used Review Manager version 5.2 for Windows, a statistical software package for analysing Cochrane Collaboration systematic reviews. We calculated the standardised mean difference (SMD) for continuous data, even where studies used the same scale, as the SMD is more generalisable than the mean difference. We reported the risk ratio (RR) for any dichotomous outcome. Where possible, intention-to-treat (ITT) analyses were used. For SMDs, we categorised the strength of effect size in terms of weak (0.2), moderate (0.5) and strong (0.8) (Ferguson, 2009; Sullivan and Feinn, 2012). For RRs, the equivalent values were 2.0, 3.0 and 4.0, respectively (Ferguson, 2009; Sullivan and Feinn, 2012). Where data were available for two or more studies, they were combined in a meta-analysis.
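As a rough illustration of these effect measures, the sketch below computes an SMD and an RR for a single study. It assumes the SMD is Hedges' adjusted g with a commonly quoted approximation for its standard error, and a Wald interval on the log scale for the RR; the text does not spell out the formulas, so these are assumptions. The input numbers are hypothetical, and the "negligible" label for effects below 0.2 is added only for completeness.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference (Hedges' adjusted g) and its standard error."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    g = correction * d
    n = n_t + n_c
    se = math.sqrt(n / (n_t * n_c) + g**2 / (2 * (n - 3.94)))
    return g, se


def risk_ratio(events_t, n_t, events_c, n_c, z_crit=1.96):
    """Risk ratio with a 95% confidence interval computed on the log scale."""
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z_crit * se_log)
    upper = math.exp(math.log(rr) + z_crit * se_log)
    return rr, lower, upper


def smd_strength(g):
    """Label an SMD using the 0.2 / 0.5 / 0.8 thresholds given in the text."""
    size = abs(g)
    if size >= 0.8:
        return "strong"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "negligible"


# Hypothetical study: lower symptom scores in the treated arm.
g, se_g = hedges_g(30.0, 10.0, 10, 40.0, 10.0, 10)
rr, rr_lower, rr_upper = risk_ratio(6, 10, 2, 10)
```

A negative SMD here means lower symptom scores in the intervention arm, which matches the sign convention of the results reported later.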
Where studies compared different doses of active agent against the same controls, the number of controls was halved to avoid counting the same subjects twice. Where there were odd numbers that could not be halved, differences in comparisons were investigated using sensitivity analyses. Some studies measured the same outcome with several different scales such as the Beck Depression Inventory (BDI) and Hamilton Depression Rating Scale. In such studies, we undertook sensitivity analysis of the effect of substituting one scale for another.
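The halving of a shared control group can be sketched as follows. The helper name is hypothetical, and giving the extra participant to the first comparison when the number is odd is an arbitrary choice made for this illustration; as the text notes, such cases were examined with sensitivity analyses.

```python
def split_shared_controls(n_control, n_comparisons=2):
    """Divide one control group across several dose comparisons.

    Each control participant is assigned to exactly one comparison so
    that nobody is counted twice in the pooled analysis. When the group
    does not divide evenly, the earlier comparisons receive one extra
    participant each.
    """
    base, remainder = divmod(n_control, n_comparisons)
    return [base + 1 if i < remainder else base for i in range(n_comparisons)]


even_split = split_shared_controls(8)  # two dose arms sharing 8 controls
odd_split = split_shared_controls(7)   # odd number: the split is unequal
```

For an odd number, a sensitivity analysis could rerun the comparison with the allocation reversed (for example, by reversing the returned list) and check whether the pooled result changes.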
We assessed heterogeneity using the I² statistic, a measure that does not depend on the number of studies in the meta-analysis and hence has greater power to detect heterogeneity when the number of studies is small. An estimate of 50% or greater indicates possible heterogeneity, and scores of 75–100% indicate considerable heterogeneity.
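For readers unfamiliar with the statistic, a minimal sketch of how I² is usually derived from Cochran's Q is given below, assuming inverse-variance weights. The effect sizes and variances are hypothetical.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(effects, variances):
    """I² as a percentage: the share of variability beyond chance."""
    q = cochran_q(effects, variances)
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Two hypothetical studies with clearly different effects.
heterogeneous = i_squared([-1.0, 0.0], [0.04, 0.04])
# Two hypothetical studies with identical effects.
homogeneous = i_squared([-0.5, -0.5], [0.04, 0.09])
```

Values are truncated at zero because Q can fall below its degrees of freedom by chance.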
We used the random effects model for all the analyses as we could not definitely exclude between-study variation even in the absence of statistical heterogeneity given the range of interventions under review. For any outcomes where there were at least 10 studies, we planned to test for publication bias using funnel plot asymmetry where low p-values suggest publication bias. The US National Library of Medicine Clinical Trials Registry (https://clinicaltrials.gov/) was also searched for registered trials of psilocybin and MDMA on 15 October 2021. All protocols registered prior to 2018 were reviewed for unpublished data.
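The random effects pooling can be sketched as below. The text does not name the estimator of between-study variance, so the DerSimonian and Laird method is assumed here; the inputs are hypothetical study-level SMDs and variances.

```python
import math


def random_effects_pool(effects, variances, z_crit=1.96):
    """Pool study effects with an assumed DerSimonian-Laird random effects model.

    Returns the pooled effect, its 95% confidence interval and the
    estimated between-study variance (tau squared).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    scale = sum_w - sum(w**2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0

    # Random effects weights add tau2 to each study's own variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, (pooled - z_crit * se, pooled + z_crit * se), tau2


# Two hypothetical studies with different effects and equal precision.
pooled, ci, tau2 = random_effects_pool([-1.0, 0.0], [0.04, 0.04])
```

When tau squared is estimated as zero the result coincides with the fixed-effect estimate; otherwise the interval widens to reflect between-study variation.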
Results
After the elimination of duplicates, we found 837 citations of interest, of which 143 full-text papers were potentially relevant and assessed for eligibility. Of these, 129 papers were excluded for reasons listed in Figure 1 and Supplementary Table 1. This left 14 papers (Figure 1). Of these, nine had data that could be combined into a meta-analysis of either beneficial or adverse effects. There were eight studies on MDMA (Bouso et al., 2008; Danforth et al., 2018; Mitchell et al., 2021; Mithoefer et al., 2011, 2018; Oehen et al., 2013; Ot’alora et al., 2018; Wolfson et al., 2020) and six on psilocybin (Carhart-Harris et al., 2021; Davis et al., 2021; Griffiths et al., 2016; Grob et al., 2011; Moreno et al., 2006; Ross et al., 2016). Clinical conditions included PTSD, TRD, obsessive-compulsive disorder, social anxiety in adults with autism and anxiety or depression in life-threatening disease (LTD). There were no RCTs on other conditions such as substance use disorders and no studies were conducted in Australia. All studies were conducted in closely supervised settings and with intensive support.

Figure 1. PRISMA 2020 flow diagram for new systematic reviews which included searches of databases and registers only.
MDMA
Of the eight studies on MDMA (n = 212 participants), six were on PTSD (n = 182), one was on anxiety due to an LTD (n = 18) and the other on social anxiety in adults with autism (n = 12) (Table 1). All were parallel-arm RCTs often followed by open-label extensions. Five out of eight studies used inactive placebo as the control while the remaining three used low doses of MDMA. In all studies, both the intervention group and controls received psychotherapy.
Table 1. Included MDMA studies.
M: male; F: female; SD: standard deviation; MDMA: methylenedioxymethamphetamine; BP: blood pressure; TRD: treatment-resistant depression; TR: treatment resistant; PTSD: post-traumatic stress disorder; RCT: randomised controlled trial; CAPS: Clinician-Administered PTSD Scale; BDI: Beck Depression Inventory; HADS: Hospital Anxiety and Depression Scale; LTD: life-threatening disease; HAM-D: Hamilton Depression Rating Scale; SDS: Sheehan Disability Scale; PTGI: Posttraumatic Growth Inventory; STAI: State-Trait Anxiety Inventory; PDS: Posttraumatic Diagnostic Scale; LSAS: Liebowitz Social Anxiety Scale; FFMQ: Five Facet Mindfulness Questionnaire; ITT: intention-to-treat; AEs: adverse effects; IESR: Impact of Event Scale–Revised; SSSPTSD: Severity of Symptoms Scale for Post-traumatic Stress Disorder; PSQI: Pittsburgh Sleep Quality Index; GAF: Global Assessment of Functioning; NEO-PI-R: Revised NEO Personality Inventory; DES-II: Dissociative Experiences Scale II; FDA: Food and Drug Administration; SBP: systolic blood pressure; TEAE: treatment-emergent adverse effect; C-SSRS: Columbia Suicide Severity Rating Scale.
There were statistically significant differences between intervention and control groups in five out of eight studies, and non-significant differences in the remaining three (Table 1). However, all of the studies except one (Mitchell et al., 2021) reported on a relatively small number of participants (Table 1). Five out of the six studies on PTSD used the Clinician Administered PTSD Scale (CAPS) (Table 1).
Figure 2 summarises the outcomes and change in continuous scores for PTSD at between 4 and 12 weeks following drug administration. Of the five analyses, there were statistically significant differences in two: (1) endpoint scores for MDMA doses of greater than 100 mg in comparison with inactive controls and (2) change scores in comparison with active controls.

Figure 2. Continuous PTSD outcomes at 4–12 weeks in MDMA studies.
Two studies also assessed whether participants had shown a 30% reduction in CAPS scores (response) or no longer met criteria for a case (remission). One study was a comparison with inactive controls and reported identical results that just failed to reach predefined statistical significance for both response and remission (k = 1; RR = 3.33; 95% confidence interval [CI] = [0.98, 11.37]; p = 0.054) (Mithoefer et al., 2011). The other was a comparison with active controls where results were clearly non-significant for both response (RR = 2.33; 95% CI = [0.68, 8.04]; p = 0.18) and remission (k = 1; RR = 2.04; 95% CI = [0.58, 11.37]; p = 0.27) (Mithoefer et al., 2018).
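The relationship between a reported RR, its confidence interval and its p-value can be checked with a short calculation. The sketch below assumes the interval is a symmetric Wald interval on the log scale and that the p-value comes from a two-sided normal test; neither is stated in the text. It uses the figures reported above for the inactive-control comparison.

```python
import math


def p_from_ratio_ci(ratio, lower, upper, z_crit=1.96):
    """Recover the log-scale standard error, z statistic and two-sided p-value.

    Assumes the confidence interval was built as
    exp(log(ratio) +/- z_crit * SE).
    """
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se_log
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return se_log, z, p


# Reported values: RR = 3.33, 95% CI = [0.98, 11.37].
se_log, z, p = p_from_ratio_ci(3.33, 0.98, 11.37)
```

Under these assumptions the calculation gives a p-value of approximately 0.054, in line with the reported value, and shows why an interval whose lower bound sits just below 1 corresponds to a result that narrowly misses the 0.05 threshold.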
MDMA also resulted in statistically significant improvements in social anxiety in the one study of adults with autism when compared to placebo (k = 1; SMD = −1.42; 95% CI = [−2.81, −0.04]; p = 0.04) (Danforth et al., 2018). However, the results in another study on anxiety in LTD were non-significant (k = 1; SMD = −1.03; 95% CI = [−2.13, 0.07]; p = 0.07), although participant numbers were low (Wolfson et al., 2020). Effect sizes were large in all comparisons but with wide CIs.
MDMA was well tolerated in all the studies. The main adverse effects reported were anxiety, restlessness, fatigue, jaw-clenching, headache and transient increases in blood pressure. Serious events such as suicidal ideation were rare and occurred almost entirely in the placebo arm or were otherwise unrelated to the treatment. There were no attempts at assessing any biochemical or haematological changes. We were able to perform a meta-analysis from the results of five of the common adverse effects. Most information concerned the number of participants experiencing adverse events immediately after administration (Figure 3). The only statistically significant difference was that participants receiving MDMA were more likely to experience jaw clenching.

Figure 3. MDMA – adverse effects per subject (immediate).
There were similar findings for adverse events up to 7 days after drug administration except that participants who received MDMA were more likely to report a reduced appetite (Supplementary Figure 1). Two studies reported on events per session rather than patient. There were no significant differences between MDMA and control groups either immediately or up to 7 days afterwards (Supplementary Figures 2 and 3).
Psilocybin
Among the six studies on psilocybin (n = 187 participants), three were for anxiety or depression for an LTD (n = 92), two for long-standing or TRD (n = 86) and one for obsessive-compulsive disorder (n = 9) (Table 2). Two studies used low doses of psilocybin as the control and another two used the vasodilator niacin, as the latter induces a mild physiological reaction (e.g. flushing) without any psychological effects (Table 2). One study used escitalopram as the comparator and the final study used wait-list controls (Table 2). Only one study was a parallel-arm RCT (Carhart-Harris et al., 2021), all the others having a cross-over design (Table 2). Primary outcomes such as those prior to cross-over were measured at between 2 and 8 weeks. Four of the six studies used a single dose and three studies used psychotherapy or psychological support.
Table 2. Included psilocybin studies.
M: male; F: female; SD: standard deviation; TRD: treatment-resistant depression; RCT: randomised controlled trial; QIDS-SR16: Quick Inventory of Depressive Symptomatology (self-report) (16-item); BDI: Beck Depression Inventory; MDMA: methylenedioxymethamphetamine; BP: blood pressure; HR: heart rate; AEs: adverse effects; HADS: Hospital Anxiety and Depression Scale; LTD: life-threatening disease; MADRS: Montgomery–Åsberg Depression Rating Scale; HAM-D: Hamilton Depression Rating Scale; SDS: Sheehan Disability Scale; STAI: State-Trait Anxiety Inventory; YBOCS: Yale-Brown Obsessive Compulsive Scale; SIDAS: Suicidal Ideation Attributes Scale; PRSexDQ: Psychotropic-Related Sexual Dysfunction Questionnaire; LEIS: Laukes Emotional Intensity Scale; BEAQ: Brief Experiential Avoidance Questionnaire; SHAPS: Snaith Hamilton Anhedonia Pleasure Scale; WEMWBS: Warwick-Edinburgh Mental Wellbeing Scale; FS: Flourishing Scale; ITT: intention-to-treat; POMS: Profile of Mood States; BPRS: Brief Psychiatric Rating Scale; GRID-HAMD: GRID-Hamilton Depression Rating Scale; WSAS: Work and Social Adjustment Scale.
One study reported statistically significant differences between psilocybin and niacin (Ross et al., 2016), and another between high- and low-dose psilocybin for subjects with anxiety or depression due to LTD (Griffiths et al., 2016). Psilocybin was superior to remaining on a wait-list in a study of treatment-resistant depression (Davis et al., 2021). In another study of depression, there was no significant difference between psilocybin and escitalopram in the pre-determined primary outcome, although changes in secondary outcomes generally favoured psilocybin (Carhart-Harris et al., 2021). In a fifth study, there were no statistically significant differences between psilocybin and controls at the 2-week follow-up, although both groups showed long-term improvements following cross-over (Grob et al., 2011). In the final study, there was no significant effect of dose on obsessive-compulsive symptoms possibly because of low numbers and unexpectedly high response to the low-dose placebo (25 μg/kg) (Moreno et al., 2006).
In terms of continuous scores for the primary outcomes of endpoint depression with or without anxiety at between 2 and 8 weeks following administration, psilocybin was significantly better than wait-list control (k = 1; SMD = −2.60; 95% CI = [−3.70, −0.5]; p < 0.0001) but not niacin (k = 2; SMD = −0.99; 95% CI = [−2.33, 0.35]; p = 0.15). It was as effective as escitalopram on the primary outcome of the 16-item Quick Inventory of Depressive Symptomatology–Self-Report scale (QIDS-SR) (k = 1; SMD = −0.36; 95% CI = [−0.87, 0.15]; p = 0.17) but superior to escitalopram for most of the secondary outcomes, although there was no correction for multiple comparisons (Table 2).
Three studies also assessed whether participants had shown a clinically significant response or were in remission as regards depression or anxiety (Figure 4). There were statistically significant differences between psilocybin and active placebo (niacin or low-dose psilocybin) while psilocybin remained as effective as escitalopram in terms of QIDS-SR response and better in terms of remission (Figure 4). Psilocybin was also significantly better than escitalopram for all the other secondary outcomes although there was no correction for multiple comparisons (Table 2). In comparison with either active placebo or escitalopram, effect sizes ranged from small to strong.

Figure 4. Remission and response rates in anxiety or depression scores in psilocybin studies.
Psilocybin was well tolerated in all the studies, with adverse events similar to those of MDMA (Table 2). The main effects were anxiety, headache and transient increases in blood pressure. None were coded as serious. It was not possible to combine the results quantitatively.
Heterogeneity, publication bias and sensitivity analyses
All but two of the results had an I² estimate of less than 50%, suggesting that our results were not affected by heterogeneity. Sensitivity analyses of substituting one scale for another generally made little difference to the outcomes.
There were insufficient studies to formally test for publication bias using funnel plot asymmetry. However, we were able to undertake a clinical trial registry review to identify unpublished registered protocols. Sixty-three MDMA trials were listed on the registry, of which 39 had been registered prior to 2018. Of these, 8 were included in our search, 27 were not in relevant patient populations, and 1 reported its data on 8 included studies on the registry site but did not have an associated publication. An additional three studies were terminated due to problems with enrolment. Three protocols were reported as having been completed in the past 4 years, of which one was not in a relevant population, one was included in our review and another did not have a suitable trial design. Among 69 registered clinical trials of psilocybin, 22 were registered prior to 2018. Of these, 5 were identified in our search, 11 were in populations outside of mental health, 3 were not expected to report until after the date of the registry search, 1 was withdrawn before it began recruiting and 2 studies from 2012 and 2014, respectively, were yet to report any data. Two trials registered in the last 4 years were reported as completed, one of which was identified in our search and the other was not in a relevant patient population.
Assessment of quality and credibility
Study quality was not optimal on the risk of bias assessment tool (Supplementary Table 2). Generation of the random allocation sequence and risk of bias in allocation concealment were adequate in seven studies, while in the other seven it was unclear. Twelve of the studies were described as double-blinded (Supplementary Table 2) while one used a wait-list control (Davis et al., 2021) and another made no mention of blinding (Moreno et al., 2006). However, in three studies, it was unclear whether blinding was successful as investigators were able to guess the correct allocation in a high proportion of cases (Mithoefer et al., 2011; Ross et al., 2016; Wolfson et al., 2020).
Attrition bias was low in 13 out of 14 studies because of high rates of follow-up, although only 6 explicitly used ITT analyses, all but one of which were on MDMA (Carhart-Harris et al., 2021; Danforth et al., 2018; Mitchell et al., 2021; Mithoefer et al., 2018; Ot’alora et al., 2018; Wolfson et al., 2020). All but two of the studies were rated as unclear for reporting bias, largely because there was no protocol with which to make a comparison. In a further two studies, outcomes were largely presented as graphs. In the case of one study, where it was difficult to extract numbers from the relevant figures, the authors were contacted for clarification, but no reply was received. In terms of other sources of bias, all but two of the studies (Carhart-Harris et al., 2021; Davis et al., 2021) were either fully or partly funded and/or supported by the Heffter Research Institute or the Multidisciplinary Association for Psychedelic Studies (MAPS). Both are privately funded non-profit research and educational organisations that promote the therapeutic uses of psychedelics. The latter organisation includes MAPS Public Benefit Corporation (MAPS PBC), a wholly owned subsidiary that reports that it balances income from legal sales of MDMA with the social benefits of MAPS’ mission.
Another source of bias was that only a small proportion of potential participants were actually randomised. Where it was recorded, participants were overwhelmingly White/European. In addition, trials generally excluded people with a personal or family history of psychosis, personal history of mania, repeated violence towards others and a recent personal history of a suicide attempt, as well as those with current drug or alcohol use disorders, which may limit generalisability. There was also an uneven distribution between the intervention and control arms with more participants allocated to the experimental group in all but three studies (Carhart-Harris et al., 2021; Griffiths et al., 2016; Ross et al., 2016).
Supplementary Tables 3 and 4 present the GRADE assessments for MDMA and psilocybin, respectively. Entries for risk of bias and inconsistency were derived from Supplementary Table 2 and I² statistics, respectively. In terms of indirectness, all studies were rated as having serious limitations, largely because of concerns about whether the population and/or intervention differed from those that might be relevant for the wider population. All but one outcome was rated as having severe limitations under imprecision, because data came from one or two small studies.
Given that it was not possible to test for publication bias using tests of funnel plot asymmetry, this was assessed in terms of the number and size of studies, possible conflicts of interest in study sponsors or evidence of unpublished studies in clinical trial registries. Despite there being relatively few unpublished studies, all outcomes were rated as being at moderate risk of publication bias because of the limited number of small studies and/or being either fully or partly funded and/or supported by the Heffter Research Institute or MAPS. As a result of the above limitations, the overall certainty of evidence was rated low or very low.
Discussion
By combining the effects of small and possibly underpowered studies, meta-analyses can help to establish the relative efficacy of MDMA- and psilocybin-assisted psychotherapy where large studies may be impractical. The strengths of this review include a search that covered a wide range of diagnoses, the inclusion of four recent studies with participant numbers that were relatively large for the field, and assessment of outcomes using the GRADE framework. Although we were only able to combine results from nine studies into separate (MDMA, psilocybin) meta-analyses, we did demonstrate statistically significant (p < 0.05) differences between the two psychedelic agents and both inactive and active treatments for either continuous scores or dichotomous responses. However, effect sizes ranged from small to strong and 95% CIs were wide. Evidence was strongest for MDMA, especially in doses of over 100 mg.
Both agents were well tolerated with limited evidence of acute serious adverse reactions in trial participants that could be attributed to either agent at the dosing regimens used. This is an important observation given concerns over the potential for neurotoxicity, diversion and psychosis in unregulated environments (Royal Australian and New Zealand College of Psychiatrists, 2020).
However, it is important to note that this was in highly supportive and structured environments including intense psychotherapy sessions in many cases, especially for MDMA. Indeed, it appears that the interaction between the pharmacological action of both agents and concurrent psychotherapy is important for success, although this has not been formally examined (Andersen et al., 2021; Perkins et al., 2021). Unlike conventional psychotropic pharmacotherapy, MDMA and psilocybin therapy involves psychological preparation prior to administration and understanding the subjective experience during treatment, as well as psychological support with assimilation and integration afterwards (Perkins et al., 2021). For instance, both the classical psychedelics and MDMA appear to increase the affective bond between patient and therapist thereby enhancing the therapeutic alliance through increasing a sense of closeness, openness and trust (Andersen et al., 2021; Perkins et al., 2021). In addition, these trials are often conducted in specific settings with close attention to the room and ambience including the use of sounds or music.
There are several limitations. The most obvious is that we were only able to find and combine data for meta-analysis from 9 of 14 eligible studies (n = 399 participants). Overall, study quality was not optimal, and despite studies being described as double-blind, there was a concern that observers and/or patients may still have been aware of their treatment allocation (Burke and Blumberger, 2021). There was relatively little loss to follow-up after randomisation in any of the studies. However, in several trials, only a small proportion of potential participants were included in the randomised phase. In addition, the exclusion criteria limit the findings to people with PTSD, depression, anxiety and obsessive-compulsive disorders, but not those with a family or past history of other psychiatric disorders (particularly schizophrenia and bipolar disorder). Furthermore, we were unable to find any RCTs on substance use disorders, and there were relatively small participant numbers largely restricted to White/European populations. This is particularly relevant given the high rates of PTSD in Indigenous Australians (Nasir et al., 2021).
Another potential source of bias was reliance on funding from research and educational organisations that promoted the therapeutic uses of psychedelics. This may partly be explained by barriers to Federal funding of such research in the United States (Marks and Cohen, 2021).
A major unknown is the degree to which the psychedelic/psychotherapy interaction is dependent on the specific type of psychotherapy administered – would clinical practice need to follow a specific manual, or would other styles work?
Many of the studies on psilocybin used a crossover design, which limits interpretation after the crossover, such that only the outcomes prior to the crossover, at 5–7 weeks, could be reliably attributed to the drug. This may be of concern given recently released preliminary results from an adequately powered RCT (n = 233) that was, however, not peer reviewed and so was ineligible for inclusion in this study (Compass, 2021). Subjects with TRD were randomised to receive psilocybin doses of 25, 10 or 1 mg, and while there were statistically significant differences in both continuous and dichotomised Montgomery–Åsberg Depression Rating Scale (MADRS) scores between the 25 and 1 mg doses up to week 6, differences in some of the same outcomes were non-significant at 12 weeks.
In the parallel controlled comparison of psilocybin with escitalopram, the statistically significant benefits in secondary outcomes for psilocybin were not corrected for multiple comparisons. These findings should therefore be viewed with caution. Finally, we had insufficient studies to test for publication bias and although the I² values were low, we cannot exclude the possibility of heterogeneity given the wide 95% CIs. All these limitations are reflected in the overall certainty of evidence being rated as low or very low using the GRADE framework.
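The caution about low I² values can be illustrated with a minimal sketch of the standard calculation, in which I² is derived from Cochran's Q. The function name `i_squared` and all numbers are hypothetical and are not taken from the included studies. The same spread of effect sizes gives an I² of zero when the studies are imprecise, and a high I² when they are precise.

```python
def i_squared(effects, variances):
    """Return I^2 (as a percentage) computed from Cochran's Q.

    effects   -- list of study effect sizes
    variances -- list of their sampling variances
    """
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if df <= 0 or q <= 0:
        return 0.0
    # I^2 is the share of total variation beyond that expected by chance.
    return max(0.0, (q - df) / q) * 100.0


if __name__ == "__main__":
    # Identical hypothetical effect sizes, two levels of precision.
    smd = [-0.2, -0.8, -1.4]
    print("imprecise studies:", i_squared(smd, [0.5, 0.5, 0.5]))
    print("precise studies:  ", i_squared(smd, [0.05, 0.05, 0.05]))
```

With few, small studies, Q has little power to detect between-study differences, so a low I² is weak evidence of homogeneity.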
The TGA’s 2021 decision to not down-schedule these agents reflects the need for additional and larger RCTs to confirm initial promising results. Importantly, participants in future clinical trials should be more representative of the general population with serious and treatment-resistant anxiety and depression, and PTSD. Future trials should also consider how psychotherapy contributes to treatment success through factors such as the therapeutic alliance (Perkins et al., 2021). The Working Alliance Inventory is an example of a valid and reliable instrument that measures the therapeutic alliance (Muthukumaraswamy et al., 2021).
Given the short-term and obvious effects of these agents, trials should also attempt to minimise expectancy through the use of active placebo agents and parallel rather than cross-over designs (Burke and Blumberger, 2021; Muthukumaraswamy et al., 2021). In addition, RCTs should include assessments of expectancy and the success of masking using standardised instruments (Muthukumaraswamy et al., 2021). Examples include the six-item Credibility/Expectancy Questionnaire and the Stanford Expectation of Treatment Scale (Muthukumaraswamy et al., 2021).
In conclusion, MDMA and psilocybin show potential as therapeutic agents in highly selected populations when administered in closely supervised settings with intensive support. Evidence appears strongest for MDMA. By contrast, randomised findings for psilocybin are largely limited to short-term follow-up data prior to cross-over (Kisely et al., 2021). Larger studies of more representative participants are required to provide robust dose-response evidence of the clinical benefit of MDMA and/or psilocybin-assisted psychotherapy in people with relevant psychiatric disorders.
Supplemental Material
Supplemental material, sj-docx-1-anp-10.1177_00048674221083868 for A systematic literature review and meta-analysis of the effect of psilocybin and methylenedioxymethamphetamine on mental, behavioural or developmental disorders by Steve Kisely, Mark Connor, Andrew A Somogyi and Dan Siskind in Australian & New Zealand Journal of Psychiatry
Footnotes
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship and/or publication of this article: S.K. is a member of both the Advisory Committee on Medicines of the Therapeutic Goods Administration and one of the committees that provided comments on the clinical memorandum on the therapeutic use of psychedelic substances from the Royal Australian and New Zealand College of Psychiatrists. His University was paid a consultancy fee by the Therapeutic Goods Administration for S.K. to provide an independent report into the risks and therapeutic benefits of the MDMA and psilocybin that informed the TGA’s decision to not down-schedule either agent. A.A.S. is a co-investigator on two National Health and Medical Research Council funded clinical trials on ketamine treatment for depression and an associate investigator on an application to the Medical Research Future Fund for a clinical trial using ketamine or psilocybin in mood and stress disorders. He is also a member of the Controlled Substances Advisory Council of South Australia. His University was paid a consultancy fee by the Therapeutic Goods Administration for A.A.S. to provide an independent report into the risks and therapeutic benefits of the MDMA and psilocybin that informed the TGA’s decision to not down-schedule either agent. M.C.’s University was paid a consultancy fee by the Therapeutic Goods Administration for M.C. to provide an independent report into the risks and therapeutic benefits of the MDMA and psilocybin that informed the TGA’s decision to not down-schedule either agent. D.S. chairs one of the committees that provided comments on the clinical memorandum on the therapeutic use of psychedelic substances from the Royal Australian and New Zealand College of Psychiatrists.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: Some of the work in this study was supported by the Government of Australia’s Therapeutic Goods Administration.
