Abstract
Background:
Brexanolone (BRX) injection was approved by the United States Food and Drug Administration in 2019 for the treatment of adults with postpartum depression (PPD) based on two Phase 3 clinical trials.
Materials and Methods:
Data from three trials (one Phase 2 and two Phase 3) were combined. PPD-specific 17-item Hamilton Rating Scale for Depression (HAMD-17) group-level minimal important difference (MID) and patient-level meaningful change threshold (MCT) values were estimated and applied to differences between BRX and placebo (PBO) at hour 60 (primary endpoint) and day 30 (end of trial follow-up). Likelihood of HAMD-17 response and remission and Clinical Global Impression of Improvement (CGI-I) response for BRX versus PBO were assessed at hour 60 and as sustained through day 30 using relative risk. Associated number needed to treat (NNT) and number needed to harm (NNH) values were also estimated.
Results:
Two hundred nine patients were included. The average HAMD-17 MID estimate was −2.1; the least-squares mean difference between BRX and PBO exceeded this at hour 60 and day 30. Minimal, moderate, and large MCTs were estimated to be −9, −15, and −20 points, respectively. Significantly more BRX-treated than PBO-treated patients achieved minimal, moderate, and large meaningful change at hour 60 (all p < 0.05) and large meaningful change at day 30 (p < 0.05). BRX-treated patients were more likely to sustain HAMD-17 remission and CGI-I response through day 30 versus PBO. NNTs ranged from 4 to 8, with an NNH of 97.
Conclusions:
Relative to PBO, BRX provided meaningful, rapid (hour 60), and sustained (day 30) improvements in PPD symptoms, with a low NNT and a large NNH. These results may help inform treatment decision-making.
Clinicaltrials.gov registration numbers: NCT02614547, NCT02942004, and NCT02942017.
Introduction
Postpartum depression (PPD) is one of the most common medical complications during and after pregnancy. 1–7 In the United States, an estimated 11.5% of women giving birth experience symptoms of PPD, with global estimates of 17.7%. 6,8 Suicide is strongly associated with depressive symptoms and is a leading cause of pregnancy-related mortality. 9–16 Mothers with PPD may experience difficulties with physical functioning and bonding with their infants. 17,18 Maternal PPD is associated with poorer cognitive and physical development of the child as well as higher rates of depression and poorer academic performance as the child grows into adolescence. 19–22
In 2019, the Food and Drug Administration approved brexanolone (BRX) injection, a neuroactive steroid chemically identical to allopregnanolone, for the treatment of PPD in adults based on two Phase 3 randomized clinical trials (RCTs) (NCT02942004 and NCT02942017) following Breakthrough Therapy Designation in 2016. 23 BRX is administered via a 60-hour intravenous infusion in a monitored health care setting and is available only through a restricted program under a Risk Evaluation and Mitigation Strategy (REMS) requiring patient enrollment, restricted distribution, and administration in a certified health care facility with monitoring by a health care provider for excessive sedation or sudden loss of consciousness. 24 In the Phase 3 trials, BRX achieved the primary endpoint of significantly greater reduction in depressive symptoms, as measured by least-squares mean reduction in 17-item Hamilton Rating Scale for Depression (HAMD-17) total score at the end of infusion (hour 60) compared with placebo (PBO). 23 Pooled analyses of the two Phase 3 trials and a Phase 2 trial conducted under the same umbrella protocol showed a significant difference in HAMD-17 total score for BRX versus PBO as early as 24 hours and through the duration of the study period (day 30). 23 Nearly twice as many BRX-treated patients as PBO-treated patients achieved HAMD-17 remission (score ≤7) at hour 60 (50% vs. 26%, p < 0.001). The most common adverse reactions (incidence ≥5% and at least twice the rate of PBO) were sedation/somnolence, dry mouth, loss of consciousness, and flushing/hot flush.
While the HAMD-17 is considered the gold standard assessment in clinical trials for major depression and PPD, there is no PPD-specific estimate of meaningful change to facilitate patient-centered evaluation of the data. 25–28 Meaningful change thresholds (MCTs) relate to patient-level meaningful change, rather than group change, and are estimated using both distribution-based and anchor-based methods. 29 Distribution-based methods alone can be used to estimate group-level meaningful difference or change, known as a minimal important difference (MID). MCTs and MIDs are population- and disease-specific and should be established within the specific population of interest. 30 While a range of meaningful differences has been reported for HAMD-17 in populations of patients with major depressive episode (MDE), none has previously been reported based on a PPD population. 31,32
In addition to the core trial endpoints, more applied, patient-centric secondary analysis of key trial data could further inform health care decision-making. For example, evaluating relative risk (RR), number needed to treat (NNT), and number needed to harm (NNH) can aid clinicians and health care decision-makers as they seek to understand the likelihood of favorable treatment outcomes. 33–36
Thus, the current analyses aimed to explore outcomes relevant to clinicians, patients, and health care decision-makers in the evaluation of treatments for PPD. More specifically, the objectives of this study were to (1) estimate and apply PPD-specific HAMD-17 MCTs at hour 60 and day 30 and (2) further evaluate the proportion of patients treated with BRX versus PBO achieving HAMD-17 remission and response and Clinical Global Impression of Improvement (CGI-I) response by estimating the associated relative risk (RR) and NNT at hour 60 and through trial follow-up (day 30).
Materials and Methods
Study design and participants
The current report presents a post hoc analysis of clinician-reported outcomes data pooled from previously published Phase 2 and Phase 3 clinical trials (hereafter “combined dataset”), which examined the safety and efficacy of BRX injection compared with PBO injection in patients with moderate to severe PPD (Study A: NCT02614547; Study B: NCT02942004; and Study C: NCT02942017). 23,37
Full descriptions of the trial designs and the inclusion and exclusion criteria have been published previously. 23,37 In brief, one Phase 2 (Study A) and two Phase 3 (Studies B and C) multicenter, randomized, double-blind, PBO-controlled trials were conducted across the United States of America under an umbrella protocol (multiple, similar clinical trials conducted under one Institutional Review Board application). The clinical trials included women 18–45 years old, ≤6 months postpartum, with a diagnosis of moderate-to-severe PPD and a qualifying HAMD-17 total score (Studies A and B: ≥26; Study C: 20–25) who were randomized either 1:1:1 to receive a 60-hour infusion of BRX 90 μg/kg/h, BRX 60 μg/kg/h, or PBO (Study B), or 1:1 to receive BRX or PBO (Studies A and C).
Once randomized, patients were treated in a medically supervised setting for 72 hours: 60 hours of continuous study drug infusion and 12 hours for completion of assessments. Patients were followed until day 30, with clinical and safety assessments on days 7 and 30. The primary endpoint in each trial was the least-squares mean difference in change from baseline in HAMD-17 total score at hour 60.
The studies were conducted with adherence to and compliance with the Declaration of Helsinki and Good Clinical Practice Guidelines. The study protocols were reviewed and approved by relevant Institutional Review Boards or independent Ethics Committees, and all patients provided written informed consent before study inclusion.
Outcomes
The HAMD-17 is a clinician-reported scale that evaluates core symptoms of depression. 25 Items are scored on a 0–4 (0 = none/absent and 4 = severe) or a 0–2 (0 = absent/none and 2 = clearly present) scale. Individual item scores are summed to compute the total score, which ranges from 0 to 52, with higher scores indicating more severe depression. 25 Clinical response on the HAMD-17 is defined as a ≥50% reduction from baseline; remission is defined as a HAMD-17 total score of ≤7.
The CGI-I is also a clinician-reported scale that uses a seven-point Likert scale to measure the overall improvement in the patient's condition. 38 Response choices include the following: 1 = very much improved, 2 = much improved, 3 = minimally improved, 4 = no change, 5 = minimally worse, 6 = much worse, and 7 = very much worse. 38 CGI-I responders are defined as patients receiving a rating of 1 (very much improved) or 2 (much improved).
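As a minimal illustration of the outcome definitions above, the following Python sketch encodes HAMD-17 response, HAMD-17 remission, and CGI-I response as simple predicates. The function names and the example scores are hypothetical; they are not taken from the trial data or from the original SAS analysis.

```python
def hamd17_response(baseline: int, score: int) -> bool:
    """HAMD-17 response: >=50% reduction in total score from baseline."""
    return (baseline - score) >= 0.5 * baseline


def hamd17_remission(score: int) -> bool:
    """HAMD-17 remission: total score of 7 or lower."""
    return score <= 7


def cgi_i_response(rating: int) -> bool:
    """CGI-I response: rating of 1 (very much improved) or 2 (much improved)."""
    return rating in (1, 2)


# Hypothetical patient: baseline total score 26, hour-60 total score 12.
print(hamd17_response(26, 12))  # a 14-point drop meets the 13-point (50%) bar
print(hamd17_remission(12))     # 12 is above 7, so not in remission
print(cgi_i_response(2))        # "much improved" counts as a response
```

Note that a patient can be a responder without being a remitter, which is why the two outcomes are reported separately.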
Statistical analyses
The analyses were conducted on the BRX and PBO combined efficacy dataset. The dataset included all randomized patients who started infusion and had a valid baseline HAMD-17 and at least one postbaseline HAMD-17.
To estimate the HAMD-17 total score patient-level MCT, distribution-based and anchor-based approaches were used. The distribution-based methods were ½ standard deviation (SD) at baseline and one standard error of measurement (SEM; SD × √[1 − reliability coefficient]). The reliability coefficient used to calculate the SEM was the intraclass correlation coefficient (ICC) between baseline and hour 2 HAMD-17 total scores for those subjects rated as “no change” using the CGI-I. The anchor-based approach used the CGI-I: the mean HAMD-17 total change from baseline was calculated for each CGI-I response at hour 60 and day 30, independent of the treatment arm. Levels of minimal, moderate, and large meaningful change were estimated based on the minimally, much, and very much improved CGI-I response categories, respectively. Mean data calculated across both timepoints for each CGI-I response were rounded to the nearest integer to yield a level of change achievable by an individual patient. The anchor-based values were compared to the distribution-based values to ensure that patient-level MCTs were greater than distribution-based MID estimates. Fisher's exact test was used to explore the difference in the proportion of subjects reporting minimal, moderate, and large meaningful change at hour 60 and day 30 across treatment groups.
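The two distribution-based estimates can be sketched in a few lines of Python. This is an illustration only, not the original SAS code: the baseline SD of 3.6 is an assumption back-calculated from the ½ SD value of 1.8 reported in the Results, and the reliability coefficient is the reported ICC of 0.55.

```python
import math

HALF_SD_MID = 1.8  # 1/2 SD estimate reported in the Results
ICC = 0.55         # reliability coefficient reported in the Results

# Assumption: baseline SD back-calculated from the reported 1/2 SD value.
sd_baseline = 2 * HALF_SD_MID


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD x sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1.0 - reliability)


sem_mid = standard_error_of_measurement(sd_baseline, ICC)
average_mid = (HALF_SD_MID + sem_mid) / 2

print(round(sem_mid, 1), round(average_mid, 1))  # 2.4 2.1
```

Because the SEM grows as reliability falls, a low ICC inflates the 1 SEM estimate relative to the ½ SD estimate.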
To evaluate the proportion of patients demonstrating sustained HAMD-17 response and remission and CGI-I response, “sustained” was defined as the continued categorization as a responder or remitter at hour 60, day 7, and day 30. The 95% confidence intervals (CIs), and corresponding p-values, for BRX versus PBO were estimated using Fisher's exact test at hour 60 and for sustained responders and remitters at day 30. The RR and NNT were calculated based on the probability of being a responder or remitter (RR; Ptx/Ppbo) or the proportion of responders or remitters (NNT; 1/[Ptx − Ppbo]) in each treatment arm at hour 60, and of sustained responders or remitters at day 30. The NNH was estimated based on the proportion of subjects in each treatment arm discontinuing the study drug due to an adverse event (AE). 34 Detailed safety and discontinuation information has been previously reported by Meltzer-Brody et al. 23 Asymptotic confidence limits were calculated for the RR; 95% CIs were calculated for NNT and NNH based upon Wilson score intervals. 33
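The definitions above can be sketched as follows. This is a hedged Python illustration, not the authors' SAS code: the function names are hypothetical, the inputs are the rounded hour 60 percentages reported in the Results (so the RR values will differ slightly from those computed on exact counts), and rounding the NNT to the nearest integer is an assumption.

```python
def relative_risk(p_tx: float, p_pbo: float) -> float:
    """RR = Ptx / Ppbo."""
    return p_tx / p_pbo


def number_needed_to_treat(p_tx: float, p_pbo: float) -> float:
    """NNT = 1 / (Ptx - Ppbo); undefined when the two proportions are equal."""
    return 1.0 / (p_tx - p_pbo)


def sustained(flags_by_visit: list) -> bool:
    """Sustained: classified as a responder (or remitter) at every visit."""
    return all(flags_by_visit)


# Rounded hour 60 proportions (BRX, PBO) as reported in the Results.
hour60 = {
    "HAMD-17 response": (0.74, 0.56),
    "CGI-I response": (0.81, 0.54),
    "HAMD-17 remission": (0.50, 0.26),
}

for outcome, (p_brx, p_pbo) in hour60.items():
    rr = relative_risk(p_brx, p_pbo)
    nnt = number_needed_to_treat(p_brx, p_pbo)
    print(f"{outcome}: RR about {rr:.2f}, NNT about {round(nnt)}")

# Hypothetical patient: responder at hour 60 and day 7 but not at day 30.
print(sustained([True, True, False]))  # False
```

A smaller NNT indicates a larger absolute benefit: an NNT of 4 means that, on average, treating four patients with BRX rather than PBO yields one additional responder.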
Due to the exploratory nature of the analysis, all statistical tests were two-sided and p-values ≤0.05 were considered significant. Analyses were post hoc in nature and were not adjusted for multiplicity. All analyses were performed using SAS® software Version 9.4.
Results
The combined efficacy dataset included 209 patients: 102 treated with BRX and 107 treated with PBO. Demographics and baseline characteristics were generally well-balanced across treatment groups (Table 1). The mean age in each group was 27 years and the mean HAMD-17 baseline score (preinfusion) was 25. Of the 209 patients included, 126 (60%) were white, 77 (37%) were black or African American, and 38 (18%) were Hispanic or Latino. Across both groups, a higher proportion of patients had onset of PPD within 4 weeks postpartum than during the third trimester. A similar proportion of patients were taking antidepressants at baseline in both groups.
Table 1. Baseline Demographics and Characteristics for Patients Treated with Brexanolone or Placebo
Weight, height, and body mass index data were measured at screening.
Assessed before injection on day 1.
BRX, brexanolone 90 μg/kg/h injection; HAMD-17, 17-item Hamilton Rating Scale for Depression; PBO, placebo injection; PPD, postpartum depression; SD, standard deviation.
PPD-specific meaningful patient level change
The distribution-based MID estimates based on ½ SD at baseline and 1 SEM were 1.8 and 2.4, respectively, with an average MID of 2.1. The HAMD-17 ICC used to evaluate the SEM was r = 0.55. Significant least-squares mean differences in HAMD-17 change scores between BRX and PBO at hour 24 (−3.0 [95% CI: −4.8 to −1.2], p = 0.0012), hour 60 (−4.1 [95% CI: −6.0 to −2.3], p < 0.0001), and day 30 (−2.6 [95% CI: −4.7 to −0.4], p = 0.0213) have previously been reported. 23 Figure 1 applies the ½ SD at baseline and 1 SEM MID estimates to the combined BRX clinical trial data. The HAMD-17 group differences between BRX and PBO remain meaningful when applying estimates for the ½ SD MID, mean MID, and 1 SEM MID to the mean differences from hour 24 onward. The 95% CI estimates for hour 60 are also greater than the ½ SD MID and mean MID estimates.

Figure 1. Application of ½ SD and 1 SEM minimal important difference estimates to BRX versus PBO clinical trial data. The HAMD-17 group differences between BRX and PBO remain meaningful when applying estimates for the ½ SD estimate of minimal important difference and 1 SEM estimate of minimal important difference to the mean differences from hour 24 onward. LS, least-squares; BRX, brexanolone injection 90 μg/kg/h; HAMD-17, 17-item Hamilton Rating Scale for Depression; PBO, placebo; SD, standard deviation; SEM, standard error of measurement.
The patient-level HAMD-17 meaningful change estimates for minimal, moderate, and large change were −9, −15, and −20, respectively. These were based on the CGI-I mean change scores at hour 60 and day 30 for those classified as minimally improved (mean [M] hour 60 = −9.65; M day 30 = −8.73), much improved (M hour 60 = −15.31; M day 30 = −15.56), or very much improved (M hour 60 = −20.12; M day 30 = −20.89), respectively. The highest change score reported for patients showing no change on the CGI-I was −4. As the MCT estimates all exceeded both distribution-based MID estimates and the highest level of improvement observed in patients considered stable, these were considered valid and applied to the hour 60 and day 30 data.
Figure 2 applies patient-level meaningful change values to hour 60 and day 30 and demonstrates that compared to PBO, BRX had a consistently higher level of response with a significantly greater proportion of patients demonstrating minimal (87% vs. 68%, Δ19%, p < 0.01), moderate (67% vs. 42%, Δ25%, p < 0.001), and large meaningful change (30% vs. 15%, Δ15%, p < 0.05) at hour 60, and large meaningful change at day 30 (41% vs. 26%, Δ15%, p < 0.05).
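Applying the patient-level thresholds amounts to checking each patient's change from baseline against the three cut points. The Python sketch below illustrates this with hypothetical change scores. It assumes that the levels are cumulative (a patient reaching the large threshold also counts toward the minimal and moderate levels), which is consistent with the decreasing proportions reported above but is not stated explicitly in the source.

```python
# PPD-specific MCTs reported above (change from baseline; negative = improvement).
MCT = {"minimal": -9, "moderate": -15, "large": -20}


def levels_achieved(change: int) -> list:
    """Return every level whose threshold the change score meets or exceeds."""
    return [level for level, threshold in MCT.items() if change <= threshold]


# Hypothetical change-from-baseline scores for four patients.
changes = [-3, -9, -16, -22]

for change in changes:
    print(change, levels_achieved(change))

# Proportion of this hypothetical sample reaching each level.
for level in MCT:
    share = sum(level in levels_achieved(c) for c in changes) / len(changes)
    print(f"{level}: {share:.0%}")
```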

Figure 2. HAMD-17 patient-level meaningful change from baseline at hour 60
Response and remission: RR, NNT, and NNH
There were significantly higher proportions of HAMD-17 (74% vs. 56%, p < 0.01) and CGI-I (81% vs. 54%, p < 0.001) responders, and HAMD-17 remitters (50% vs. 26%, p < 0.001), in the BRX arm compared with PBO at hour 60 (Table 2). The proportion of sustained HAMD-17 remitters and CGI-I responders from hour 60 through day 30 was also significantly higher in the BRX arm compared with PBO (Table 2). The likelihood of being a responder or remitter, as measured by RR, was significantly higher in the BRX arm relative to PBO for both HAMD-17 and CGI-I responders and HAMD-17 remitters at hour 60, and for sustained CGI-I responders and HAMD-17 remitters at day 30. The proportion of sustained HAMD-17 responders was numerically, but not statistically significantly, higher at day 30 (p = 0.065, Table 3). As shown in Table 3, the RR of response for BRX relative to PBO was 1.34–1.50 at hour 60 and 1.37–1.72 at day 30. Depending on the scale used, the NNT for BRX ranged from 4 to 6 at hour 60 and from 4 to 8 for sustained outcomes through day 30 (Table 3). The proportion of subjects who discontinued the studies due to an AE was 2/102 (2%) in the BRX arm and 1/107 (0.9%) in the PBO arm, with a nonsignificant NNH of 97 (95% CI: 17 to −30).
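The NNH and its interval can be reproduced from the discontinuation counts given above. The Python sketch below is an illustration under a stated assumption: the Methods say only that the CIs were based on Wilson score intervals, and combining the two arms' Wilson intervals with Newcombe's hybrid score method for a difference in proportions is this example's assumption, not a documented detail of the original SAS analysis. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(events: int, n: int, z: float) -> tuple:
    """Wilson score interval for a single proportion."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def risk_difference_interval(e1, n1, e2, n2, level=0.95):
    """Risk difference with a Newcombe hybrid score interval (assumed method)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p1, p2 = e1 / n1, e2 / n2
    l1, u1 = wilson_interval(e1, n1, z)
    l2, u2 = wilson_interval(e2, n2, z)
    diff = p1 - p2
    lower = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return diff, lower, upper


# Discontinuations due to an AE: 2/102 on BRX, 1/107 on PBO.
diff, lower, upper = risk_difference_interval(2, 102, 1, 107)

# NNH is the reciprocal of the risk difference; the CI limits are the
# reciprocals of the risk-difference limits.
print(round(1 / diff), round(1 / upper), round(1 / lower))  # 97 17 -30
```

The negative limit arises because the interval for the risk difference spans zero: the NNH interval runs from 17 (harm) through infinity to −30 (benefit), which is why the NNH is described as nonsignificant.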
Table 2. Proportion of Patients Achieving Response and Remission at Hour 60 and Sustained Response and Remission from Hour 60 to Day 30
HAMD-17 response: ≥50% reduction from baseline.
HAMD-17 remission: score ≤7.
CGI response: very much/much improved.
BRX, brexanolone 90 μg/kg/h injection; PBO, placebo injection; CGI-I, Clinical Global Impression of Improvement.
Table 3. Relative Risks and Number Needed to Treat for Brexanolone Versus Placebo at Hour 60 for Patients Achieving Response and Remission and Hour 60 to Day 30 for Sustained Response and Remission
HAMD-17 response: ≥50% reduction from baseline.
HAMD-17 remission: score ≤7.
CGI response: very much/much improved.
PBO: n = 106–107 at hour 60 and n = 104–106 at hour 60 to day 30; BRX: n = 98 at hour 60; n = 95–101 at hour 60 to day 30.
CI, confidence interval; NNT, number needed to treat; RR, relative risk.
Discussion
Clinicians, patients, and health care decision-makers (e.g., payers, policy makers) evaluating treatment options for PPD need a clear understanding of possible treatment options and their benefit profiles. Management of patients with PPD generally follows a stepped-care approach. Patients with mild symptoms are treated through low-intensity interventions (e.g., group therapy). However, for patients with symptoms that do not respond, are more severe, or present acutely, pharmacologic interventions, either alone or adjunctive to low-intensity interventions, are typically offered. 39 Before the approval of BRX, antidepressants, such as selective serotonin reuptake inhibitors (SSRIs), were the primary pharmacologic treatment option available for patients with PPD despite not being indicated specifically for PPD. Although there are many published RCTs and analyses evaluating SSRIs for the treatment of patients with MDE, 40,41 the paucity of evidence within PPD makes it difficult to evaluate these pharmacological treatment options in the PPD population. 42 Barriers to effective treatment with SSRIs include a delay of 6–12 weeks to achieve optimal improvement, frequent subtherapeutic dosing, poor patient adherence due to side effects, withdrawal effects experienced by the majority of patients, and inadequate clinician follow-up. 43–49 While the rapidity of effect, hypothesized mechanism of action, and modality of BRX may allow it to circumvent many of these barriers, sizeable obstacles remain: its intravenous administration, the risk of excessive sedation and loss of consciousness that necessitates a REMS, and the evolving reimbursement landscape surrounding its administration. These obstacles will likely initially confine its real-world placement in the treatment cascade for PPD primarily to patients requiring rapid resolution of symptoms or those who have not responded to other treatments.
To our knowledge, this is the first article to propose PPD-specific estimates of HAMD-17 meaningful differences and patient-level meaningful change. While the primary objective of our analysis was to estimate and apply PPD-specific MCTs, the distribution-based analysis also provides an estimate of PPD-specific meaningful group difference (MID). Two different distribution-based approaches were taken: the ½ SD (MID 1.8) and 1 SEM (MID 2.4). However, given the instability of the sample in an active treatment setting, as illustrated by the relatively low HAMD-17 ICC of 0.55, the ½ SD MID of 1.8 may represent the best estimate of a HAMD-17 MID in a moderate to severe PPD population. The previously reported significant least-squares mean differences between BRX and PBO are meaningful when all MID estimates are applied to the means, as well as when the ½ SD MID and mean MID are applied to the 95% CI estimates for hour 60. 23
The MID estimates reported here appear lower than other published MIDs associated with the HAMD-17 (MID = 11). 31,32 However, previous estimates were calculated in MDE populations using CGI-I anchor-based methods for patient-level meaningful change and are thus inappropriate for interpreting group differences, being instead more comparable to the reported MCT estimates (no meaningful change ≤4; minimal change −9; moderate −15; large −20). In this context, the previously reported values are reasonably well-aligned with those reported here, but the PPD-specific MCTs provide more granularity across different levels of response.
Once these thresholds were applied, a significantly higher proportion of patients demonstrated minimal, moderate, and large levels of meaningful improvement with BRX relative to PBO as rapidly as 60 hours after the start of treatment. In addition, BRX provided a significantly higher proportion of patients with a large meaningful change (a reduction of ≥20 points) at day 30. All these estimates provide valuable additions to the interpretation of HAMD-17 (symptom) improvement in PPD, permitting population-specific exploration of HAMD-17 data with greater granularity regarding levels of improvement than the universal response and remission definitions.
In line with previously published findings, these post hoc analyses showed that BRX produced rapid (by hour 60) and high rates of response and remission (HAMD-17 and CGI-I response: 74% and 81%, respectively; HAMD-17 remission: 50%) that were sustained to day 30 (HAMD-17 and CGI-I response: 50% and 61%, respectively; HAMD-17 remission: 32%) compared with PBO. 23,37 Notably, this rapid effect was observed despite the variable and often high PBO response that is well documented in depression. 50–52
The RR, NNT, and NNH data presented here can be used by health care decision-makers to understand the magnitude and the relevance of BRX treatment effects. 34–36 Patients treated with BRX showed significantly higher probabilities of response and remission at 60 hours (RR: 1.34–1.50) and sustained response and remission at day 30 (RR: 1.37–1.72) compared with PBO. Single-digit NNTs for both the rapid (hour 60) and sustained (through day 30) outcomes support the robustness of the observed BRX effect in PPD. To contextualize these results beyond the current dataset, the NNT estimates observed for BRX at hour 60 and through day 30 were lower than, or equal to, the NNTs (6–10) reported by Citrome et al. 34 in an indirect comparison of 34 antidepressant clinical trials in MDE, with response measured after 6–10 weeks of treatment. In addition, the data presented herein demonstrate that BRX had low discontinuation rates due to an AE, with an associated NNH estimate of 97. The NNH for BRX was nonsignificant but higher than that of most trials included in the same indirect comparison (NNH: 7–43). 34
Although no head-to-head trials of BRX versus SSRIs for the treatment of PPD have been conducted, a recent evaluation of the efficacy of these treatments using match-adjusted indirect comparisons found that BRX demonstrated larger improvement in PPD symptoms compared to SSRIs. 53 A recent cost-effectiveness analysis also indicates that treatment of adults with PPD using BRX is cost-effective compared to SSRIs over an 11-year time horizon, particularly for patients with severe symptoms, based on a United States health care payer perspective. 54
There are limitations to this study. Because this analysis was conducted post hoc on clinical trial data, estimating MCT values was not an objective at study onset; the estimates reported here should therefore be validated on an external dataset, ideally with additional anchors, including patient-reported anchors. The MID values may be improved in a nonclinical trial sample, as a noninterventional study may provide a more accurate estimate of reliability (ICC), but the ½ SD value provides a good initial estimate to be further validated. The maximum follow-up of 30 days after the end of study treatment may be considered a limitation; however, this follow-up duration is similar to that used after primary endpoint assessment (from 4–24 weeks) in clinical trials of antidepressants in PPD. 42 Additional research should evaluate the real-world efficacy of BRX compared to SSRIs to provide the optimal inputs to inform clinical decision-making.
Conclusions
The results of these analyses provide estimates of meaningful group differences and patient-level change specific to PPD populations. Applying these estimates to BRX, the first treatment indicated specifically for women with PPD, using combined trial data indicates that both group- and patient-level improvements for BRX-treated patients relative to PBO-treated patients were clinically meaningful as early as 60 hours and were sustained to day 30. These estimates may also be used by future researchers and health care decision-makers in application to other PPD treatments. Patients treated with BRX demonstrated a higher likelihood of rapid and sustained HAMD-17 and CGI-I improvements. The data reported herein further support the positive clinical profile of BRX and provide valuable insights for health care decision-makers evaluating its placement within the PPD treatment cascade.
Footnotes
Author Disclosure Statement
Drs. Gerbasi, Bonthapally, Kanes, and Eldar-Lissai are employees of Sage Therapeutics, Inc. and own stock or stock options in the company. Dr. Hodgkins was an employee of Sage Therapeutics, Inc. at the time the study was conducted and owns stock/stock options in the company. Dr. Meltzer-Brody reports personal fees from MedScape and grants from Sage Therapeutics, Inc., awarded to the University of North Carolina (Chapel Hill, NC, USA) during the conduct of the brexanolone injection clinical trials and grants from Janssen, PCORI, and the NIH outside the submitted work. Ms. Acaster and Dr. Fridman are employees of Acaster Lloyd Consulting, Ltd. and AMF Consulting, respectively, which were paid by Sage Therapeutics to conduct the research reported in this manuscript.
Funding Information
This study was funded by Sage Therapeutics, Inc., Cambridge, MA.
