Abstract
Background:
Successful treatment of major depressive disorder (MDD) can be challenging, and failures ("treatment-resistant depression" [TRD]) are frequent. Steps to address TRD include increasing antidepressant dose, combining antidepressants, adding adjunctive agents, or using nonpharmacological treatments. Their relative efficacy and tolerability remain inadequately tested. In particular, the value and safety of increasingly employed second-generation antipsychotics (SGAs) and the newer agent esketamine, compared to lithium, as antidepressant adjuncts remain unclear.
Methods:
We reviewed randomized, placebo-controlled trials and used random-effects meta-analysis to compare odds ratio (OR) versus placebo, as well as numbers-needed-to-treat (NNT) and to-harm (NNH), for adding SGAs, esketamine, or lithium to antidepressants for major depressive episodes.
Results:
Analyses involved 49 drug-placebo pairs. By NNT, SGAs were more effective than placebo (NNT = 11 [CI: 9–15]); esketamine (7 [5–10]) and lithium (5 [4–10]) were even more effective. Individually, aripiprazole, olanzapine+fluoxetine, risperidone, and ziprasidone all were more effective (all NNT < 10) than quetiapine (NNT = 13), brexpiprazole (16), or cariprazine (16), with overlapping NNT CIs. Risk of adverse effects versus placebo, as NNH for the most frequently reported effects, was 5 [4–6] for SGAs overall (highest risk with quetiapine, NNH = 3; lowest with brexpiprazole, NNH = 19), 5 [4–6] for esketamine, and 9 [5–106] for lithium. The risk/benefit ratio (NNH/NNT) was 1.80 [1.25–10.60] for lithium and much less favorable for esketamine (0.71 [0.60–0.80]) or SGAs (0.45 [0.17–0.77]).
Conclusions:
Several modern antipsychotics and esketamine appeared to be useful adjuncts to antidepressants for acute major depressive episodes, but lithium was somewhat more effective and better tolerated.
Limitations:
Most trials of adding lithium involved older, mainly tricyclic, antidepressants, and the dosing of adjunctive treatments was not optimized.
Introduction
Major depressive disorder (MDD) is a highly prevalent, episodic, or sometimes chronic illness associated with potentially severe functional impairment, co-occurring psychiatric and general medical morbidity, and excess mortality from suicide as well as from general medical conditions (Baldessarini and Tondo, 2020; Celano et al., 2018; Seligman and Nemeroff, 2015). The lifetime prevalence of MDD is approximately 4%–14% of the general population and mixed features are present in a quarter of patients with MDD (Ferrari et al., 2013; Tondo et al., 2018; Vázquez et al., 2018; Zimmermann et al., 2009). Major mood disorders generally produce high illness burdens, with substantial risks of sustained disability (Ferrari et al., 2013; World Health Organization, 2012).
Modern antidepressants, with or without psychotherapy, are the leading form of treatment provided to MDD patients (Baldessarini, 2013; Bauer et al., 2013; Kennedy et al., 2016). However, response rates with commonly employed antidepressants for acute episodes of major depression are moderate (40%–60%), and remission rates are even lower (30%–45%) (Baldessarini, 2013; Rush et al., 2006; Yuan et al., 2020). Moreover, long-term levels of treatment-unresponsive depression in MDD and bipolar disorder (BD) are surprisingly high and typically involve more than 40% of the time in follow-up, despite treatment by community standards (Forte et al., 2015). The limited efficacy of antidepressant therapy, with correspondingly prevalent "treatment-resistant" depression (TRD), encourages clinical trials of alternatives, including increased doses of antidepressants, changing to different antidepressants, adding other drugs, or use of nonpharmacological (psychological and physical) treatments (Bauer et al., 2013; Davies et al., 2019; MacQueen et al., 2017; Milev et al., 2016; Parikh et al., 2009). Particularly striking is the relatively infrequent use of lithium in acute unipolar depression, despite its prolonged clinical acceptance and extensive support for use in nonbipolar major depression, particularly as an adjunct to antidepressants, in addition to representing a fundamental treatment for BD (Bauer et al., 2013; Kennedy et al., 2016; Undurraga et al., 2019). Also, addition of second-generation antipsychotic drugs (SGAs) to antidepressants has been increasing (Mulder et al., 2018), and esketamine is emerging as a novel, rapidly acting agent that can be added safely to antidepressants (Bahji et al., 2020, 2021).
Despite extensive clinical experience in the use of adjunctive treatments with antidepressants, greater clarity is required regarding the relative efficacy and tolerability of specific drug combinations and their doses for major depression. This need led us to evaluate trials testing the short-term efficacy and tolerability of a currently prevalent option, SGAs, and to compare them with adjunctive esketamine as another innovative option and with lithium as one of the oldest such adjunctive options (Haddad et al., 2015; Undurraga et al., 2019). Our assessments and comparisons are based on meta-analytic estimates of odds ratio (OR) as well as number-needed-to-treat (NNT) to indicate efficacy, and number-needed-to-harm (NNH) arising from commonly clinically encountered adverse effects. NNT and NNH are convenient and clinically readily interpretable measures that also can express relative risk-benefit relationships as the NNH/NNT ratio (Citrome and Ketter, 2013).
Methods
Aims and eligibility criteria
We carried out a systematic review and meta-analysis and prepared this report adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (Liberati et al., 2009). We limited inclusion to peer-reviewed reports of randomized, nominally double-blinded, short-term (⩽12 weeks), placebo-controlled trials of selected agents of interest, including SGAs (those encountered were: aripiprazole, brexpiprazole, cariprazine, olanzapine+fluoxetine, risperidone, quetiapine, or ziprasidone; all inhibitors of serotonin 5-HT2 and dopamine D2 receptors), or intranasal esketamine, for comparison with lithium (usually as the carbonate), all combined with standard antidepressants to treat mainly unipolar major depressive episodes in adults diagnosed by modern criteria. We excluded reports involving special populations, such as juveniles, the elderly, or persons with major general medical or neurological illnesses.
Information sources and search
We systematically searched research literature in three electronic databases (PubMed, Google Scholar, and Medline) through October 2020 with combinations of the search terms: “major depression,” “controlled,” “randomized,” “clinical trial,” and “efficacy” (Appendix 1). We also examined previously published, partially relevant, systematic reviews (Bahji et al., 2020, 2021; Nelson and Papakostas, 2009; Ruberto et al., 2020; Spielmans et al., 2013; Undurraga et al., 2019) and references identified in them.
Of 4631 initially identified potential studies based on review of titles and abstracts, 124 required more detailed examination by two coauthors (GHV and RJB), resulting in 43 trials (with 49 drug-placebo pairs) meeting study inclusion criteria (Appendix Figure A1).
Summary measures
To combine the results of studies, we used a random-effects meta-analysis to pool effect sizes to obtain OR with 95% confidence intervals (CI), based on previously described methods (Bahji et al., 2020). We measured heterogeneity using the I2 statistic (Higgins and Thompson, 2002). Most identified studies defined “response” as at least a 50% reduction in scores with standardized depressive symptom rating-scales (commonly the Hamilton Depression Rating Scale [HDRS17] or Montgomery-Åsberg Depression Rating Scale [MADRS]) (Hamilton, 1960; Montgomery and Åsberg, 1979). We summarized response rates using pooled ORs and their CIs.
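As a minimal sketch of how such pooling can work, the following Python code combines log odds ratios with DerSimonian-Laird random-effects weights and reports I2. The choice of the DerSimonian-Laird estimator is an assumption (the text specifies only "random-effects"), the function names are hypothetical, and the four input rows are taken from the appendix table of lithium trials purely for illustration; the output is not expected to reproduce the pooled values reported in this article.

```python
import math

def log_or_and_var(events_t, n_t, events_c, n_c):
    """Log odds ratio and its variance for one 2x2 table (assumes no zero cells)."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def dersimonian_laird(trials):
    """Pool log ORs across trials (assumed DerSimonian-Laird estimator)."""
    effects = [log_or_and_var(*t) for t in trials]
    y = [e[0] for e in effects]          # per-trial log OR
    v = [e[1] for e in effects]          # per-trial variance
    w = [1 / vi for vi in v]             # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(trials) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)        # between-trial variance
    w_re = [1 / (vi + tau2) for vi in v]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "i2": i2,
        "tau2": tau2,
    }

# (responders_drug, n_drug, responders_placebo, n_placebo); illustrative subset
trials = [
    (4, 15, 3, 15),    # Browne et al. (1990)
    (9, 17, 3, 16),    # Joffe et al. (1993)
    (15, 29, 10, 32),  # Katona et al. (1995)
    (42, 74, 34, 75),  # Januel et al. (2003)
]
result = dersimonian_laird(trials)
print(result)
```

When the between-trial variance estimate is zero, the random-effects weights reduce to the inverse-variance weights of a fixed-effect model.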
We also computed initial depression severity ratings as the percentage of maximum possible scale scores (52 with HDRS17, 60 with MADRS), and tested for their similarity between subjects randomized to active treatments versus placebo, using paired t-tests.
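The normalization described above can be sketched as follows. The scale maxima come from the text; the function name and the example scores are hypothetical.

```python
# Maximum attainable scores as stated in the text.
SCALE_MAX = {"HDRS17": 52, "MADRS": 60}

def percent_of_max(score, scale):
    """Express a raw rating as a percentage of the scale's maximum score."""
    return 100.0 * score / SCALE_MAX[scale]

# Hypothetical baseline scores, for illustration only.
hdrs_pct = percent_of_max(26, "HDRS17")
madrs_pct = percent_of_max(30, "MADRS")
print(hdrs_pct, madrs_pct)
```

Expressing both scales on a common 0–100 range is what allows baseline severity to be compared across trials that used different instruments.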
In addition to response rates, clinical efficacy of individual agents and drug types was expressed semi-quantitatively as the estimated “NNT” (with CI), computed as the reciprocal of meta-analytically pooled differences in proportions of patients responding to an active drug versus placebo. NNT indicates the approximate number of patients treated to encounter a patient with superior benefit with a test treatment over a control condition (smaller NNT demonstrating greater efficacy), typically based on response rates with a drug versus with placebo.
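A minimal sketch of the NNT calculation follows, using crude pooled counts rather than the meta-analytically pooled differences used in the analyses. The Wald confidence interval, the convention of rounding up to the next whole patient, and the function name are assumptions; the input counts are the lithium and placebo totals from the appendix table.

```python
import math

def nnt_with_ci(resp_drug, n_drug, resp_pbo, n_pbo, z=1.96):
    """NNT as the reciprocal of the risk difference, with a Wald-based CI.

    Bounds are rounded up to whole patients (an assumed convention).
    If the CI of the risk difference includes zero, the upper NNT bound
    is reported as infinity.
    """
    p1 = resp_drug / n_drug
    p0 = resp_pbo / n_pbo
    rd = p1 - p0
    se = math.sqrt(p1 * (1 - p1) / n_drug + p0 * (1 - p0) / n_pbo)
    lo, hi = rd - z * se, rd + z * se
    nnt = math.ceil(1 / rd) if rd > 0 else math.inf
    nnt_lo = math.ceil(1 / hi) if hi > 0 else math.inf
    nnt_hi = math.ceil(1 / lo) if lo > 0 else math.inf
    return nnt, (nnt_lo, nnt_hi)

# Totals from the appendix table of lithium trials: 143/279 vs. 111/364.
nnt, ci = nnt_with_ci(143, 279, 111, 364)
print(nnt, ci)

# A small hypothetical trial whose risk-difference CI crosses zero.
_, wide_ci = nnt_with_ci(7, 20, 5, 20)
print(wide_ci)
```

Because this uses crude totals and a Wald interval, its interval need not match the meta-analytic CIs reported in the Results.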
To assess the acceptability and tolerability of treatments, we used NNH, which is the reciprocal of differences in proportions of patients reporting a common adverse effect with drug versus placebo; larger NNH values indicate greater tolerability (Andrade, 2015). The most prevalent adverse effects with antipsychotics were excessive sedation or somnolence, weight gain, extrapyramidal neurological symptoms, and akathisia; with intranasal esketamine the most commonly noted adverse effect was dizziness; and with lithium, tremor. NNT and NNH values for individual drugs and drug types were computed by random-effects meta-analysis and reported with 95% CI. They were compared statistically by contingency tables (χ2) based on pooled responder rates and pooled rates of experiencing specified adverse effects.
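The contingency-table comparison of pooled rates can be sketched as a Pearson χ2 test on a 2×2 table. Omitting a continuity correction is an assumption, as the text does not specify one, and the function name is hypothetical. Because pooled counts for the drug-versus-drug comparisons are not given here, the example uses the lithium and placebo responder totals from the appendix table only to show the mechanics.

```python
import math

def chi2_2x2(events_a, n_a, events_b, n_b):
    """Pearson chi-square (1 df, no continuity correction) for two proportions."""
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 df, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Illustrative input: responders among lithium (143/279) vs. placebo (111/364).
chi2, p = chi2_2x2(143, 279, 111, 364)
print(round(chi2, 2), p)
```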
Finally, we computed the likelihood to be harmed or helped (LHH) as the ratio of NNH to NNT (Citrome and Ketter, 2013). LHH reflects the balance between harm and benefits (risk/benefit ratio) and is reported for each drug and drug type for which data were available. Other measures are reported as means with 95% CI. Statistical significance required two-tailed p < 0.05. Analyses employed commercial software: Statview.5 (SAS Institute, Cary, NC, USA) for spreadsheets, and R Studio (RStudio PBC, Boston, MA, USA) and Stata.13 (StataCorp, College Station, TX, USA) for analyses.
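As a brief illustration of the LHH ratio, the sketch below divides NNH by NNT using the rounded point estimates given in the Results. The function and variable names are hypothetical, and ratios of rounded values need not equal LHH values derived from unrounded pooled estimates.

```python
def lhh(nnh, nnt):
    """Likelihood to be helped or harmed: NNH divided by NNT (larger is more favorable)."""
    return nnh / nnt

# (NNH, NNT) rounded point estimates as reported in the Results.
reported = {"lithium": (9, 5), "esketamine": (5, 7), "SGAs": (5, 11)}
ratios = {name: round(lhh(nnh, nnt), 2) for name, (nnh, nnt) in reported.items()}
print(ratios)
```

An LHH above 1 means that fewer patients need to be treated to gain one additional response than to incur one additional adverse effect.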
Results
Overall findings
The PRISMA-guided process of selecting reports for inclusion is summarized in Appendix Figure A1. Of the 49 included trials (from 43 reports), four (with SGAs) involved more than one drug-arm, yielding 28 trials for SGAs, 14 for lithium carbonate, and 7 for intranasal esketamine, for a total of 49 drug-placebo pairs.
A total of 8104 subjects were included in the 28 add-on SGA trials: 4030 randomized to combination with an SGA, and 4074 (3008 unique participants owing to repeated use of some controls) with added placebo. Trial-duration averaged 7.07 (6.49–7.65) weeks, subject-age averaged 44.7 (44.3–45.1) years, and 67.2% (67.1–67.3) of participants were women (Appendix Table A1). Mean baseline depression severity ratings, expressed as percentage of maximum attainable score, ranked: 51.5 (44.7–47.4) with lithium, 46.0 (44.7–47.4) with SGAs, and 37.6 (36.3–47.4) with esketamine. These initial scores differ highly significantly (overall t = 3.68, p < 0.0001), and each Scheffé post-hoc pairwise comparison also differs significantly (lithium vs. esketamine, p < 0.0001; esketamine vs. SGA, p = 0.001; lithium vs. SGA, p = 0.04).
In the seven trials for intranasal esketamine as an add-on to antidepressant treatment, there were 1287 subjects: 711 randomized to added esketamine and 576 to added placebo. Trial duration was 4 weeks for all trials of esketamine, subject-age averaged 46.0 (40.1–46.0) years, and 63.0% (55.5–76.3) of participants were women (Appendix Table A2).
Of the 14 trials for lithium carbonate as an add-on to antidepressant treatment, there were 640 subjects: 292 randomized to added lithium and 348 to added placebo. Trial duration averaged 3.4 (2.0–4.8) weeks, subject-age averaged 43.7 (40.0–47.0) years, and 63.0% (55.5–76.3) of participants were women (Appendix Table A3).
Meta-analyses
Random-effects meta-analysis of trials of adding SGAs versus placebo to antidepressants yielded highly significant superiority of SGAs overall (OR = 1.59 [CI: 1.44–1.75]; z = 9.16, p < 0.0001; Figure 1(a)). The efficacy of intranasal esketamine was intermediate between SGAs and lithium (OR = 1.94 [1.52–2.46]; z = 4.98, p < 0.0001; Figure 1(b)), and the efficacy of lithium was highest (OR = 2.22 [1.44–3.43]; z = 3.59, p = 0.0003; Figure 1(c)).

Figure 1. Forest plots of random-effects meta-analyses for clinical trials testing the efficacy of supplementing antidepressants with active agents or placebo for major depression: (a) second-generation antipsychotics (SGAs, 28 trials), (b) intranasal esketamine (7 trials), or (c) lithium carbonate (13 trials). SGAs tested were: APZ, aripiprazole; BRX, brexpiprazole; CAR, cariprazine; OFC, olanzapine+fluoxetine combination; QTP, quetiapine; RSP, risperidone; ZPS, ziprasidone. Adding each of the three types of active treatment was much more effective than adding placebo: (a) SGAs: pooled OR = 1.59 [CI: 1.44–1.75]; z-score = 9.16, p < 0.0001; (b) esketamine: pooled OR = 1.85 [1.45–2.35]; z-score = 4.98, p < 0.0001; (c) lithium: pooled OR = 2.12 [1.46–3.09]; z-score = 3.92, p < 0.0001. Heterogeneity ratings (I2) all were <1.0%.
NNT
NNT values for response among individual drugs or types (Table 1) did not differ significantly (overlapping CIs), but tended to be lower (more favorable) with lithium (NNT = 5 [4–10]) than with esketamine (NNT = 7 [5–10]) or SGAs overall (11 [9–15]). NNT among particular SGAs ranked: risperidone (6 [3–13]) = olanzapine/fluoxetine (which includes an antidepressant; 6 [4–19]) ⩽ ziprasidone (7 [3–∞]) ⩽ aripiprazole (9 [5–24]) ⩽ cariprazine (16 [8–52]) = brexpiprazole (16 [10–34]; Table 1). Based on responder rates, lithium was significantly superior to SGAs (χ2 = 19.6, p < 0.0001), as was esketamine (χ2 = 30.9, p < 0.0001), whereas lithium and esketamine did not differ significantly (χ2 = 0.340, p = 0.561).
Table 1. Efficacy of lithium or second-generation antipsychotics (SGAs) versus placebo (PBO) added to antidepressants for major depression.
Data are ranked in ascending order of number-needed-to-treat (NNT).
NNH
NNH for lithium was highest (lowest risk) at 9 [5–106], and greater than with intranasal esketamine (5 [4–6]) or all SGAs pooled (5 [4–6]). For individual SGAs, NNH ranged from 19 with brexpiprazole to 3 with quetiapine (Table 2). Based on adverse event rates, lithium was safer than either SGAs or esketamine (χ2 = 1567 and 158, respectively; both p ⩽ 0.0001), and risk was lower with esketamine than with SGAs (χ2 = 13.0, p = 0.0003).
Table 2. Relative risk of adverse events associated with second-generation antipsychotics (SGAs), esketamine, or lithium versus placebo (PBO) added to antidepressants for major depression.
Data are ranked by descending NNH.
CI: confidence interval; EPS: extrapyramidal signs or symptoms; LHH: likelihood to be harmed or helped, or risk/benefit ratio (NNH/NNT); NNH: number-needed-to-harm; NNT: number-needed-to-treat; PBO: placebo; SGA: second-generation antipsychotic.
In addition, the LHH or risk/benefit ratio (NNH/NNT) was more favorable (larger) with lithium (LHH = 1.50 [1.08–3.34]) than with intranasal esketamine (LHH = 0.71 [0.60–0.80]) or SGAs combined (LHH = 0.45 [0.17–0.77]; Table 2), with nonoverlapping CIs.
Discussion
This systematic review compared efficacy (as OR vs. placebo in random-effects meta-analyses and as NNT) and tolerability (as NNH) and their risk/benefit ratio (NNH/NNT, or LHH) in placebo-controlled, randomized, add-on trials of SGAs, intranasal esketamine, or lithium to supplement standard antidepressants. Literature searching yielded 43 peer-reviewed reports meeting study criteria, with 49 drug-placebo pairs (Figure 1).
SGAs overall were more effective than placebo (OR = 1.59 [1.44–1.75]; NNT = 11 [9–15]), but esketamine (OR = 1.96 [1.55–2.50]; NNT = 7 [5–10]) and lithium (OR = 2.04 [1.42–2.93]; NNT = 5 [4–10]) were even more effective. Individually, aripiprazole, olanzapine+fluoxetine, risperidone, and ziprasidone were more effective than placebo in attaining an antidepressant response (all NNT < 10), and more so than quetiapine (NNT = 13), brexpiprazole (NNT = 16), or cariprazine (NNT = 16). However, the CIs of NNTs for individual added SGA treatments overlapped.
Apparent risk of adverse effects, as NNH (higher value with lower risk) for most frequently reported effects among SGAs versus placebo, was highest with quetiapine (NNH = 3) and lowest with brexpiprazole (NNH = 19). In addition, the NNH was lower (higher risk) with intranasal esketamine (NNH = 5 [4–6]) and all SGAs pooled (5 [4–6]) than with lithium (9 [5–106]). The benefit/risk ratio (NNH/NNT, or LHH; Table 2) was 1.50 [1.08–3.34] for lithium and much lower, or less favorable, with intranasal esketamine (0.71 [0.60–0.80]) and all SGAs (0.45 [0.17–0.77]).
These findings support the efficacy of SGAs, intranasal esketamine, and lithium over placebo in supplementing antidepressant treatment of acute major depression in adults. However, the trials included are heterogeneous, and computed values of NNT for individual SGAs had overlapping CIs, limiting their potential value in guiding recommendations regarding which drug should be used as a first choice. Moreover, initial depression severity ratings normalized as the percentage of maximum scale scores differed significantly and ranked: lithium (51.1%) > SGAs (46.0%) > esketamine (37.6%). The same order was found regarding efficacy as OR in meta-analyses (Figure 1) and as NNT (Table 1). This ranking may suggest preferential efficacy with higher initial depression severity, favoring lithium, or it may be an artifact of the greater contrast between initial and end-point ratings when initial depression ratings are higher.
We also found similar results regarding tolerability as NNH, ranking: lithium ⩾ intranasal esketamine ⩾ SGAs (Table 2). Tolerability and efficacy are both essential in deciding which treatment should be used, as adverse effects can reduce subjective well-being and treatment-adherence and adversely affect treatment outcomes (Solmi et al., 2017). Use of SGAs can lead to a range of adverse effects, including excessive sedation, akathisia, and risks of weight gain and adverse cardiometabolic effects with some SGAs (Baldessarini, 2013; Centorrino et al., 2012; Gierisch et al., 2014; Solmi et al., 2017). Use of intranasal esketamine at approved doses can lead to other adverse events, including dizziness, dissociation, headaches, paraesthesia, nausea, vomiting, and somnolence, with even potential risks of psychosis at higher doses (Bahji et al., 2021). Of note, mean duration of lithium trials was shorter than for esketamine and SGAs; this difference may imply a faster antidepressant action with lithium, and might also limit appearance of side effects.
Lithium is effective for treating affective disorders, with evidence of reduction of suicidal risk and mortality, but is underutilized, especially in MDD (Baldessarini, 2013; Undurraga et al., 2019). Concerns that limit the use of lithium include a narrow therapeutic index, with risks of intoxication at circulating concentrations only 2–3 times above therapeutic levels, as well as of adverse long-term effects on thyroid and renal function (Baldessarini, 2013). In addition, lithium salts have lacked commercial promotion as unpatentable minerals in competition with other treatments.
NNT is a convenient and clinically readily interpretable measure of therapeutic effect-size and may support comparisons of different treatments given under comparable conditions, but it has important limitations (Andrade, 2015; Mendes et al., 2017). In the present analyses, lithium yielded an NNT of 5 (4–10), indicating that approximately one additional patient would respond for every five treated by adding lithium rather than placebo to an antidepressant. Generally, small NNTs are preferable, although larger values (>10) may be acceptable if the outcome is the prevention of mortality or severe morbidity (Katsanos et al., 2015). Also, meaningful interpretation of NNT requires consideration of response rate: a relatively low response rate with an active treatment may be significantly superior to that with placebo, but not be clinically valuable. NNH and the risk/benefit ratio (NNH/NNT) are also subject to limitations, notably including the clinical significance of the adverse effect being considered.
Limitations
Most trials of adding lithium to antidepressants for major depressive episodes involved older, mainly tricyclic, antidepressants, and dosing of all reported adjunctive treatments was not optimized. We also found poor systematization of the adverse effect profile in some trials, especially involving lithium. Use of NNT and NNH to compare treatments is limited by the comparability of different trials and further limited by the rarity of desirable, head-to-head comparisons of different active treatments under identical conditions.
Conclusions
Based on meta-analyses to determine the OR and NNT, several modern drugs developed as antipsychotics as well as intranasal esketamine were effective as adjuncts to antidepressants for acute major depressive episodes, but lithium was somewhat more effective and better tolerated. The findings encourage clinical consideration of lithium as a particularly attractive adjunct in the treatment of major depression.
Footnotes
Appendix
Appendix Table A3. Characteristics of lithium versus placebo (PBO) add-on trials for major depressive episodes.
| Study | ADs | Age (years) | Females (%) | Lithium dose | Duration (weeks) | Subjects (N), lithium | Subjects (N), placebo | Responders, n (%), lithium | Responders, n (%), placebo | Response OR |
|---|---|---|---|---|---|---|---|---|---|---|
| Lingjærde et al. (1974) | TCAs | 49 | 78 | 0.8–1.3 mM | 6 | 20 | 25 | 8 (40.0) | 5 (20.0) | 2.67 |
| Heninger et al. (1983) | TCAs, MIA | 51 | 80 | 900–1200 mg/day (0.5–1.1 mM) | 2 | 8 | 17 | 5 (62.5) | 9 (52.9) | 1.48 |
| Zusky et al. (1988) | TCAs, MAOIs | 45 | 81 | 900 mg/day | 2 | 8 | 8 | 3 (37.5) | 2 (25.0) | 1.80 |
| Schöpf et al. (1989) | TCAs | 54 | 70 | 600–800 mg/day (0.6–0.8 mM) | 1 | 14 | 13 | 9 (64.3) | 0 (0.00) | 57.0 |
| Browne et al. (1990) | TCAs | 42 | 59 | 900 mg/day | 2 | 15 | 15 | 4 (26.7) | 3 (20.0) | 1.45 |
| Joffe et al. (1993) | TCAs | 37 | 55 | 900 mg/day | 2 | 17 | 16 | 9 (52.9) | 3 (18.8) | 4.88 |
| Stein and Bernadt (1993) | TCAs | 47 | 79 | 250 mg/day | 3 | 16 | 34 | 7 (43.8) | 7 (20.6) | 3.00 |
| Katona et al. (1995) | FLX, LFP | 40 | 57 | 800 mg/day (0.6–1 mM) | 6 | 29 | 32 | 15 (51.7) | 10 (31.3) | 2.36 |
| Baumann et al. (1996) | CTP | 41 | 71 | 800 mg/day (0.5–0.8 mM) | 1 | 10 | 32 | 6 (60.0) | 8 (25.0) | 4.50 |
| Bloch et al. (1997) | DMI | 47 | 55 | 0.7–1.0 mM | 5 | 16 | 15 | 9 (56.3) | 10 (66.7) | 1.50 |
| Cappiello et al. (1998) | DMI | 40 | 66 | 900 mg/day | 4 | 14 | 15 | 4 (28.6) | 0 (0.0) | 13.3 |
| Januel et al. (2003) | CMI | 44 | 62 | 750 mg/day | 2 | 74 | 75 | 42 (56.8) | 34 (45.3) | 1.58 |
| Nierenberg et al. (2003) | NRT | 38 | 46 | 900 mg/day | 6 | 18 | 17 | 2 (11.1) | 3 (17.6) | 0.58 |
| Saini et al. (2016) | IMI | 37 | 23 | 0.6–0.8 mM | 4 | 20 | 20 | 20 (100.0) | 17 (85.0) | 8.20 |
| Totals/averages [95%CI] | – | 43.7 [40.6–46.8] | 63.0 [53.8–72.2] | 0.6–1.3 mM | 3.29 [1.57–5.01] | 279 | 364 | 143/279 (51.3%) [45.2–57.3] | 111/364 (30.5%) [25.8–35.5] | 2.23 [1.53–3.24] |
OR was determined from random-effects meta-analysis. Of the 14 trials, three (21.4%) significantly favored Li over placebo independently.
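Several rows in this table have a zero cell (no responders, or no nonresponders, in one arm), for which a raw odds ratio is undefined. One common workaround, sketched below, adds 0.5 to every cell of such a table. That this correction was used here is an assumption, not something stated in the text, although it is consistent with at least the per-trial ORs listed for Cappiello et al. (1998) and Saini et al. (2016); the function name is hypothetical.

```python
def odds_ratio(resp_drug, n_drug, resp_pbo, n_pbo, correction=0.5):
    """Per-trial odds ratio; adds `correction` to all cells if any cell is zero."""
    a, b = resp_drug, n_drug - resp_drug
    c, d = resp_pbo, n_pbo - resp_pbo
    if 0 in (a, b, c, d):
        # Assumed continuity correction for tables with an empty cell.
        a, b, c, d = (x + correction for x in (a, b, c, d))
    return (a * d) / (b * c)

or_cappiello = odds_ratio(4, 14, 0, 15)   # zero responders on placebo
or_saini = odds_ratio(20, 20, 17, 20)     # zero nonresponders on lithium
or_joffe = odds_ratio(9, 17, 3, 16)       # no zero cell, no correction applied
print(round(or_cappiello, 1), round(or_saini, 1), round(or_joffe, 3))
```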
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Supported in part by an award from the Aretæus Foundation of Rome and Centro Bini Private Donors Research Fund (to LT), by ANID-PIA-ACT192064, ANID-FONDECYT 1180358, 1200601, Clínica Alemana de Santiago ID 863 (to JU), and by a grant from the Bruce J. Anderson Foundation and by the McLean Private Donors Research Fund (to RJB).
