Cognitive behavioural therapy (CBT) has been shown to be an effective treatment for depression and panic disorder in many randomized controlled trials [1, 2] and is recommended in evidence-based clinical practice guidelines as a first-line treatment for these disorders [3, 4]. However, many factors that may affect the efficacy of CBT have not been adequately investigated. Until they are, it is difficult to make recommendations about how CBT should be administered in clinical practice to achieve maximum efficacy. One pertinent example is whether the type and amount of training of the health professional administering the therapy influence efficacy. In Australia, incentives have been introduced by the government to encourage general practitioners to administer CBT after some additional training [5]. Although this move has the potential to make CBT more widely available in the publicly funded health care system, it is not known whether general practitioners are likely to achieve the same effectiveness as psychologists, for example.
There has also been debate about the suitability of CBT as a mono-therapy for severe depression, and American Psychiatric Association Clinical Practice Guidelines advise against it based on the results of one large randomized controlled trial [6, 7]. However, more recent Australian Guidelines do recommend CBT as a suitable first-line mono-therapy for severe uncomplicated depression [4], and this is supported by more recent analyses of large randomized controlled trials (RCTs) showing that CBT is as effective as (if not more effective than) antidepressant medication for severe depression [8]. Other issues of interest include the effect of different modes of CBT (e.g. group
Thus, we decided to conduct a meta-regression to investigate the effect of these factors plus others on the size of the response. Although meta-analyses of CBT for depression, panic and generalized anxiety disorder (GAD) have been conducted before [1, 2, 9], this is the first study to investigate a wide range of factors that may impact on its efficacy and explain the heterogeneity reported in previous meta-analyses, for example, Gloaguen
Method
The study aims are:
To use the technique of meta-analysis to determine the efficacy of CBT for depression, panic disorder and GAD; and
To determine the effect of various factors, such as the intensity and provider of CBT, on the efficacy of CBT.
Selection of studies
Existing meta-analyses of CBT for depression [1, 10–12], panic disorder [2, 3] and GAD [13] were used to identify appropriate studies. These were supplemented by additional searches of Medline and the Cochrane Collaboration Controlled Trials Register (up to November 2002). These were then selected for inclusion in the meta-regression if they met criteria relating to study type, participants, intervention and outcomes. Studies had to be RCTs with one of the following control groups: wait list (or no treatment), pill placebo or attention/psychological placebo. Study participants had to be 18 years and older and all have depression, panic disorder or GAD. The following diagnoses were considered valid: ‘major depression’ or ‘dysthymic disorder’ according to the Research Diagnostic Criteria, DSM-III or DSM-III-R criteria, with the exclusion of psychotic disorder and bipolar affective disorder; panic disorder with or without agoraphobia; and DSM-III-R or DSM-IV-defined GAD (DSM-III was considered a less strict definition). All trials had to be studies of CBT, or the behavioural (exposure) component alone or cognitive restructuring alone. Outcomes of interest included symptom, functioning and health-related quality of life measures, reported as continuous variables. Studies were excluded if means and standard deviations (or standard errors) were not reported, as these statistics are required to calculate the effect size. Disagreements between the two reviewers were resolved by discussion.
Extraction of data
Mean results from each treatment and control condition (some studies examined multiple conditions) were extracted for use in the later effect size calculations. Only results from continuous outcome measures that measured symptoms, functioning or quality of life were extracted for use in effect size calculations. Most commonly, functioning or quality of life was not directly measured in the RCTs and effect sizes are largely calculated from symptom measures, which are known to have a close relation with disability in anxiety and depression [14].
In addition to efficacy data, other factors that may impact on efficacy were investigated. These included factors relevant to clinical practice: disorder; treatment type (CBT, behavioural therapy, cognitive therapy), duration (weeks) and intensity of treatment (total contact hours); mode of therapy (individual, group, book, telephone, computer), type of therapists employed (psychologist, psychiatrist, social worker, general practitioner) and whether they were specifically trained to provide the treatment; a statement that severe patients were included; and inclusion of inpatients. For RCTs of depression, the mean Beck Depression Inventory (BDI) score at baseline was also extracted. We could not identify a similar measure of anxiety severity that could be extracted from most of the panic and GAD trials. Other factors that may impact on efficacy are related to the conduct of the trials and include: year of study; country of study; type of control group (wait list, pill placebo, attention placebo); language (English, other); number of patients randomized to control and treatment groups; number of patients completing the trial; and percentage of dropouts from the trial. All data were separately extracted from each study by two reviewers and entered into Excel. Disagreements in data extracted between the two reviewers were resolved by discussion and reference to the original paper.
Analysis
The effect size (standardized mean difference) for each study was calculated in Excel using Hedges' adjusted g. This quantifies the magnitude of the difference between the intervention and control groups at post-treatment in a metric-free unit, by expressing the mean difference in standard deviation (SD) units. We use Hedges' g [15] because it includes an adjustment to correct for small sample bias and is used in Cochrane Collaboration systematic reviews. An effect size was calculated for each study by averaging across the relevant outcome measures within the study. This differs from the way meta-analyses are done by the Cochrane Collaboration but is consistent with meta-analyses of the psychiatric literature [2, 16]. A spreadsheet containing the extracted study data and the calculated effect sizes was imported into Stata 8.0 [17] to perform the additional analyses.
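The calculation described above can be sketched in a few lines of Python. This is an illustration only, not the spreadsheet used in the study: the function names are hypothetical, the variance formula is one common large-sample approximation, and the sign convention (control minus treatment, so that lower symptom scores in the treatment group give a positive effect size) is an assumption that the text does not state.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance of g) for one treatment-versus-control outcome.

    Assumption: lower scores mean fewer symptoms, so the difference is
    taken as control minus treatment and a positive g favours treatment.
    """
    df = n_t + n_c - 2
    # Pooled standard deviation of the two groups
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_c - mean_t) / pooled_sd
    # Small-sample correction factor (usual approximation)
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample approximation to the variance of d, scaled by j squared
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return g, j ** 2 * var_d

def study_effect_size(outcomes):
    """Average g across the outcome measures reported by one comparison."""
    gs = [hedges_g(*outcome)[0] for outcome in outcomes]
    return sum(gs) / len(gs)
```

For example, `hedges_g(10, 5, 20, 15, 5, 20)` describes a hypothetical outcome on which the treatment group scores one pooled SD lower than the control group; the correction factor shrinks the uncorrected difference of 1.0 slightly, to about 0.98.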
First, effect sizes were pooled across studies to produce an overall effect size for all studies and for each disorder (‘meta’ command in Stata). Studies were weighted by the inverse of their variance and the random effects model is reported. Heterogeneity was indicated by the Q-statistic and referred to a chi-squared distribution on k−1 degrees of freedom (df), where k is the number of studies/comparisons.
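The pooling step can be illustrated as follows. The sketch assumes the DerSimonian and Laird moment estimator for the between-study variance, which is a common choice for random effects models but is not named in the text; the function name and the returned keys are hypothetical.

```python
import math

def pool_random_effects(effects, variances):
    """Inverse-variance pooling with a DerSimonian-Laird random effects model."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two comparisons are needed")
    # Fixed effects weights and pooled estimate
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Q-statistic, referred to a chi-squared distribution on k - 1 df
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    # Moment estimate of the between-study variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random effects weights add tau2 to each within-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "fixed": fixed,
        "random": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "df": k - 1,
        "tau2": tau2,
    }
```

When Q is no larger than its degrees of freedom, the estimated between-study variance is zero and the random effects and fixed effects estimates coincide.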
A meta-regression was then performed to test the effects of different factors on the efficacy of CBT (‘metareg’ command in Stata). Metaregression is a useful tool for analysing the associations between treatment effect and study characteristics and is particularly useful where heterogeneity in the effect of treatment between studies is found [18]. The primary aim of the analysis was to decrease the between-study variance. This was approached by first performing a univariate regression analysis for each factor being examined. A multivariate model was then built up interactively by adding one factor at a time in order of the amount of between-study variance it explained – from highest to lowest – rather than using an automatic procedure such as forward selection. The between-study variance (
Results of meta-regression
†Coefficients refer to the effect size compared to the referent category. Unless otherwise stated, the referent category is absence of the factor; ‡to determine the effect size for a specific set of characteristics, the following example may be useful: For a study of CBT that went for a duration of 10 weeks, did not include severe patients, was conducted in 1993 in the US, used a wait-list control group with the CBT conducted in English and 10% of the control group had dropped out of treatment, the effect size is:
−45.58 + 0 + 0.021 × 10 + 0 × (−0.265) + 0.023 × 1993 + 0 + 0 + 0.761 × 1 − 0.008 × 0.10 = 1.229; for the same study but with an attention-placebo control group, the effect size would be 1.229 − 0.516 = 0.713; §referent category; ¶these p-values refer to the significance of the group of variables, for example, country of study: US, UK and other countries; ††for face-to-face therapy only; ‡‡based on whether a specific statement was made in the paper that severe patients were included. CBT, cognitive behavioural therapy; GAD, generalized anxiety disorder; UK, United Kingdom; US, United States of America.
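The worked example in the footnote can be reproduced with the short sketch below. The coefficients are those quoted in the text; the mapping of each term to a variable follows the order in which the model variables are listed and is therefore an assumption, as are the function and dictionary names. The treatment and country terms are omitted because they are zero in the example (the referent categories), and the dropout term is entered as 0.10, exactly as in the footnote.

```python
# Coefficients quoted in the text for the multivariate model
INTERCEPT = -45.58
COEF_DURATION_WEEKS = 0.021
COEF_SEVERE_INCLUDED = -0.265
COEF_YEAR = 0.023
COEF_ENGLISH = 0.761
COEF_CONTROL_DROPOUT = -0.008
# Control-group adjustments relative to a wait-list control (the referent)
CONTROL_GROUP = {"wait_list": 0.0, "pill_placebo": -0.025, "attention_placebo": -0.516}

def predicted_effect_size(duration_weeks, severe_included, year, english,
                          control_dropout, control_group="wait_list"):
    """Predicted effect size for a CBT study conducted in the US."""
    return (INTERCEPT
            + COEF_DURATION_WEEKS * duration_weeks
            + COEF_SEVERE_INCLUDED * int(severe_included)
            + COEF_YEAR * year
            + COEF_ENGLISH * int(english)
            + COEF_CONTROL_DROPOUT * control_dropout
            + CONTROL_GROUP[control_group])

wait_list = predicted_effect_size(10, False, 1993, True, 0.10)
attention = predicted_effect_size(10, False, 1993, True, 0.10, "attention_placebo")
print(round(wait_list, 3), round(attention, 3))
```

The two values agree with the 1.229 and 0.713 given in the footnote to three decimal places.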
Results of meta-regression for depression
†Coefficients refer to the effect size compared to the referent category. Unless otherwise stated, the referent category is absence of the factor; ‡referent category; §these p-values refer to the significance of the group of variables.
For a study where treatment with CBT was compared with a wait-list control group and patients had an average BDI score of 21, the effect size would be 3.284 − 21 × 0.085 + 0 = 1.499. For a BDI score of 36, the effect size would be 3.284 − 36 × 0.085 = 0.224. BDI, Beck Depression Inventory; CBT, cognitive behavioural therapy.
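As a check on the arithmetic, the same relationship can be written as a one-line function. The intercept and slope are the values quoted above for CBT compared with a wait-list control group; the function name is hypothetical, and predictions outside the observed range of mean baseline BDI scores would be extrapolations.

```python
def depression_effect_size(mean_bdi, intercept=3.284, slope=-0.085):
    """Predicted effect size for CBT versus wait list at a given mean baseline BDI."""
    return intercept + slope * mean_bdi

# Predicted effect sizes across a range of mean baseline BDI scores
for bdi in (21, 26, 31, 36):
    print(bdi, round(depression_effect_size(bdi), 3))
```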
In the analysis, each CBT versus control comparison is assumed to be independent but many studies provided more than one comparison. Ideally, some adjustment for non-independence should be made but we could not find an appropriate method for doing this. Thus, it is possible that we have underestimated the standard errors around the effect sizes.
Results
A total of 64 studies were collected; of these, 33 were retained for inclusion and 31 were excluded [19–51]. We excluded a large number of studies that were included in the Gloaguen meta-analysis [1] in particular (n=16 out of 22 included in the comparison of CBT to wait-list or placebo). Most commonly, this was due to an inadequate diagnosis of depression. Details of excluded trials are given in Table 1.
Excluded trials
BDI, Beck Depression Inventory; CBT, cognitive behavioural therapy; SDs, standard deviations; HRSD, Hamilton Rating Scale for Depression.
Some details of the 33 included studies are shown in Table 2. Nineteen studies representing 30 treatment versus control comparisons were in patients with panic disorder with or without agoraphobia [52–70], 11 studies (17 comparisons) were in patients with depression [6, 71–80] and three studies (five comparisons) were in patients with GAD [81–83]. Most of the comparisons were with a wait-list control group (n=33), followed by an attention placebo (n=16) and pill placebo (n=3) control group. None of the studies included inpatients.
Details of included trials
†Mode of CBT: individual (face-to-face), group, book, phone, computer; ‡number randomized to treatment group. AP, attention or psychological placebo; CBT, cognitive behavioural therapy; SE, standard error.
The pooled effect size for all 52 comparisons of CBT with any type of control group is 0.68 (95% confidence interval (CI)=0.51–0.84). However, there was a significant amount of heterogeneity (Q=127.48 on 51 df, p<0.001) suggesting caution in the interpretation of the effect size (Fig. 1). Effect sizes were also calculated for each disorder separately giving a random-effects effect size of 0.77 (95% CI=0.44–1.10) for depression (Q=50.75, df=16, p<0.001), 0.64 (95% CI=0.43–0.86) for panic disorder (Q=70.99, df=29, p<0.001) and 0.64 (95% CI=0.28–1.00) for GAD (Q=5.47, df=4, p=0.24). Apart from GAD, the effect sizes displayed a significant amount of heterogeneity. For panic disorder, the random-effects effect size was similar to that given by the fixed effects model (0.61, 95% CI=0.48–0.75). For depression, the random effects model gave a higher effect size than the fixed effects model (0.67, 95% CI=0.49–0.85).
Effect size for cognitive behavioural therapy for depression, panic disorder and generalized anxiety disorder: meta-analysis of randomized controlled trials (pooled effect size=0.68, 95% confidence interval=0.51–0.84, random effects model). There is significant heterogeneity in the effect sizes (Q=127.48, df=51, p<0.001).
From Fig. 1, it is apparent that two of the depression studies (D7 and D8) have unusually large effect sizes and appear to be outliers. These two studies are by the same author [76, 77] and investigate the effects of a particular type of CBT based on problem solving. We recalculated the depression effect size with these two studies removed, which gave a random-effects effect size of 0.54 (95% CI=0.29–0.79) and resulted in less heterogeneity (Q=22.26, df=13, p=0.051).
The results of the meta-regression are shown in Table 3. The middle three columns show the univariate coefficients. The regression coefficients are the estimated increase in the effect size per unit increase in the predictor variable compared to the referent category. For example, for disorder: depression is the referent category and has an effect size of 0.75. Panic has an effect size 0.11 SD units lower than depression and GAD has an effect size 0.10 SD units lower than depression, but neither of these differences is significant. For duration of therapy, a continuous variable, the effect size decreases by 0.037 SD units for each increase in duration of therapy of 1 week, but again, this difference is not significant. The multivariate model shown in the last two columns includes: treatment, duration of therapy, inclusion of severe patients, year of study, country of study, control group, language and number of dropouts from the control group. Not all of these variables were significant in the model but, together, they reduced the between-study variance to zero. The regression coefficients for the multivariate model are the estimated increase in the effect size per unit increase in the predictor variable, while accounting for the effect of the other variables in the model. So, in Table 3, the effect size is estimated to increase by 0.021 for each extra week of therapy, for example. As can be seen from Table 3, only the type of control group and the inclusion of severe patients were significant predictors of the effect size. The other variables in the model helped explain the between-study variance but were not significant predictors of the effect size.
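To make the interpretation of a univariate coefficient concrete, a meta-regression on a single predictor can be sketched as a weighted least-squares fit in which each comparison is weighted by the inverse of its within-study variance plus the between-study variance. This is a simplified illustration and not the Stata procedure used in the analysis: the between-study variance is supplied rather than estimated, and the function name and example data are hypothetical.

```python
import math

def weighted_meta_regression(x, effects, variances, tau2=0.0):
    """Weighted least-squares fit of effect size on one study-level predictor.

    Returns (intercept, slope, standard error of the slope), treating the
    weights 1 / (within-study variance + tau2) as known.
    """
    w = [1 / (v + tau2) for v in variances]
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope, math.sqrt(1 / sxx)

# Hypothetical data: effect size falls by 0.2 for each unit of the predictor
intercept, slope, se_slope = weighted_meta_regression(
    [0, 1, 2], [1.0, 0.8, 0.6], [0.04, 0.04, 0.04])
```

Here the slope plays the role of a coefficient in Table 3: it is the estimated change in effect size per unit increase in the predictor, and the intercept is the estimated effect size when the predictor is zero (the referent category for a binary factor).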
It is important to note that most studies (40 comparisons) were conducted in the US and in only three studies (four comparisons) was therapy conducted in a language other than English. In most studies the CBT was provided by psychologists (31 comparisons) or ‘therapists’ (nine comparisons) and in 41 of the 50 comparisons, the paper specified that the person conducting the therapy was trained in CBT in general or in the specific form of CBT being studied (Table 3). It was not always clear from the papers how much training the therapist had undergone nor what professional group ‘therapists’ belong to. ‘Therapist’ may be a generic term for psychologist or for a mix of CBT providers. In none of the studies was the therapy conducted by general practitioners or solely by psychiatrists, social workers or any other professional group.
To further investigate the effect of severity on the effect size, we repeated the meta-regression with the 11 depression trials (17 comparisons) because a continuous measure of depression severity at baseline was available for all studies. The results are shown in Table 4, although it is important to note that there is still some remaining heterogeneity that could not be explained by any of the possible predictors investigated (τ² = 0.159). In univariate analyses the BDI score at baseline in the treatment group was not significantly related to the effect size (coefficient=−0.06, p=0.19). Further, a statement in the paper that patients with severe depression were included in the study was not a significant predictor of the effect size (coefficient=−0.49, p=0.16). However, the meta-regression showed that when the type of treatment and control group studied were included in the model, the effect size decreased significantly with increasing BDI score (Table 4). For each unit increase in BDI score the effect size decreases by 0.085 SD units (p=0.037).
Discussion
Overall, CBT is an effective treatment for depression, panic disorder and GAD with a moderate to large effect size of 0.68 (95% CI=0.51–0.84). However, there is a significant amount of heterogeneity present suggesting caution in the interpretation of this effect size. The factors that explained all of the variation in the effect size are: treatment, duration of therapy, inclusion of severe patients, year of study, country of study, control group, language and number of dropouts from the control group. However, the only factors that were significant predictors of the effect size are the type of control group and the inclusion of severe patients.
The effect sizes found for each disorder are consistent with those found in previous meta-analyses but this is not surprising given that these meta-analyses were used as a source of studies. For GAD, we found an effect size of 0.64 compared to 0.66 found in a previous meta-analysis conducted by our colleagues [13]. The difference is due to exclusion from the current meta-analysis of one study that was not an RCT and one that did not include a control group. For panic disorder, we found an effect size of 0.64, compared to 0.68 found by Gould
The type of control group to which CBT was compared had a significant impact on the effect size. The biggest effect size is seen with wait-list control groups (Table 3), followed by pill-placebo control groups (effect size reduces by 0.025 SD units), although the difference is not significant. Comparison of CBT with attention-placebo control groups significantly reduces the effect size by 0.516 SD units compared to a wait-list control group. One explanation for this is that relaxation training, which has actually been shown to be an effective treatment for both panic disorder and GAD [52, 54, 81], was used as an attention control group for several studies. This finding of the effect of control group is not surprising. However, the lack of a significant difference with pill-placebo control groups, once other factors are controlled for, was not expected but should be taken cautiously as there were only three studies in which CBT was compared to pill placebo [6, 55, 60].
From the multivariate model we can see that the year of study, country of study, duration of therapy, language and per cent of dropouts from the control group helped explain some of the heterogeneity in the model but none were significant predictors of the effect size when the other factors in the model were controlled for. Disorder was not a significant predictor of the effect size nor did it explain any of the heterogeneity in the model so was not included in the final model. This suggests that the efficacy of CBT is similar between these disorders.
Factors that did not predict the effect size or explain any of the heterogeneity in the results included the intensity of treatment (tested only for face-to-face therapy), mode of therapy (group
We attempted to determine whether severity was a significant predictor of the effect size. For studies in which a specific statement was made in the paper that severe patients were included, the effect size was, on average, significantly lower by 0.265 SD units when other factors in the model were controlled for (Table 3). However, this should not be interpreted as meaning that CBT is not effective in severe patients. The studies in which severe patients were included still had a mean effect size of 0.531 (95% CI=0.345–0.717) – calculated by performing a random effects meta-analysis limited to the 16 studies that stated they included severe patients. Also, the way in which we were able to measure severity is very limited. The analysis would be greatly improved if an objective continuous measure of patient severity had been included in each of the studies. When we limited the analysis to studies of depression we did find a significant decrease of 0.085 in the effect size with each point increase in the mean BDI score at baseline. The mean BDI score at baseline ranged from 21 to 36 in the papers included in the meta-regression. These scores indicate a mean severity of patients of moderate to severe, although the variation would have been much greater in the individual studies. Thus, for studies of CBT and a wait-list control group, the effect size ranges from 0.224 to 1.499 (Table 4). It would be interesting to conduct a similar analysis of antidepressants for depression. However, the analysis conducted by DeRubeis and colleagues suggests that CBT is likely to perform as well as (if not better than) antidepressants in severely depressed patients [8].
In considering the results of these analyses, it is important to be mindful of the limitations of systematic reviews and meta-analysis. A meta-analysis is only as good as the individual randomized controlled trials that go into it. It is also limited by the need to use study level rather than patient level data, which reduced the power of the analyses. However, the strength of systematic reviews and meta-analysis is that they can provide a means to make sense of the vast amount of literature on CBT (in this case) that is already available [84]. They can be used to determine whether, and what, further research is needed. The technique of meta-regression enables multivariate analysis of study characteristics that may be responsible for heterogeneity in the effect sizes.
So, what can we confidently conclude from our examination of the RCT literature on CBT for depression, panic disorder and GAD? We can draw several conclusions. CBT is an effective treatment for these disorders, with a moderate to large effect size of 0.68. However, the size of the effect depends on the type of control group it is compared to and the baseline severity of the patients. Most studies have used psychologists as providers, so more studies are needed to determine its efficacy when delivered by other professional groups. The mode of administration (individual or group setting; face-to-face or through a book, telephone or computer) does not impact on the effectiveness of CBT, although the evidence for telephone and computer administered CBT is more limited. Also, these modes of delivery may not be suitable for many patients as the trials are limited to volunteers, and therefore more interested patients. More studies are needed to determine CBT's efficacy in countries outside of the US and UK and its usefulness for non-English-speaking patient groups.
This study adds to our knowledge by explaining the heterogeneity found in previous meta-analyses. It also confirms that the severity of the patients treated with CBT and the type of control group used are independent predictors of the effect size.
Acknowledgements
We thank John Carlin and Jonathan Sterne for statistical advice and Kristy Sanderson for methodological advice.
