Abstract
Objective:
To investigate the effectiveness of a freely available computerised cognitive behavioural therapy programme (MoodGYM) for depression (primary outcome), anxiety and general psychological distress in adults.
Method:
We searched PsycINFO, CINAHL Plus, MEDLINE, EMBASE, Social Science Citation Index and references from identified papers. To assess MoodGYM’s effectiveness, we conducted random effects meta-analysis of identified randomised controlled trials.
Results:
Comparisons from 11 studies demonstrated MoodGYM’s effectiveness for depression symptoms at post-intervention, with a small effect size (g = 0.36, 95% confidence interval: 0.17–0.56; I2 = 78%). Removing the lowest quality studies (k = 3) had minimal impact; however, adjusting for publication bias reduced the effect size to a non-significant level (g = 0.17, 95% confidence interval: −0.01 to 0.38). Comparisons from six studies demonstrated MoodGYM’s effectiveness for anxiety symptoms at post-intervention, with a medium effect size (g = 0.57, 95% confidence interval: 0.20–0.94; I2 = 85%). Although comparisons from six studies did not yield significance for MoodGYM’s effectiveness for general psychological distress symptoms, the small effect size approached significance (g = 0.34, 95% confidence interval: −0.04 to 0.68; I2 = 79%). Both the type of setting (clinical vs non-clinical) and MoodGYM-developer authorship in randomised controlled trials had no meaningful influence on results; however, the results were confounded by the type of control deployed, level of clinician guidance, international region of trial and adherence to MoodGYM.
Conclusions:
The confounding influence of several variables, and presence of publication bias, means that the results of this meta-analysis should be interpreted with caution. Tentative support is provided for MoodGYM’s effectiveness for symptoms of depression and general psychological distress. The programme’s medium effect on anxiety symptoms demonstrates its utility for people with this difficulty. MoodGYM benefits from its free accessibility over the Internet, but adherence rates can be problematic and at the extreme can fall below 10%. We conclude that MoodGYM is best placed as a population-level intervention that is likely to benefit a sizeable minority of its users.
Introduction
Despite extensive evidence for its effectiveness, cognitive behavioural therapy (CBT) is difficult to access due to limited healthcare resources and insufficient numbers of trained therapists (Bower and Gilbody, 2005). One way of increasing access to CBT is to deliver it using Internet-based or computerised CBT (cCBT) programmes that require less therapist involvement. Various meta-analyses indicate that cCBT is effective for the most common mental health problems – depression and anxiety – which supports its suitability as a low-intensity treatment (Andersson and Cuijpers, 2009; Andrews et al., 2010; Spek et al., 2007). However, several concerns have been raised, such as high dropout rates (Andersson and Cuijpers, 2009), evidence for efficacy in experimental conditions rather than effectiveness in routine clinical settings (Andersson et al., 2009) and the marked superiority of clinician-guided cCBT over unguided (or self-help) cCBT (Andersson and Cuijpers, 2009; Spek et al., 2007).
An issue that has received less attention is the heterogeneity of available cCBT programmes. There exist at least 50 Internet-based programmes for depression (at various stages of development), most of which are based on CBT (Lintvedt et al., 2013a). Due to the unregulated nature of the Internet, the quality of these programmes is likely to vary (Christensen et al., 2010). Moreover, cCBT programmes differ in terms of their evidence base, content and duration (Twomey et al., 2013). Despite this heterogeneity, these programmes have not been evaluated separately in previous meta-analyses. Separate evaluations would add precision to the findings and allow clinicians to make more informed decisions regarding the provision of specific cCBT programmes to their service users.
In this meta-analysis, we investigate the effectiveness of the cCBT programme, MoodGYM, for the reduction of symptoms of depression (primary outcome), anxiety and general psychological distress in adults. Developed by researchers at the Australian National University, MoodGYM is a freely available Internet-based cCBT programme for depression that can be provided with or without clinician guidance (Christensen et al., 2004). It has five core CBT-based sessions consisting of written information, animations and interactive exercises and quizzes (Table 1) (Christensen et al., 2006a; Farrer et al., 2011). We chose to evaluate MoodGYM for three main reasons. First, it is probably the most widely used cCBT programme in the world, with over 850,000 registered users (https://moodgym.anu.edu.au/welcome/faq). Second, MoodGYM’s free-to-use nature makes it accessible to people across sociodemographic divides, and a cost-effective treatment option for clinical services. Third, several randomised controlled trials (RCTs) conducted on MoodGYM are available for meta-analysis. In general, these RCTs seem to provide qualified support for MoodGYM’s effectiveness (Twomey et al., 2014). However, doubts have been cast by a recent large-scale RCT set in primary care which did not support this effectiveness (Gilbody et al., 2015). This trial attracted extensive media coverage, and sparked debate among prominent researchers over various unresolved issues in the evaluation of cCBT (e.g. the predominance of developer-led trials; control conditions in RCTs) (www.bmj.com/content/351/bmj.h5627/rapid-responses). The secondary objective of this meta-analysis is to perform sub-group analyses addressing these and other issues concerning cCBT.
Table 1. Structure and content of MoodGYM.
CBT: cognitive behavioural therapy.
The structure and content of MoodGYM is updated on a periodic basis.
Methods
Eligibility criteria for study selection
Participants. Although MoodGYM may also be an appropriate treatment for adolescents (Calear et al., 2009), the vast majority of RCTs conducted on the programme have involved adults. For precision, we only included studies involving (a) adults with elevated mental ill health symptoms or (b) adults seeking mental health interventions.
Intervention. MoodGYM with or without clinician guidance.
Comparison conditions. Waiting lists, delayed treatment, no treatment, routine clinical care/treatment-as-usual (TAU) or computerised control conditions.
Outcomes. Self-report or clinician-rated measures of depression (primary outcome), anxiety or general psychological distress.
Studies. RCTs published in peer-reviewed journals. RCTs minimise the influence of error and bias on findings and offer the most rigorous method of determining whether a cause–effect relation exists between treatment and outcome (Sibbald and Roland, 1998; Spring, 2007). Their sole inclusion safeguarded the validity of the findings and ensured methodological consistency. As we were interested in determining MoodGYM’s effectiveness versus control conditions, we excluded ‘non-inferiority’ RCTs.
Literature search and data extraction
Using ‘MoodGYM’ as the sole keyword, the first author searched five databases: PsycINFO, CINAHL Plus, MEDLINE, EMBASE and Social Science Citation Index. The final search was conducted on 20 January 2016. Additional records were identified from hand-searching reference lists of included studies. Independent screening of all abstracts was undertaken by the second author. When we disagreed regarding the screening outcome of an abstract, it was included in screening at ‘full-text’ level. Data were managed using EndNote X7 (Thomson Reuters Corp.) and word processing software. For each study, we recorded information concerning setting, participant demographics, clinical screening, comparison conditions, clinician guidance, outcome measures, data collection timepoints, trial dropout, adherence, study quality and treatment outcomes.
Quality assessment
We used three of the seven criteria from the Cochrane Collaboration’s tool for assessing risk of bias (Higgins and Green, 2011): random sequence generation, allocation concealment and completeness of outcome data (such data were deemed complete when intention-to-treat analysis was used). Regarding the remaining criteria, blinding from knowledge of an allocated intervention was not used because experimental conditions in included studies made such blinding impossible. Similarly, blinding of outcome assessment was not used because all the measures included in the meta-analyses were self-report measures. In addition, neither selective reporting bias nor ‘any other’ bias was used because these biases were deemed too ambiguous in nature to detect objectively.
Statistical analysis
All statistical analyses were performed using Comprehensive Meta-Analysis (version 2.0, Biostat Inc.). For comparisons of outcomes between MoodGYM and controls, we calculated pooled mean effect sizes (Hedges’ g) with 95% confidence intervals (CIs), using random effects models that took into account the considerable between-study variability. Values of 0.2, 0.5 and 0.8 denote small, moderate and large effect sizes, respectively (Cohen, 1988). To test the heterogeneity of effect sizes across studies, we calculated Higgins’ I2 percentages: scores of 25%, 50% and 75% indicate low, moderate and high heterogeneity, respectively (Higgins et al., 2003). For all outcomes, data from the post-intervention data collection point were used for statistical calculations. For the primary outcome (depression symptoms), data from the first follow-up collection point were also analysed. Insufficient data prevented similar ‘follow-up’ analyses of secondary outcomes (symptoms of anxiety and general psychological distress). Additional analyses were carried out on the primary outcome at post-intervention. Sensitivity analyses adjusted for study quality and for publication bias, the latter using the ‘Trim and Fill’ procedure (Duval and Tweedie, 2000) within Comprehensive Meta-Analysis. Based on inspection of the funnel plot, this procedure assumes that effect sizes from individual studies should be distributed symmetrically around the pooled mean effect size, and that an asymmetric distribution indicates the presence of publication bias. To achieve the desired symmetry in the funnel plot and produce an unbiased pooled mean effect size, the procedure adjusts for extreme positive effect sizes in smaller studies.
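To make the pooling steps concrete, the following is a minimal Python sketch of how a Hedges’ g, a random effects pooled estimate and an I2 percentage can be computed. It is not the Comprehensive Meta-Analysis implementation: the DerSimonian-Laird estimator of between-study variance is assumed here (the text states only that random effects models were used), the function names are hypothetical, and the three ‘studies’ are invented numbers, not data from the included trials.

```python
import math

def hedges_g(mean_ctrl, sd_ctrl, n_ctrl, mean_trt, sd_trt, n_trt):
    """Return (g, variance of g). Positive g = lower symptoms in the treatment arm."""
    df = n_ctrl + n_trt - 2
    sd_pooled = math.sqrt(((n_ctrl - 1) * sd_ctrl ** 2 + (n_trt - 1) * sd_trt ** 2) / df)
    d = (mean_ctrl - mean_trt) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n_ctrl + n_trt) / (n_ctrl * n_trt) + d ** 2 / (2 * (n_ctrl + n_trt))
    return j * d, j ** 2 * var_d

def pool_random_effects(effects, variances):
    """Pool effects with DerSimonian-Laird weights (an assumed estimator)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * gi for wi, gi in zip(w, effects)) / sum(w)
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * gi for wi, gi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"g": pooled, "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
            "tau2": tau2, "i2": i2}

# Hypothetical post-intervention scores (control mean, SD, n, treatment mean, SD, n).
studies = [
    (20.0, 10.0, 50, 15.0, 10.0, 50),
    (18.0, 9.0, 120, 16.5, 9.5, 115),
    (22.0, 11.0, 30, 13.0, 10.0, 28),
]
effects, variances = zip(*(hedges_g(*s) for s in studies))
result = pool_random_effects(effects, variances)
print(result)
```

The random effects weights add the between-study variance to each study’s own variance, so large studies dominate less than they would under a fixed effect model when heterogeneity is high.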
Sub-group analyses investigated the influence of the following potentially confounding variables: type of control deployed (no treatment vs ‘active’), level of clinician guidance (in-session vs remote vs none), setting (clinical vs non-clinical), country of trial (Australia vs Europe), conflict of interest (MoodGYM-developer as study author vs no conflict of interest) and adherence to MoodGYM (participants, on average, completed >50% of sessions vs completed <50% of sessions).
Results
Study selection
The literature search flow is displayed in Figure 1. After duplicates were removed, 74 studies were screened at ‘abstract’ level. For screening of abstracts, there was a 93.2% agreement rate between the two authors. After abstract screening, 27 studies were assessed for eligibility at ‘full-text’ level. In the end, 12 studies were included in the 3 main meta-analyses (with some studies suitable for more than one analysis): 11 for depression symptoms, 6 for anxiety symptoms and 6 for general psychological distress.
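As a quick arithmetic check on the screening agreement rate, the sketch below computes simple percent agreement. The count of 69 agreed abstracts is an inference, not a reported figure: it is the only whole number out of 74 that rounds to 93.2%. The function name is hypothetical.

```python
def percent_agreement(n_agree, n_total):
    """Simple percent agreement between two screeners."""
    return 100 * n_agree / n_total

screened = 74  # abstracts screened, from the text
agreed = 69    # ASSUMPTION: inferred from the reported 93.2%, not stated in the text
rate = percent_agreement(agreed, screened)
print(f"{rate:.1f}%")
```

Percent agreement does not correct for agreement expected by chance; the text reports only this raw rate.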

Figure 1. Literature search flow.
Study characteristics
Study characteristics are summarised in Table 2. Seven studies (58%) took place in non-clinical settings. Six studies were undertaken in Australia (50%), with the rest set in Europe. Sample sizes ranged from 39 to 3070, all studies had mostly female participants, and mean ages ranged from 19 to 42 years. ‘No treatment’ control conditions were used in eight studies (67%). MoodGYM was delivered with some form of clinician guidance in 11 studies (92%). The proportion of MoodGYM group participants who withdrew from studies before the post-intervention data collection point ranged from 0% to 64%. Full intervention adherence ranged from 10% to 100%. The quality of included studies was mixed: eight met at least two of the three quality criteria (67%), and four met one criterion (33%).
Table 2. Study characteristics.
Ctr: Control; DEP: depression; F: % female; GP: general practitioner; mo: months; MG: MoodGYM; Qual.: study quality; wks: weeks. Measures – ATQ: Automatic Thoughts Questionnaire; BDI-II: Beck Depression Inventory–II; BAI: Beck Anxiety Inventory; BDQ: Brief Disability Questionnaire; CES-D: Centre for Epidemiologic Studies–Depression scale; CORE-10: Clinical Outcomes in Routine Evaluations–10; DASS-21: Depression, Anxiety and Stress Scale–21; EQ-5D: EuroQol Group 5-Dimension Self-Report Questionnaire; GAD-7: Generalised Anxiety Disorder Scale–7; K10: Kessler Psychological Distress scale–10; PHQ-9: Patient Health Questionnaire–9; RCMAS: Revised Children’s Manifest Anxiety Scale; SPHERE-12: Somatic & Psychological Health Report; SF-36: 36-Item Short Form Health Survey; SWLS: Satisfaction with Life Scale; WBS: Warwick-Edinburgh Mental Well-being Scale; WSAS: Work and Social Adjustment Scale. Quality – A: allocation concealment; C: completeness of data; R: random sequence generation. +/−: procedure to minimise bias reported/not reported.
Dropout (%) from study at post-intervention.
Percentage of participants who completed all MoodGYM sessions.
MoodGYM-developer is an author.
Access to depression information website also provided with MoodGYM.
Results from two MoodGYM treatment arms (with, and without, weekly phone tracking) averaged together for current analysis.
Reported elsewhere (Littlewood et al., 2015).
GP care also provided with MoodGYM.
Initial phases reported elsewhere (Christensen et al., 2004).
MoodGYM’s effectiveness for the primary outcome of depression symptoms
As per the forest plot (Figure 2), comparisons from 11 studies demonstrated MoodGYM’s effectiveness for depression symptoms at post-intervention, with a small effect size (g = 0.36, 95% CI: 0.17–0.56) and high heterogeneity (I2 = 78%). Removing the lowest quality studies (k = 3) had minimal impact on this finding (g = 0.39, 95% CI: 0.18–0.62; I2 = 85%). However, adjusting for publication bias indicated in the funnel plot (Figure 3) reduced the effect size to a non-significant level (g = 0.17, 95% CI: −0.01 to 0.38). Comparisons from six studies demonstrated MoodGYM’s effectiveness for depression symptoms at the first follow-up data collection point, with a small effect size (g = 0.27, 95% CI: 0.09–0.46) and high heterogeneity (I2 = 78%).

Figure 2. Forest plot for MoodGYM’s effectiveness for depression symptoms at post-intervention.

Figure 3. Funnel plot for publication bias analysis. High asymmetry is a sign of the presence of such bias.
MoodGYM’s effectiveness for symptoms of anxiety and general psychological distress
Comparisons from six studies demonstrated MoodGYM’s effectiveness for anxiety symptoms at post-intervention, with a medium effect size (g = 0.57, 95% CI: 0.20–0.94) and high heterogeneity (I2 = 85%). Although comparisons from six studies did not yield significance for MoodGYM’s effectiveness for general psychological distress symptoms at post-intervention, the small effect size approached significance (g = 0.34, 95% CI: −0.04 to 0.68; I2 = 79%).
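The phrase ‘approached significance’ can be illustrated by recovering an approximate standard error, z statistic and two-sided p-value from each reported effect size and 95% CI. This is a rough sketch under a normal approximation; the function name is hypothetical, and because the published values are rounded (and the distress interval is not exactly symmetric about g), the results are approximate only.

```python
import math

def z_and_p_from_ci(g, lower, upper, z_crit=1.96):
    """Approximate SE, z and two-sided p from an effect size and its 95% CI."""
    se = (upper - lower) / (2 * z_crit)
    z = g / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under a normal approximation
    return se, z, p

# Values as reported in the text for the two secondary outcomes.
se_anx, z_anx, p_anx = z_and_p_from_ci(0.57, 0.20, 0.94)
se_dis, z_dis, p_dis = z_and_p_from_ci(0.34, -0.04, 0.68)

print(f"anxiety:  z = {z_anx:.2f}, p = {p_anx:.4f}")
print(f"distress: z = {z_dis:.2f}, p = {p_dis:.4f}")
```

A CI that excludes zero corresponds to p < 0.05 under this approximation, which is why the anxiety estimate is reported as significant and the distress estimate is not.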
Sub-group analyses of MoodGYM’s effectiveness for depression symptoms at post-intervention
Type of control deployed. When MoodGYM was compared with no treatment controls (k = 8), the effect size was medium (g = 0.53, 95% CI: 0.23–0.83); however, with ‘active’ controls (general practitioner [GP] care or mental health websites; k = 3), the effect size was small and not significant (g = 0.12, 95% CI: −0.08 to 0.31). Heterogeneity was medium in the former analysis (I2 = 62%) and high in the latter analysis (I2 = 80%).
Level of clinician guidance. When MoodGYM was provided with face-to-face guidance (k = 4), the effect size was large (g = 0.75, 95% CI: 0.02–1.48); however, with remote (telephone or email) guidance (k = 6), the effect size was small (g = 0.23, 95% CI: 0.05–0.41). Heterogeneity was high in both analyses: face-to-face (I2 = 81%) and remote (I2 = 73%).
Setting. When MoodGYM was provided in clinical settings (k = 4), the effect size was small (g = 0.38, 95% CI: −0.05 to 0.81). When MoodGYM was provided in non-clinical settings (k = 7), a similarly small effect size was yielded (g = 0.36, 95% CI: 0.12–0.60). Heterogeneity was high in both analyses: clinical setting (I2 = 83%) and non-clinical setting (I2 = 78%).
Country of trial. When MoodGYM RCTs were undertaken in Australia (k = 5), the effect size was large (g = 0.73, 95% CI: 0.19–1.27); however, in Europe-based RCTs (k = 6), the effect size was small (g = 0.17, 95% CI: 0.04–0.30). Heterogeneity was high in the former analysis (I2 = 82%) and low in the latter analysis (I2 = 46%).
Conflict of interest. In studies with a MoodGYM-developer as an author (k = 5), the effect size was small (g = 0.39, 95% CI: 0.19–0.59). In studies not involving MoodGYM-developers (k = 6), a similarly small effect size was yielded (g = 0.37, 95% CI: −0.02 to 0.76). Heterogeneity was medium in the former analysis (I2 = 67%) and high in the latter analysis (I2 = 83%).
Adherence to MoodGYM. In studies with ‘high adherence’ (when participants, on average, completed at least 50% of sessions; k = 5), the effect size was large (g = 0.64, 95% CI: 0.15–1.14); however, in studies with low adherence (k = 6), the effect size was small (g = 0.22, 95% CI: 0.04–0.41). Heterogeneity was high in the former analysis (I2 = 79%) and medium in the latter analysis (I2 = 72%).
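The sub-group results above are reported as separate pooled estimates, without a formal test of the difference between sub-groups. As a back-of-envelope illustration only, the sketch below compares the two country-of-trial estimates with a z-test on their difference, using standard errors recovered from the reported CIs. This test is an assumption of the sketch, not an analysis the text reports; the function names are hypothetical, and rounding in the published values makes the result imprecise.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error recovered from a 95% CI."""
    return (upper - lower) / (2 * z_crit)

def subgroup_difference(g1, ci1, g2, ci2):
    """z-test for the difference between two independent pooled estimates."""
    se_diff = math.sqrt(se_from_ci(*ci1) ** 2 + se_from_ci(*ci2) ** 2)
    z = (g1 - g2) / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p, normal approximation
    return z, p

# Country-of-trial estimates as reported in the text.
z, p = subgroup_difference(0.73, (0.19, 1.27), 0.17, (0.04, 0.30))
print(f"Australia vs Europe: z = {z:.2f}, p = {p:.3f}")
```

Because the Australia estimate has a wide interval, the difference between sub-groups is estimated far less precisely than the Europe estimate alone.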
Discussion
Summary of main findings
The main analysis (k = 11) demonstrated MoodGYM’s effectiveness for depression symptoms at post-intervention, with a small effect size (g = 0.36, 95% CI: 0.17–0.56). However, adjusting for publication bias reduced the effect size to a non-significant level (g = 0.17, 95% CI: −0.01 to 0.38). On the other hand, significance was yielded at the first follow-up data collection point (k = 6; g = 0.27, 95% CI: 0.09–0.46). Regarding secondary outcomes (at post-intervention), a medium effect size was yielded for anxiety symptoms (k = 6; g = 0.57, 95% CI: 0.20–0.94). MoodGYM’s effectiveness for general psychological distress symptoms was not demonstrated, but the small effect size approached significance (k = 6; g = 0.34, 95% CI: −0.04 to 0.68).
Both the type of setting (clinical vs non-clinical) and MoodGYM-developer authorship had no meaningful influence on results; however, several other factors had a considerable influence. On average, larger effect sizes were yielded for studies with no treatment controls (vs active controls), face-to-face guidance (vs remote guidance) and high adherence (vs low adherence). Interestingly, Australia-based RCTs yielded a markedly larger pooled effect size (g = 0.73, 95% CI: 0.19–1.27) than those based in Europe (g = 0.17, 95% CI: 0.04–0.30).
Limitations and strengths
Relatively few studies were available for analysis, but adequate statistical power was ensured by a combined sample size of 5745. The high heterogeneity limits the validity of comparisons made, although the I2 statistic is less precise in meta-analyses with relatively few studies (Von Hippel, 2015). All studies had mostly female and young adult participants; thus, the findings have limited generalisability beyond these populations. Although they safeguarded the validity of our analyses, our inclusion criteria meant that relevant research relating to MoodGYM’s effectiveness was not analysed. Of particular relevance to people with mood problems and their families, a recent RCT involving medical interns in the United States (n = 199) found that MoodGYM was more effective than a psycho-educational email control condition in reducing suicidal ideation (Guille et al., 2015). Moreover, RCTs have demonstrated MoodGYM’s effectiveness for reducing personal stigmatising attitudes to depression in community-dwelling adults (n = 525) (Griffiths et al., 2004), and anxiety symptoms (but not depression symptoms) in adolescent schoolchildren (n = 1477) (Calear et al., 2009). Quasi-experimental trials also provide support for MoodGYM’s effectiveness for depression symptoms in the latter population (O’Kearney et al., 2006, 2009). For methodological consistency, we did not seek to analyse non-experimental MoodGYM website data. Although these data are routinely collected on tens of thousands of users, their utility in evaluating MoodGYM’s effectiveness is limited by very low adherence: a previous study (n = 38,791) found that fewer than 7% of users progressed beyond the first two sessions of the programme (Christensen et al., 2006a). In contrast to previous meta-analyses of (heterogeneous) cCBT programmes for depression, our sole focus on MoodGYM adds precision to our findings.
The study also benefits from its use of sub-group analyses addressing various unresolved issues in cCBT research. The findings can be used by clinicians in decisions relating to the provision of MoodGYM, and by people all over the world who are considering accessing MoodGYM via the Internet.
Comparisons with other studies
The small effect size yielded for depression symptoms corresponds with findings from previous cCBT meta-analyses that did not evaluate individual programmes (Andersson and Cuijpers, 2009; Andrews et al., 2010; Spek et al., 2007). MoodGYM’s larger effect on the secondary outcome of anxiety symptoms (g = 0.57) is counter-intuitive given that the programme was designed to treat depression; however, the content of MoodGYM incorporates general CBT principles that can be applied to the treatment of anxiety. An alternative explanation is that cCBT may be more effective for anxiety than depression – as tentatively indicated in a 2007 meta-analysis (Spek et al., 2007).
Given that previous research indicates that cCBT is more effective in non-clinical settings than routine settings (Andersson et al., 2009), the absence of a similar finding in the current analysis is surprising. On the other hand, the larger yielded effect sizes for studies with no treatment controls, face-to-face guidance and high adherence concur with the findings of previous cCBT meta-analyses (Andersson and Cuijpers, 2009; Andrews et al., 2010; Spek et al., 2007). It is interesting that studies reporting low adherence to MoodGYM also yielded significant treatment effects. This finding is in line with a previous component analysis (n = 2794) which showed that shorter versions of MoodGYM (e.g. three sessions) can be effective for depression symptom reduction (Christensen et al., 2006b). MoodGYM-developer authorship did not have a meaningful impact on results, which suggests that assumed ‘conflicts of interest’ are not always indicative of bias. This is currently a ‘hot topic’ – authors of a recent high-profile MoodGYM RCT highlighted that the evidence for cCBT may have limited validity due to the understandable initial predominance of ‘developer-led’ trials (Gilbody et al., 2015).
To our knowledge, this was the first cCBT meta-analysis that accounted for the effect of international region. Therefore, the finding of a large effect size for Australia-based MoodGYM RCTs in comparison to a small effect size for Europe-based RCTs is particularly noteworthy. Speculating on reasons for this finding, it could be that the cultural references in MoodGYM resonate more with people living in the country where it was developed than with Europeans. Moreover, Australia has long been a world leader in the development of Internet-based interventions for mental health (Orman et al., 2014) – it is possible that MoodGYM’s superiority within this country is partly explained by greater acceptance of cCBT by Australian participants and better service infrastructure relating to its delivery.
Implications for clinical practice and future research
Tentative support is provided for MoodGYM’s effectiveness for symptoms of depression and general psychological distress. The programme’s medium effect on anxiety symptoms demonstrates its utility for people with this difficulty. MoodGYM benefits from its free accessibility over the Internet, but adherence rates can be problematic and at the extreme can fall below 10%. We conclude that MoodGYM is best placed as a population-level intervention that is likely to benefit a sizeable minority of its users.
If resources and time permitted, developers could consider making a separate MoodGYM programme for anxiety, or altering the programme to gear it more towards the treatment of both depression and anxiety simultaneously – indeed a recent meta-analysis indicated that such ‘transdiagnostic’ cCBT programmes may be effective (Newby et al., 2016). The provision of MoodGYM with face-to-face guidance yielded a large effect size and should be explored further. Paradoxically, although more face-to-face guidance may be associated with greater effectiveness, it takes away from the practical benefits of cCBT, namely, the saving of the clinician’s time and the associated increasing of access to clinical services. Striking the right balance between guidance and self-direction remains a challenge in the provision of cCBT. Our findings also point to the need to examine cross-cultural factors that may influence the effectiveness of cCBT programmes. Finally, further research should be directed towards the effectiveness of MoodGYM for additional outcomes (e.g. suicidal ideation) and in under-studied populations such as males and older adults.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
Conal Twomey is a recipient of funding from the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme (FP7/2007–2013; REA grant agreement no. 316795).
