Abstract
Objective:
Impairment in the retrieval of specific episodes from autobiographical memory is commonly observed in major depression. However, it is unclear whether impairment in retrieval processes is a general characteristic of major depression or is confined to the recollection of personal memories. This study examined the time course of the retrieval of words from semantic memory.
Method:
A letter fluency test was administered to 65 inpatients with major depression and 50 healthy controls. A two-parameter model was fit to the decay curve representing the production of words over a 90-second period. One parameter, N, is an estimate of the total number of words that would be generated if the respondent was given unlimited time. The other, tau, is the average of the difference in time between the first word generated and each subsequent word.
Results:
There was evidence of a deficit in the retrieval of words from long-term memory in depressed patients. The significant difference between groups suggested that even if given an extended period of time in which to respond to compensate for possible slowness, the depressed group would not retrieve as many words as the controls. The retrieval failure could not be attributed solely to cognitive slowing or the effects of antidepressant medication.
Conclusions:
The results extend findings of a deficit in the process of retrieving specific episodes from autobiographical memory and suggest that a generalised impairment in memory retrieval may be characteristic of major depression.
Introduction
Cognitive impairment is a core feature of major depression, including difficulties retrieving memories from the recent and remote past (Austin et al., 2001; Burt et al., 1995; Herrmann et al., 2007; Porter et al., 2003; Raes et al., 2006; Thomas et al., 2009). A significant focus of research with depressed patients has been on autobiographical memory, which is the ability to recall personal knowledge or important episodes from the past. Autobiographical memory is of particular interest because it assesses the ability of patients to recall episodes that they experienced and stored in memory at times when they were presumably not experiencing a depressive episode, such as during childhood. Thus, any deficit in remembering, relative to healthy controls, is likely to be primarily the consequence of retrieval failure rather than impairment in encoding and storage.
A consistent finding in the study of autobiographical memory in depressed patients is their propensity to produce more general and fewer specific memories than controls when instructed to recall event-specific autobiographical episodes. This is often termed the overgeneral memory phenomenon (Evans et al., 1992; Williams and Dritschel, 1988; Williams et al., 1996). The procedure used to determine autobiographical specificity typically uses a cueing methodology (referred to as the Autobiographical Memory Test, AMT), first employed by Williams and Broadbent (1986), in which participants are asked to respond with event-specific memories to a series of positive and negative cue words. In this paradigm, given a cue word such as ‘pleasant’, depressed patients are more likely than controls to provide a general memory (‘When I was in London, I used to like walking in Hyde Park’) rather than a time-limited autobiographical episode (‘I remember being at the first day of the Ashes cricket test in 2005’), even when asked for a specific one. In a review of autobiographical memory specificity and emotional disorder, Williams et al. (2007) identified 11 controlled studies of memory in major depression and computed an average between-group (depression versus control) effect size of Cohen’s d = 1.12 on measures of the probability of producing general rather than specific memories. Overgeneral memory has also been reported in individuals with trauma-related stress and has been proposed as a possible ‘marker’ of vulnerability to major depression in patients in remission (Mackinger et al., 2000; Williams and Dritschel, 1988).
What might explain overgeneral autobiographical memory in major depression? In general, there are two competing explanations: it could be a psychological phenomenon (e.g. avoidance of re-experiencing the past) or it could result from transient or persistent biological changes in the brain that mediate depressive symptoms. The latter explanation suggests that overgeneral memory results from a reduction in the resources needed to complete a search for a specific episode successfully, as a consequence of inefficiencies in the neural networks responsible for memory retrieval. When considering the possible psychological basis of overgeneral memory, Williams and colleagues draw on hierarchical models of autobiographical memory (Burgess and Shallice, 1996; Conway and Pleydell-Pearce, 2000); they propose that failure to retrieve specific episodes results from the premature termination of the depressed patients’ search through memory. Following Conway (2005), they conceive of a structure for autobiographical memory that proceeds from the most general level, comprising knowledge of self, through a hierarchy of memories organised around themes, time periods and general events, to a level comprising specific episodic memories. The latter, when brought to mind, are accompanied by a re-experiencing of the context of the original episodes, including visual and other sensory memories of the scene and any individuals involved. Williams et al. (2007) propose two possible psychological mechanisms to explain overgeneral memory in major depression: (a) functional avoidance of specific memories that, if activated, would be painful; and (b) distraction or the ‘capture’ of memory processes at a general-conceptual level by powerful negative ruminations about the self.
The aim of the present study was to look for evidence that the retrieval deficit causing overgeneral autobiographical memory has a biological basis; that is, is attributable at least in part to a failure of neural networks responsible for memory retrieval. If that were the case then failure would be seen on other memory retrieval tests that do not involve re-experiencing personal autobiographical episodes that are potentially emotionally charged. In other words, if there is a biological basis to autobiographical memory retrieval dysfunction in major depression, then the effects will be more widespread, and seen in failures of episodic retrieval (i.e. the recall of items from studied word lists) and semantic retrieval (i.e. conscious recollection of language and factual knowledge), as well as autobiographical recall. The rationale for this hypothesis is predicated on the assumption that there is a common underlying neural network for memory retrieval across different memory tests. This view runs counter to memory research with neurological patients that has typically focused on differences between memory processes and has proposed that separate memory systems underlie, for example, episodic, semantic and autobiographical memory. This work has been important in understanding the functions of anatomically distinct regions of the brain, but ignores the possibility of functional overlap across memory tests. In particular, it has been proposed (Burianova and Grady, 2007; McIntosh, 1998) that the complex process of retrieval of memories is unlikely to be strongly localised, and might be mediated by a functionally related neural network. In a test of this possibility, Burianova et al. (2010) used an event-related functional magnetic resonance imaging (fMRI) paradigm to record regional activation during retrieval tests using a common set of stimuli but different retrieval demands requiring either autobiographical, episodic or semantic memory recall. 
They conducted a functional connectivity analysis and found that a large-scale common neural network, involving the regions bilaterally in the frontal and temporal lobes, and the left temporoparietal junction, was involved in retrieval regardless of memory content. The authors concluded that this functional network involved the higher-order control functions such as working memory, error monitoring, and response verification necessary to coordinate retrieval.
Verbal fluency was selected here as the test to be used to study memory retrieval. Laboratory-based episodic memory tests commonly used in neuropsychological practice, such as paragraph recall, are not suitable for the present purpose since they are typically administered while the person is depressed, and so confound learning/encoding deficits with retrieval. In contrast, verbal fluency tests require retrieval of learned information from the past. Typically, verbal fluency is measured by instructing patients to generate words from specific categories (semantic fluency) or beginning with specific initial letters (phonemic or letter fluency) (Lezak et al., 2004; Randolph et al., 1993; Spreen and Strauss, 1998; Tombaugh et al., 1999) for a limited time, usually 60 seconds. A letter fluency procedure was employed for the present study for a number of important reasons. First, producing words in response to a designated letter is a relatively neutral task, unlikely to engage any defensive avoidance, and so provides a test of retrieval efficiency uncomplicated by psychological factors. Second, the bulk of vocabulary acquisition occurs during the first 20 years of life; that is, before the typical age of onset of major depression. Thus, there is no reason to assume that a failure of word generation results from a failure of encoding. In this regard, verbal fluency tests are similar to autobiographical tests in minimising the effects of encoding deficits on retrieval proficiency. Third, verbal fluency allows analysis of the time course of recall. This is important because it is possible that any deficit seen in the performance of depressed patients is entirely attributable to a generalised slowing of cognition, and there may be no specific impact of major depression on retrieval.
Analysing the total number of words generated in a verbal fluency paradigm does not, however, allow any conclusion to be drawn about the mechanism responsible for any impairment. Thus, although it is well-documented that depressed patients produce fewer words in time-limited (60 seconds) letter fluency tests (Zakzanis et al., 1998), it is not possible to determine from total scores why this might be. In particular, it is not possible to rule out the hypothesis that poor performance is the result of generalised cognitive slowing; it may be that depressed people actually can generate as many words as healthy controls if given enough time. To examine this possibility, it is necessary to examine the time course of word retrieval. This is relatively easy to do in a letter fluency paradigm because there are a large number of potential words that can be generated in response to a specific letter, and these are generally produced without recourse to higher-order strategies, and thus do not tend to be produced in clusters. This allows the accurate fitting of curves that model the decay of search outputs over time.
In the present study, a curve-fitting procedure was used to distinguish between two possible reasons why depressed patients may be less efficient at retrieving words in a verbal fluency paradigm. One reason is cognitive slowness; that is, individuals with major depression may retrieve words slowly from memory and so total recall after 60 seconds is an underestimate of their ability. The other more interesting reason is accessibility-loss; depressed patients may have a biologically mediated dysfunction of the executive system that makes words in semantic memory less available. This hypothesis implies that there is an underlying neurocognitive basis to the retrieval failure that inhibits access to previously stored semantic information. Evidence for accessibility-loss on a letter fluency test would provide evidence that retrieval failure was more general than just a failure to recall specific autobiographical memories, and suggest a common biological pathway for all forms of retrieval failure in major depression.
To distinguish between these two explanations it is necessary to examine the time course of word retrieval. Rohrer and colleagues (1995, 1999) developed a model of the rate of retrieval of words from memory that can be used for this purpose. The model has two parameters: N, which is an estimate of the number of words that would be retrieved if a person were given enough time to recall all the exemplars of a category they have available, and tau, an estimate of the mean latency to retrieve each word after the first. In this model, an increased mean latency (tau) in the depressed group, combined with comparable values of N, would indicate the patients were taking longer to search for and retrieve each word from memory (i.e. support for cognitive slowness). Conversely, if N were significantly reduced in the presence of comparable tau values, this would suggest a reduced availability of words in response to letter cues (i.e. support for the accessibility-loss hypothesis). The distinction between these two hypotheses is illustrated in Figure 1a. In this diagram, the line AB represents the size of the verbal fluency deficit after 60 seconds, and shows that B could be reached by a depressed person whose word generation either has reached an asymptote or is much slower, but will eventually reach the same level as controls. Consideration of the parameters of these curves answers the question: If given enough time, can a depressed person produce as many words as a matched control?

Figure 1. (a) Idealised cumulative retrieval curves illustrating the effects of cognitive slowing (tau increased, N normal) and reduced accessibility (N reduced, tau normal). The difference between points A and B represents the impaired performance of individuals with major depression after 60 seconds on a verbal fluency test. The extension of the curves after 60 seconds shows the hypothesised outcome with no time limit. (b) The cumulative number of words generated for each 5-second period during the 90-second period, averaged across three trials, for depressed patients and controls.
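The contrast between the two hypotheses can be made concrete with a small numerical sketch. It assumes the exponential cumulative-recall form R(t) = N(1 − e^(−t/tau)) associated with Rohrer and colleagues; the parameter values and the helper names `cumulative_recall` and `n_matching` are hypothetical, chosen only for illustration, and are not taken from the study. Two hypothetical patients produce the same number of words at 60 seconds, but only the "slowed" one approaches the control asymptote when given more time.

```python
import math

def cumulative_recall(t, n, tau):
    """Cumulative words retrieved by time t (seconds), assuming R(t) = N(1 - exp(-t/tau))."""
    return n * (1.0 - math.exp(-t / tau))

def n_matching(target, t, tau):
    """Asymptote N that yields `target` words at time t for a given tau."""
    return target / (1.0 - math.exp(-t / tau))

# Hypothetical parameter values, for illustration only (not study data).
control = {"n": 30.0, "tau": 30.0}
# Cognitive slowing: same asymptote N as the control, longer mean latency tau.
slowed = {"n": 30.0, "tau": 60.0}
# Accessibility loss: same tau as the control, N lowered so that the
# 60-second score equals that of the slowed patient.
score_at_60 = cumulative_recall(60, **slowed)
reduced = {"n": n_matching(score_at_60, 60, control["tau"]), "tau": control["tau"]}

for label, params in (("control", control), ("slowed", slowed), ("reduced access", reduced)):
    at_60 = cumulative_recall(60, **params)
    at_600 = cumulative_recall(600, **params)
    print(f"{label:15s} words at 60 s: {at_60:5.1f}   words at 600 s: {at_600:5.1f}")
```

A total score at 60 seconds cannot separate the two hypothetical patients; only the fitted parameters (or a much longer test) can.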
In summary, the objective of this study was to determine whether the retrieval failure seen in overgeneral autobiographical memory in major depression generalised to the performance on another retrieval test, namely word generation on a letter fluency test. The rationale for this was that both autobiographical memory and letter fluency involve effortful retrieval in response to designated cues, and share a common underlying neural network that provides executive control of recall (Burianova et al., 2010). Evidence of a retrieval failure on this test would rule out the possibility that overgeneral memory is a purely psychological phenomenon since the letter fluency test does not involve recall of affect-laden material. Further, if the time-course analysis provided evidence for the accessibility-loss hypothesis, this would support a neurocognitive explanation for retrieval failure and reduce the likelihood that such deficits were the consequence of generalised cognitive slowing.
Methods
Participants
Consecutive patients with a primary diagnosis of major depression according to DSM-IV criteria (American Psychiatric Association, 1994), who were admitted to Hillmorton Hospital (Christchurch, New Zealand), were approached to take part in the study. Exclusion criteria included current significant alcohol or substance abuse or dependence, endocrinological, neurological or chronic medical conditions, pregnancy, previous serious head injury or electroconvulsive therapy in the 12 months prior to admission. Sixty-five depressed patients were eligible (42 female, 23 male), gave informed consent, and completed the study. When tested, 22 patients were prescribed no medication, 16 were on serotonin–norepinephrine reuptake inhibitors (SNRIs), 20 were on selective serotonin reuptake inhibitors (SSRIs), six were on tricyclic antidepressants (TCAs), and one was taking a monoamine oxidase inhibitor (MAOI). For 45 of the patients, this was a recurrent episode, and for 20 it was a single-episode presentation. The patients’ scores on the Montgomery–Asberg Depression Rating Scale (MADRS; Montgomery and Asberg, 1979) ranged from 15 to 51 (M = 35.85, SD = 8.63). The proportion of unipolar to bipolar depressed patients was 58:7, non-psychotic to psychotic features was 58:7, and non-melancholic to melancholic features was 31:34. The mean age of onset of depression was 30.3 years (SD = 11.5), the mean time since onset was 8.9 years (SD = 8.7), and the mean number of hospitalisations was 0.7 (SD = 1.4). Comorbid anxiety disorder was present in 21% (n = 14) of the depressed sample, with the most prevalent anxiety disorders being post-traumatic stress disorder (n = 8; 12.3%) and panic disorder with agoraphobia (n = 7; 10.7%).
Fifty healthy controls (32 female, 18 male) were recruited from the general population in Christchurch with the same exclusion criteria. In addition, healthy controls were excluded for a personal or immediate family history of major mental illness (screened using the Mini International Neuropsychiatric Interview (MINI); Sheehan et al., 1998). Demographic characteristics of the two groups are given in Table 1. There were no significant differences between the groups in age (t113 = 0.7), years of high school education (t113 = 1.8), years of tertiary education (t113 = 0.4), sex (χ2 = 3.6), and estimated verbal IQ (t113 = 0.5), as measured using the National Adult Reading Test (NART; Nelson, 1982). The study was approved by the National Health and Disability Ethics Committee (New Zealand).
Table 1. Demographic characteristics of depressed (n = 65) and control (n = 50) groups.
Measures
All participants completed a letter fluency test (Benton and Hamsher, 1989), in which they were instructed to produce as many words as possible in 90 seconds for each of the letters C, F and L. This letter fluency test was part of a battery of tests administered during a larger study of changes in neuropsychological function in response to treatment for major depression. Other tests administered included the Rey Auditory-Verbal Learning Test, the computerised Stroop Test, the Groton Maze Learning Test, and measures of facial emotion processing and psychomotor speed (see Douglas et al., 2011 for a description of the test protocol). Neuropsychological findings from this battery of tests have been published elsewhere, including overall words generated on the letter fluency test (Douglas et al., 2011). This paper examines multiple variables from the letter fluency test in more detail.
Procedure
All participants were tested individually by the same investigator (KD). Participants’ responses were audio-taped, and the number of responses in each 5-second period was determined and summed across the three letters. For the subsequent analysis, the cumulative totals for each 5-second period were calculated and an exponential equation (Rohrer et al., 1995, 1999) was used to fit curves to the data of each individual:
$$R(t) = N\left(1 - e^{-t/\tau}\right)$$
In this equation, t denotes time and R(t) is the cumulative number of words generated at time t. N is an estimate of the total number of words that would be generated if the participant was given unlimited time. Tau is the average of the difference in time between the first word generated and each subsequent word. The between-group comparisons of the estimates of N and tau were based on the values derived from the curves for each individual.
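The per-individual curve fit described above can be sketched as follows. This is a minimal illustration, not the procedure the authors ran: it assumes the exponential form R(t) = N(1 − e^(−t/tau)), takes hypothetical word-onset times in seconds, and uses ordinary least squares. Because the model is linear in N for a fixed tau, N is solved in closed form and tau is found by a coarse grid followed by a golden-section refinement. The function names and the search bounds are assumptions.

```python
import math

def cumulative_counts(word_times, bin_width=5, duration=90):
    """Cumulative number of words produced by the end of each time bin."""
    edges = list(range(bin_width, duration + 1, bin_width))
    return edges, [sum(1 for w in word_times if w <= e) for e in edges]

def fit_n_given_tau(times, counts, tau):
    """Least-squares N for a fixed tau, plus the resulting sum of squared errors."""
    f = [1.0 - math.exp(-t / tau) for t in times]
    n = sum(c * fi for c, fi in zip(counts, f)) / sum(fi * fi for fi in f)
    sse = sum((c - n * fi) ** 2 for c, fi in zip(counts, f))
    return n, sse

def fit_retrieval_curve(times, counts, tau_min=1, tau_max=300):
    """Estimate (N, tau): coarse 1-second grid over tau, then golden-section refinement."""
    def sse(tau):
        return fit_n_given_tau(times, counts, tau)[1]

    best = min(range(tau_min, tau_max + 1), key=sse)
    a, b = max(float(tau_min), best - 1.0), best + 1.0
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(60):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    tau = (a + b) / 2
    return fit_n_given_tau(times, counts, tau)[0], tau

# Hypothetical word-onset times (seconds) for one participant, illustration only.
example_times = [1.2, 2.5, 4.0, 6.1, 8.3, 11.0, 14.8, 19.5, 25.0, 33.2, 41.0, 55.6, 72.4]
bin_edges, totals = cumulative_counts(example_times)
n_hat, tau_hat = fit_retrieval_curve(bin_edges, totals)
print(f"N = {n_hat:.1f} words, tau = {tau_hat:.1f} s")
```

The fitted N and tau from each participant's curve would then be the values entered into the between-group comparisons.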
Data analysis
Statistical procedures were conducted using the Statistical Package for the Social Sciences (SPSS), version 13 for Windows. For demographic and clinical data, categorical variables were analysed with chi-squared tests and continuous variables with parametric t-tests. Differences between verbal fluency variables were examined using parametric t-tests, and estimates of effect size were calculated using the formula $d = (\bar{X}_{\text{depressed}} - \bar{X}_{\text{control}})/S_{\text{pooled}}$. In order to determine whether antidepressant medication was influencing the findings, a series of one-way analyses of variance (ANOVA), with the medication regimen (none, SNRI, SSRI and TCA) as the between-subjects factor, were conducted on demographic and verbal fluency variables. Pearson’s correlations (two-tailed) were computed between demographic and clinical variables of the depressed group and the verbal fluency variables N and tau.
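The effect-size formula can be sketched in a few lines. The text does not state how the pooled standard deviation was computed; the sketch below assumes the usual degrees-of-freedom-weighted pooling of the two sample variances, and the function name `cohens_d` and the example values are hypothetical.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardised mean difference (mean_a - mean_b) / pooled SD.

    Assumes the pooled SD weights each sample variance by its degrees of freedom.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical scores, for illustration only (not study data).
print(cohens_d([2, 4, 6], [1, 3, 5]))
```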
Results
Table 2 shows the means and standard deviations of the results from the letter fluency test for depressed and control groups. There was a significant difference between groups for the total number of words generated after 60 seconds, averaged across the three letters (t113 = 4.8, p < 0.001). For the 90-second cumulative data (see Figure 1b), the between-group difference was significant for N (t113 = 4.7, p < 0.001) and for tau (t113 = 2.1, p < 0.05). The effect size for N (d = 0.81) was substantially greater than for tau (d = 0.36).
Table 2. Verbal fluency scores and parameter estimates for depressed (n = 65) and control (n = 50) groups.
N is an estimate of the total number of words that would be generated if the respondent was given unlimited time and tau is the average of the difference in time between the first word generated and each subsequent word.
VF score: verbal fluency score; SSRI: selective serotonin reuptake inhibitors; SNRI: serotonin–norepinephrine reuptake inhibitors; TCA: tricyclic antidepressants.
The possibility that retrieval in the verbal fluency test was influenced by the effects of medication was examined by comparing subgroups within the depressed sample that were on different treatment regimens or were not on any medication. Equivalence of the treatment groups was examined with a series of one-way ANOVAs with one between-subjects factor: medication regimen (none, SNRI, SSRI and TCA). There was no difference (F3,60 = 0.3, p = 0.8) between these subgroups in depression severity (MADRS score: no medication, M = 35.4; SSRI, M = 36.3; SNRI, M = 36.2; and TCA, M = 33.8). With α = 0.05, there were no significant differences between the medication groups on any of the demographic variables in Table 1. One-way ANOVAs revealed that on the verbal fluency variables in Table 2, there was no significant subgroup effect for N (F3,60 = 1.27), tau (F3,60 = 0.43), or total verbal fluency score after 60 seconds (F3,60 = 2.17; all p > 0.10). There was no evidence that differences in retrieval efficiency between groups were a medication effect.
Correlations were computed between N and tau, and a number of the demographic and clinical variables. MADRS ratings were not correlated with either N or tau (r < 0.05 in each case); however, as might be expected, estimated verbal IQ (NART) was correlated with N (r = 0.38, p < 0.001) but not with tau in the total sample, and with both N (r = 0.46, p < 0.001) and tau (r = 0.26, p < 0.05) in the depressed group. There were no significant correlations between the parameter estimates and age. In addition, there were no significant differences between unipolar and bipolar patients, or between patients with and without comorbid anxiety disorders, on either N or tau (t < 0.1, p > 0.5 in all cases).
Discussion
The underlying rationale for this study was that both autobiographical memory and letter fluency tests, although appearing to be quite different tests, have in common the effortful retrieval of specific knowledge that can reasonably be assumed to have been acquired premorbidly, using a shared functional neural network (Burianova and Grady, 2007; Burianova et al., 2010; McIntosh, 1998). The primary outcome of this study is evidence for accessibility-loss on a letter fluency test, which, combined with previous reports of overgeneral memory in major depression, implies a generalised failure of the executive processes controlling retrieval from memory in depressed patients. Time-course analysis showed that the deficit in the retrieval of words from long-term memory in letter fluency cannot be explained by cognitive slowing alone. The significant difference between depressed and control groups on N, which was larger than the difference between groups on tau, suggests that even if given an extended period of time in which to respond to compensate for generalised cognitive slowness, the depressed group would not retrieve as many words as the control group. Of significance was that since word retrieval does not involve access to the kind of personal memories that may invoke an emotional response in depressed patients, functional avoidance or negative distraction do not provide a satisfactory explanation for retrieval impairment. Instead, the deficit is likely to be a function of changes in the neural activity in regions of the brain associated with the performance of verbal fluency tests (Baldo et al., 2001; Birn et al., 2010; Henry and Crawford, 2004; Pujol et al., 1996). If retrieval failure had been entirely a consequence of, for example, functional avoidance, it would be seen on autobiographical tests but not on word retrieval; instead, it is seen on both tests. 
Finally, reduced efficiency on a letter fluency test, as on tests of autobiographical specificity, cannot be attributed to a failure of encoding since most words, like memories from early life, are acquired and stored premorbidly, during childhood and adolescence. The possibility that retrieval dysfunction might be the consequence of medication was considered by comparing those participants who were on no medication with those treated with SNRIs, SSRIs, or TCAs; there was no support for this proposal.
The present study strengthens the view that memory deficits seen in depressed patients have their origin in changes in the functioning of neural networks that coordinate complex cognitive abilities. More specifically, the current finding of retrieval failure in patients with major depression on a letter fluency test provides evidence that the interesting phenomenon of overgeneral memory (Williams and Dritschel, 1988; Williams et al., 1996, 2007) may have its origin in a generalised reduction in the efficiency of the executive control mechanisms mediated by a complex neural network distributed in the frontal and temporal lobes. Further evidence that functional neural changes explain overgeneral memory comes from Young et al. (2011), the first report of the use of imaging technology to study autobiographical memory in patients with major depression. This study confirmed the tendency of patients to report fewer specific and more general memories than controls and, more significantly, showed with fMRI differential activity in the medial temporal and prefrontal structures associated with retrieval. This evidence is consistent with a neurological basis for the overgeneral memory phenomenon in major depression; however, this conclusion is limited to personal memories, many of which are likely to be emotionally charged. Even if there are differences in neural activation during autobiographical recall, these could be a function of top-down psychological control (i.e. be the consequence of functional avoidance), and not reflect generalised or persistent biological changes. To date, comparable studies of depressed patients, assessing neural activation using fMRI, and employing neutral recall stimuli or verbal fluency tests, have not been reported. There is, however, evidence of impaired executive control that comes from other imaging studies.
A positron emission tomography study of depressed patients performing a verbal fluency test found abnormalities of neural activity in the prefrontal cortex that the authors concluded were suggestive of inefficient functional connectivity of frontal and cingulate networks rather than changed activation (Videbech et al., 2003). Similar findings have been reported for working memory tests (Vasic et al., 2009), and from studies of hippocampal functioning during episodic memory (Fairhall et al., 2010).
One possible limitation of this study is that the depressed group were all hospitalised, and the results may not generalise to samples recruited from community treatment centres. Typically, inpatient samples report higher levels of depressive symptomatology and show greater cognitive deficit, with evidence being mixed as to whether severity of depressive symptomatology and cognitive deficit are related (McDermott and Ebmeier, 2009; Naismith et al., 2003). While the potential link between depression severity and cognitive deficit was not the focus of this study, analyses suggested that severity of depression was not significantly correlated with deficits in measures of verbal fluency. A further limitation of this study was that patients were a relatively heterogeneous group of individuals with major depression, with differences in depression subtype (bipolar versus unipolar), comorbid anxiety and medication status. It is possible that these factors contributed to the significantly different verbal fluency parameters between the depressed and control groups in this study. Our analyses did not find evidence of this; however, small numbers in these comparisons reduced the power to show any meaningful differences between depression subtypes (e.g. bipolar versus unipolar depression, and with or without comorbid anxiety). Future research directly investigating the association between these variables and verbal fluency in more depth would be helpful. It also remains to be determined whether this retrieval failure persists in remission, or is a function of the depressed state. Nevertheless, the results from this analysis of the verbal fluency performance of patients with major depression showed a consistent retrieval deficit, suggesting that a general impairment in the retrieval of specific memories may be a feature of major depression.
Footnotes
Acknowledgements
We thank the patients and staff at Hillmorton Hospital, Christchurch for their cooperation in this study.
Funding
We thank the New Zealand Tertiary Education Commission for their financial support by providing KD with a Bright Future Doctoral Scholarship for this study.
Declaration of interest
The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.
