Am J Psychiatry 165:6, June 2008 731
Article
ajp.psychiatryonline.org
This article is featured in this month’s AJP Audio.
Toxic Effects of Depression on Brain Function: Impairment
of Delayed Recall and the Cumulative Length of Depressive
Disorder in a Large Sample of Depressed Outpatients
Philip Gorwood, M.D., Ph.D.
Emmanuelle Corruble, M.D.,
Ph.D.
Bruno Falissard, M.D., Ph.D.
Guy M. Goodwin, D.Phil.,
F.Med.Sci.
Objective: An important current hypoth-
esis suggests that the relationship be-
tween severe depression and the hippo-
campus is essentially toxic. The purpose
of this study was to assess the generaliz-
ability of the impact of depression on
hippocampal function.
Method: Participants were 8,229 outpa-
tients who 1) fulfilled DSM-IV criteria for
major depressive disorder based on clini-
cal assessment and 2) were tested for de-
layed recall, a memory function that is
particularly related to hippocampal integ-
rity in humans, during two visits several
weeks apart.
Results: As expected, at presentation
with depression, the subjects’ current ill-
ness severity was the major determinant
of performance, as opposed to the inten-
sity of their previous depressive history
(the number and length of past episodes).
However, following clinical response at
the second visit, the length of previous de-
pressive history became more significant
than current symptoms. The following fac-
tors had significant, independent impact:
age, education level, and profession.
Conclusions: Previous studies of small
samples assessed for memory function,
more or less specific to the hippocampus,
have shown great variability in age, gen-
der, education level, and the length and
intensity of depressive episode. Hence, a
very large sample was required to disen-
tangle the central effect of previous de-
pressive history. As demonstrated in a
general practice sample in this study, the
hypothesis that the length of past depres-
sion impairs memory performance is sup-
ported, suggesting that there is a toxic
link between the burden of depression
and cognition. This finding has important
implications for public health.
(Am J Psychiatry 2008; 165:731–739)
It is a widely accepted current hypothesis that the rela-
tionship between severe depression and the hippocampus
is essentially toxic. Hence, the more intense the history of
depression, the smaller the hippocampus. However, there
is currently little evidence to consider this to be a general
effect that is present across the spectrum of depression di-
agnoses, which are regarded as highly heterogeneous. This
is particularly true in primary care, in which there is often
skepticism that depression represents anything more than
normal human distress. Accordingly, we have sought to
provide evidence of the generalizability of depression’s
impact on hippocampal function. A large sample size
would be required to account for many potentially major
confounds, with an assessment that could be repeated be-
fore and after treatment in order to control for the impact
of the acute depressive state and provide a simple assay of
hippocampal function.
The majority of studies linking the hippocampus and
major depressive illness have employed quantitative
structural magnetic resonance imaging (MRI), functional
MRI (fMRI), or positron emission tomography (PET),
methods that are not easy to access in ordinary clinical
practice. However, hippocampal size has usually corre-
lated inversely with illness duration. This was demon-
strated in a chronically depressed sample (1) and in pa-
tients with variable illness durations seen in secondary
care (2). Hippocampal size may also be related to other
measures of illness intensity, such as the number of past
hospitalizations (3) and recurrence of the disorder (4).
Moreover, hippocampal abnormalities have been ob-
served in the early years after illness onset (5). Meta-anal-
yses have confirmed hippocampal volume reduction, and
the total number of depressive episodes may be particu-
larly correlated with right hippocampal volume (6, 7).
The samples in imaging studies have often been rather
small and potentially unrepresentative of the majority of
outpatients with depression. We have been interested in
measures of cognitive function that assay the function of
the hippocampus but are much easier to conduct on a
very large scale in everyday medical care. The choice of
measures can be informed by imaging evidence of funda-
mental hippocampal involvement. Thus, activation of the
hippocampus has been observed with tasks such as word-
stem completion (8–10), success of word retrieval (11–14),
emotional valence (15), and encoding (16–22). Even more
consistent hippocampal activations have been shown in
healthy subjects with a paragraph encoding task (23) that
involves the encoding of complex and integrated informa-
tion, which is hypothesized to be a core role for the hip-
pocampus (24) and classically impaired in patients with
known hippocampal lesions. Thus, there is considerable
evidence to suggest that delayed paragraph recall is par-
ticularly related to hippocampal function in humans (e.g.,
9, 25, 26).
While memory impairment may reasonably be taken to
assay hippocampal function, its relationship to any current
depressive episode has at least two (nonexclusive) aspects.
1) If the characteristics of the present depressive episode
(such as symptom severity) predominate, memory func-
tion is likely to be simply a “state marker,” reflecting the di-
rect cognitive impact of current mood. 2) Alternatively, if
the dominant factor is a lifetime cumulative impact of
mood disorder (such as the total length or number of past
episodes), memory impairment will act as a “trait marker”
of enduring toxic effects of depression on brain function.
In depressed individuals, both would potentially contrib-
ute. In recovered individuals, deficits would most likely re-
flect the enduring brain changes seen with brain imaging.
The present study is of a large sample of outpatients
who fulfilled DSM-IV criteria for major depressive disor-
der and were assessed for memory function during two
visits several weeks apart. We anticipated that some clini-
cal factors, such as mood state markers (the length and se-
verity of current depressive episode), choice of antide-
pressant, and variation of testing methodology in a
nonspecialist setting (as well as other factors such as older
age, lower education level, and unemployment), would
potentially confound the results at both visits. We hypoth-
esized that the length and number of past episodes would
also be involved in correct delayed recall but were much
more likely to be discernible at a second clinic visit after
symptom resolution, when the severity of current depres-
sion would be reduced.
Method
Participants
A total of 4,849 medical doctors in France were contacted by mail
and asked to participate in a short-term follow-up protocol of de-
pressed patients. Of these, 3,375 physicians (69.6%) agreed to par-
ticipate. At least two contacts (usually via telephone) were made to
each participating investigator: at the beginning of the protocol
(in order to ensure that the protocol was clear and to explain how
to assess memory recall) and at the close of trial entry (in order to
verify the data received). By the end of the study, 1,844 clinicians
(38.0%) had included at least one patient (a maximum number of
five patients was requested to avoid center effects), who was fol-
lowed up with a delay of at least 6 weeks between the two visits.
The participating clinicians were experienced (mean age=49.9
years [SD=11.8]), with 7.8% practicing in a hospital, 46.6% in pri-
vate practice, and 45.6% in a group practice.
Clinicians were asked to include consecutive patients for
whom a new (or different) antidepressant had to be prescribed for
a major depressive episode. In addition, the patients were re-
quired to be older than 18 years, speak fluent French, possess a
social security number, and give informed consent. Patients had
to be included by their clinicians during a 3-month interval. After
complete description of the study was given to the subjects, writ-
ten informed consent was obtained. A total of 9,515 patients were
included in the study.
Exclusion criteria were a diagnosis of bipolar disorder and the
use of a mood stabilizer during treatment. All antidepressants (in
accordance with the French Food and Drug Administration) were
accepted in order to reflect usual clinical practice. Any change of
antidepressant, an increase in the dosage, or the addition of a
benzodiazepine was recorded at the second visit.
Instruments
The Hospital Anxiety and Depression Scale was chosen as a self-
report instrument to measure symptom severity because of its ra-
pidity and simplicity of rating. The scale was completed by all pa-
tients at the first and second visits. A score above 8 for the depres-
sion domain was required as an inclusion criterion. Hospital
Anxiety and Depression Scale anxiety scores (score above 8 also re-
quired for inclusion) were analyzed separately because anxiety
may also have an impact on hippocampal volumes, according to
animal models (27, 28) as well as studies in humans (29).
The criteria for a major depressive episode were examined by
the clinician, and the duration of each symptom was recorded
during the two face-to-face visits. The presence of five or more
symptoms (i.e., a DSM-IV diagnosis of major depressive disorder)
was required for inclusion. The initial assessment also included
the number of past depressive episodes, either treated or not
treated with an antidepressant, and the cumulative length of past
mood disorder.
The delayed paragraph recall index from the Wechsler Memory
Scale–Revised (30) was employed as a valid (8), sensitive (31, 32)
measure of verbal declarative memory and a surrogate marker of
hippocampal function. This subtest was administered according
to the standardized protocol outlined in the Wechsler Memory
Scale–Revised Manual (1987). A different story at each visit was
read aloud to the subject by the clinician. After hearing the story,
the subject was asked to repeat it using as many of the same
words as he or she could recall from memory. One point was given
for each verbatim or acceptable alternative response phrase. After
at least a 10-minute delay (when the subject was distracted to
complete the Hospital Anxiety and Depression Scale and provide
other information), the subject was asked to recall the story again.
The clinician tabulated the scores for each story as well as the to-
tal score for the immediate and delayed trials.
The existence of a practice effect was indirectly assessed in the
subgroup of patients with identical levels of depression at both
visits (the same Hospital Anxiety and Depression Scale global
scores and number of DSM-IV symptoms of major depressive episodes). In this subgroup of 33 patients, the mean difference in the number of correct delayed recall responses between the two visits (mean=0.029 [SD=3.6967]) was not significantly different from 0 (t=0.046, p=0.963).
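As a rough check on the reported statistic, the one-sample t value can be recomputed from the summary figures given above (mean difference, SD, and subgroup size). This is a minimal sketch rather than the authors' code; the function name is illustrative, and the result differs slightly from the reported value because the published mean is rounded.

```python
import math


def one_sample_t(mean_diff: float, sd_diff: float, n: int) -> float:
    """t statistic for testing whether a mean paired difference equals 0."""
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error


# Summary values reported in the text for the 33 patients with
# identical depression levels at both visits.
t_value = one_sample_t(mean_diff=0.029, sd_diff=3.6967, n=33)
print(round(t_value, 3))  # close to the reported t=0.046
```

A t value this close to 0 on 32 degrees of freedom is consistent with the reported absence of a detectable practice effect.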
The level of memory impairment can be stratified according to
the original description by Russel (33), who suggested that de-
layed recall can be classified, out of the initial 25 elements, as fol-
lows: 24 to 25=“better than normal,” 20 to 23=“normal,” 15 to 19=
“mild,” 9 to 14=“mild to moderate,” 4 to 8=“moderate to severe,”
and 0 to 3=“severe.” An age-corrected standard score can be com-
puted on the basis of the Wechsler delayed paragraph recall in-
dex, but we preferred to treat age as one of several potential con-
founding factors in our analysis. Nevertheless, age-corrected
versus age-uncorrected scores of the Russel classification were
very similar and identified the same group three standard devia-
tions below the mean (kappa=0.829, var[kappa]<0.001).
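The score bands quoted above can be expressed as a simple lookup. The sketch below only encodes the cut points listed in the text; the function name and the error handling are illustrative assumptions, not part of the original classification.

```python
def russel_category(correct_elements: int) -> str:
    """Map a delayed recall score (0 to 25 elements) to the band quoted in the text."""
    if not 0 <= correct_elements <= 25:
        raise ValueError("delayed recall score must be between 0 and 25")
    # (lower bound of the band, label), checked from the highest band down
    bands = [
        (24, "better than normal"),
        (20, "normal"),
        (15, "mild"),
        (9, "mild to moderate"),
        (4, "moderate to severe"),
        (0, "severe"),
    ]
    for lower_bound, label in bands:
        if correct_elements >= lower_bound:
            return label
    raise AssertionError("unreachable: every valid score falls in a band")


print(russel_category(10))  # mild to moderate
```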
Statistics
Variables were examined for the normality of distribution be-
fore using parametric statistics. Given the sample size, rejection
of a normal distribution (Kolmogorov-Smirnov test) for the fac-
tors analyzed was expected for all variables. However, even if sig-
nificantly different from 0 (p<0.001), the statistic was small for
major covariables such as age (0.048), Hospital Anxiety and De-
pression Scale depression score at visit 1 (0.085) and visit 2
(0.057), Hospital Anxiety and Depression Scale anxiety score at
visit 1 (0.075) and visit 2 (0.089), and the number of correct delayed recall responses at visit 1 (0.076) and visit 2 (0.073). The normality of the distribution of delayed recall (at the first and second visits) was also assessed graphically: Q-Q (quantile-quantile) plots showed that the dependent variables were close enough to the normal distribution.
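The Kolmogorov-Smirnov statistic is the largest vertical gap between the empirical cumulative distribution of a variable and the reference (here normal) cumulative distribution. With thousands of observations, even a small gap is statistically significant, which is why the size of the statistic is more informative than its p value. The sketch below illustrates the computation on simulated data only (not study data); fitting the normal distribution to the sample mean and SD is an assumption, since the text does not say how the reference distribution was specified.

```python
import random
from statistics import NormalDist, mean, stdev


def ks_statistic_vs_normal(values):
    """Largest gap between the empirical CDF and a normal CDF fitted to the data."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


random.seed(0)
# Simulated, hypothetical samples used only to show how the statistic behaves.
normal_sample = [random.gauss(0.0, 1.0) for _ in range(5000)]
skewed_sample = [random.expovariate(1.0) for _ in range(5000)]

d_normal = ks_statistic_vs_normal(normal_sample)
d_skewed = ks_statistic_vs_normal(skewed_sample)
print(round(d_normal, 3), round(d_skewed, 3))
```

The normally distributed sample gives a statistic near 0, whereas the strongly skewed sample gives a much larger one.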
Pearson correlation was used to assess the association between two continuous variables, and analysis of variance was used to test the effect of a categorical factor on continuous variables. Since some parameters directly influenced correct recall in our sample (such as age and Hospital Anxiety and Depression Scale scores), we further analyzed the correlation between the number of potential recall responses and past episodes (their nonindependence being our main hypothesis) within different ranges of these parameters.
Structural equation modeling was performed with SPSS and SAS PROC CALIS to disentangle the respective roles of clustered variables, including state-dependent variables (such as the
number of symptoms and Hospital Anxiety and Depression Scale
scores) and state-independent variables (such as the total length
of depressive history and number of past major depressive epi-
sodes) (34).
Patients with “severe” impairment (according to the Russel [33]
classification) are at higher risk of neurological defects, and thus
structural equation modeling analyses at the first and second vis-
its were conducted, omitting this subgroup of patients. A further
structural equation modeling analysis separated anxiety and de-
pression scores on the Hospital Anxiety and Depression Scale and
omitted the number of DSM-IV symptoms in order to avoid rein-
forcing the role of depression versus anxiety.
Last, we further investigated our main finding, the relationship
between delayed recall and the number of past major depressive
episodes in patients with treatment response, using a linear re-
gression analysis.
Results
Sample
A total of 9,515 depressed patients entered the study. The
final sample for analysis consisted of 8,229 patients
(86.48%). Subjects were excluded if the Hospital Anxiety
and Depression Scale score was below 8 for depression or
anxiety (644 patients) or data characterizing the patient
were not correctly or completely saved. The subsample of
subjects excluded 1) had fewer DSM-IV symptoms of de-
pression (t=4.25, df=9512, p<0.0001); 2) had a shorter length
of the present episode (t=2.15, df=6798, p=0.0159); 3) were
more frequently men (χ2=172.9, df=8, p<0.0001); and 4) had
better final Hospital Anxiety and Depression Scale scores
for depression (t=3.974, df=9512, p<0.0001) and anxiety (t=
5.212, df=9512, p<0.0001), and thus they were more fre-
quently responders (χ2=3.71, df=1, p=0.0124).
Women comprised 70.37% of the final sample, and the
average age of the sample was 48.02 years (SD=14.09). In
this population, 1,115 patients were 65 years old or older,
and 407 patients were over 75 years old. Civil status among
these patients was as follows: married, 48.37%; single,
15.41%; divorced, 15.57%; and widowed, 7.31%. Education
levels (middle=high school graduate) were as follows: low,
49.10%; middle, 30.06%; and high, 20.84%. Employment
status was as follows: active employment, 57.86%; unemployed, 12.43%; retired, 19.07%; student, 0.11%; and another type of professional activity, 10.53%.

TABLE 1. Clinical and Cognitive Characteristics of Patients With Major Depressive Disorder Before and After Antidepressant Treatment

| Characteristic | Visit 1 Mean | Visit 1 SD | Visit 2 Mean | Visit 2 SD | t | df | p |
|---|---|---|---|---|---|---|---|
| Number of DSM-IV criteria for depression | 6.62 | 1.14 | 3.25 | 1.77 | 142.86 | 16,456 | <0.0001 |
| Hospital Anxiety and Depression Scale score for depression | 14.62 | 3.24 | 9.94 | 4.22 | 79.83 | 16,456 | <0.0001 |
| Hospital Anxiety and Depression Scale score for anxiety | 14.12 | 3.01 | 8.92 | 3.40 | 103.86 | 16,456 | <0.0001 |
| Number of correct delayed recall responses | 9.98 | 4.57 | 12.04 | 4.85 | 27.63 | 16,456 | <0.0001 |

TABLE 2. Level of Memory Impairment for Delayed Recall in Patients With Major Depressive Disorder Before Visit 1 and 6 Weeks After Antidepressant Treatment (visit 2) (a)

| Classification | Visit 1 (N=8,221) N | Visit 1 % | Visit 2 (N=8,221) N | Visit 2 % | Visit 2 Responders (N=1,895) (b) N | Visit 2 Responders % |
|---|---|---|---|---|---|---|
| Better than normal (24–25 elements) | 19 | 0.24 | 74 | 0.93 | 26 | 1.37 |
| Normal (20–23 elements) | 180 | 2.25 | 520 | 6.52 | 162 | 8.55 |
| Mild (15–19 elements) | 1,239 | 15.47 | 1,961 | 24.61 | 554 | 29.23 |
| Mild to moderate (9–14 elements) | 3,267 | 40.80 | 3,279 | 41.13 | 754 | 39.79 |
| Moderate to severe (4–8 elements) | 2,793 | 34.88 | 1,979 | 24.82 | 340 | 17.94 |
| Severe (0–3 elements) | 510 | 6.37 | 160 | 2.01 | 14 | 0.74 |

(a) According to Russel classification (33).
(b) Subsample of patients after 6 weeks of treatment who had treatment response (i.e., at least a 50% decrease of their initial Hospital Anxiety and Depression Scale score for depression and fewer than five diagnostic criteria).
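The degrees of freedom reported in Table 1 (16,456) equal 2 x 8,229 - 2, which is what an independent two-sample t test with 8,229 patients per visit would give. Assuming that this is the test behind the table (the text does not state it), the t values can be approximately recomputed from the published means and SDs. The sketch below is illustrative only; small differences from the table are expected because the published summary values are rounded.

```python
import math


def two_sample_t(mean1, sd1, mean2, sd2, n_per_group):
    """Two-sample t statistic (equal group sizes, pooled variance) and its df."""
    # With equal n, the pooled standard error reduces to sqrt((sd1^2 + sd2^2) / n).
    standard_error = math.sqrt((sd1 ** 2 + sd2 ** 2) / n_per_group)
    t = abs(mean1 - mean2) / standard_error
    df = 2 * n_per_group - 2
    return t, df


# Hospital Anxiety and Depression Scale depression score, visit 1 vs. visit 2 (Table 1).
# The group size of 8,229 per visit is an assumption based on the final sample size.
t_hads_dep, df = two_sample_t(14.62, 3.24, 9.94, 4.22, n_per_group=8229)
print(round(t_hads_dep, 1), df)  # close to the reported t=79.83 with df=16,456
```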
The Hospital Anxiety and Depression Scale score for de-
pression was below 11 in only 10.48% of patients. The du-
ration of current major depressive episode was on average
8.37 weeks (SD=10.84). Among the patients, 18.38% ful-
filled five DSM-IV criteria for depression, 31.82% fulfilled
six criteria, 27.76% fulfilled seven criteria, 15.44% fulfilled
eight criteria, and 6.60% fulfilled nine criteria (Table 1).
This was the first episode of depression for 49.64% of pa-
tients, the second episode for 25.30%, the third episode for
6.05%, and between the fourth and 13th episode for the re-
maining patients (4.67%). The length of unipolar depres-
sion, except the present episode, was on average 12.55
weeks (SD=34.91), and the lifetime duration (or number of
weeks depressed) was 21.23 weeks (SD=33.91).
The second visit was on average 42 days (SD=8.93) after
the first visit (range=3–20 weeks). At the second visit,
the number of depressive symptoms from the list of DSM-
IV criteria decreased (Table 1). According to these criteria,
76.39% of patients no longer fulfilled the diagnosis of major depressive episode (i.e., fulfilled fewer than five criteria).
According to Hospital Anxiety and Depression Scale self-
rating, 76.44% of patients were responders (i.e., patients
who had at least a 50% decrease of the depression score
between the two visits).
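The responder rule stated above (at least a 50% decrease of the depression score between the two visits) can be written as a short check. This is a sketch with hypothetical function and parameter names; the optional second condition reflects the stricter definition given in the footnote to Table 2 (fewer than five diagnostic criteria at the second visit).

```python
from typing import Optional


def is_responder(
    hads_dep_visit1: float,
    hads_dep_visit2: float,
    n_criteria_visit2: Optional[int] = None,
) -> bool:
    """True if the depression score fell by at least 50% between the two visits."""
    if hads_dep_visit1 <= 0:
        raise ValueError("baseline depression score must be positive")
    relative_decrease = (hads_dep_visit1 - hads_dep_visit2) / hads_dep_visit1
    responder = relative_decrease >= 0.5
    # Optional stricter rule: also require fewer than five DSM-IV criteria at visit 2.
    if n_criteria_visit2 is not None:
        responder = responder and n_criteria_visit2 < 5
    return responder


print(is_responder(14, 7))      # True: exactly a 50% decrease
print(is_responder(14, 8))      # False: decrease is below 50%
print(is_responder(14, 7, 5))   # False under the stricter rule
```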
The number of correct answers for immediate recall of
the Wechsler paragraph recall index tested at the first visit
was between one and 24, with an average of 11.75 appro-
priate answers (SD=4.72) (see the table in the data supple-
ment accompanying the online version of this article). The
delay between immediate and delayed recall was on aver-
age 14.15 minutes (SD=7.73; range=5–56). Accordingly, the
delay between immediate and delayed recall was intro-
duced into multivariate analyses, as was the delay be-
tween the first and second visits. The patients recalled ap-
proximately 10 correct details of the paragraph (i.e.,
85.89% of the initial immediate recall) (Table 1).
During the second visit, a global improvement of immediate recall was observed (13.24 correct answers [SD=4.79; range=1–24]) (see the table in the data supplement
accompanying the online version of this article). On aver-
age, 13.58 minutes later