Abstract
In therapy, clients sometimes repeatedly recall their traumatic memories to, among other things, resolve the incoherence said to underlie distress. But the literature is silent on the extent to which people’s traumatic and nontraumatic memories cohere over repeated recall compared with similar “control” memories not repeatedly recalled. We asked people to watch two films portraying traumatic or nontraumatic events and then to repeatedly describe their memory for one of those films over 5 days. Our data suggest repeatedly recalling traumatic and nontraumatic memories prevents the loss of coherence that occurs when memories are not repeatedly recalled. There was little evidence of incoherent traumatic memories or of a relation between the coherence of traumatic memories and posttraumatic-stress-disorder (PTSD) symptoms. These findings suggest that when it comes to PTSD, the effectiveness of exposure therapy is not easily explained by the notion that therapy resolves incoherent traumatic memories.
People’s memories for their traumatic experiences are sometimes portrayed as fragmented, jumbled, and recalled in bits and pieces, a feature attributed to a unique encoding process (Ehlers & Clark, 2000; Foa & Riggs, 1993; Sachschal et al., 2019; but see also Fishere & Habermas, 2023; Rubin et al., 2016; Taylor et al., 2020, 2022). According to this view, the incoherence is not benign—it is a symptom of incomplete encoding and contributes to both the development and maintenance of posttraumatic stress disorder (PTSD; Brewin et al., 2010; Ehlers & Clark, 2000; Foa et al., 2006). But the incoherence and associated distress resolve over the course of therapy as “dissociative amnesia” fades and people develop a coherent narrative of the experience (Foa et al., 1995).
In fact, some long-standing therapeutic procedures explicitly target incoherent traumatic memories by having clients repeatedly recall their memories in specific ways to help resolve incoherence (Ehlers et al., 2005; Foa & Rothbaum, 1998; Grand, 2013; F. Shapiro, 1989). Clearly, then, the idea that repeated recall turns incoherent traumatic memories into coherent traumatic memories has important implications for an understanding of theory, mechanism, and practice. For example, in one prevalent form of treatment for PTSD—prolonged-exposure therapy (Foa et al., 2019; Foa & Rothbaum, 1998)—clients repeatedly recall the memory of their traumatic experience in a particular way so that they habituate to it. Over the course of this “imaginal exposure,” clients are encouraged to recall the memories vividly and in the present tense, provide as much detail as possible, and focus on sensory details, including what they are seeing, hearing, feeling, and thinking (Foa et al., 2019). Therapists are instructed to explain the benefits of this recall by telling clients that “repeatedly retelling your memory will help you organize the memory” (Foa et al., 2019, p. 97; see also Hooyer et al., 2024). Likewise, one of the main goals of cognitive therapy for PTSD is for the client to develop a “coherent narrative account” that “places the series of events during the trauma in context, in sequence, and in the past” (Ehlers et al., 2005, p. 416). To achieve this coherent account, clients might (a) repeatedly recall the memory, (b) write a detailed account of the memory, and/or (c) revisit the site of the trauma.
Likewise, during eye-movement-desensitization-and-reprocessing (EMDR) therapy, in which clients repeatedly recall a traumatic memory while completing lateral eye movements, therapists work to facilitate access to supposedly obstructed memories (F. Shapiro, 1989, 2018). In EMDR protocols for recent traumatic experiences, traumatic memories are assumed to be incoherent, stored “as a fragmented experience that has not yet consolidated” (E. Shapiro, 2009, p. 142).¹ Clients repeatedly recall aspects of the event in chronological order, and therapists work to process each aspect to “facilitate integration and consolidation” (E. Shapiro, 2009, p. 142). Offshoots of EMDR, such as brainspotting, which is said to help with the “opening up” of painful memories (Grand, 2013, p. 113), inherit these same assumptions. Some brainspotting clients are described as having “the veil of dissociation lifted in front of their eyes” as they supposedly begin to remember aspects of the trauma they believed they could not remember because of loss of consciousness (Grand, 2013, p. 117). Yet the scientific literature provides reasons to be concerned about adopting an assumption that memories are somehow obstructed. For example, searching for missing aspects of a memory can lead people to remember things that never happened at all (Hyman & Pentland, 1996; Otgaar et al., 2022; for reviews, see Lindsay & Read, 1994; Otgaar et al., 2019). Therefore, it is important that these assumptions are grounded in solid scientific evidence.
But there is little in the way of good scientific evidence to support the idea that incoherence resolves over successful therapy—or even that traumatic memories are particularly incoherent to begin with. First, this idea begins from the assumption that traumatic memories are incompletely encoded because encoding mechanisms operate differently for traumatic versus nontraumatic events (see e.g., Brewin et al., 1996, 2010; Ehlers & Clark, 2000). This incomplete encoding produces incoherent memories, which exacerbate symptoms of posttraumatic stress (Brewin et al., 1996, 2010; Foa et al., 2006). Indeed, one of several widely held beliefs about traumatic memories, shared by the general public and practicing therapists, is the notion that traumatic memories are special on a number of dimensions (Ost et al., 2017; Patihis et al., 2014; see also Hayne et al., 2006; Lynn et al., 2014; Mangiulli et al., 2021). The research, however, does not support this belief; a growing body of scientific evidence suggests that traumatic memories are, in absolute terms, fairly coherent—even in people experiencing distress (Halligan et al., 2003; Jones et al., 2007; Rubin et al., 2016; Taylor et al., 2020, 2022). More to the point, an equivalence-testing approach reveals that traumatic memories are statistically equivalent on coherence to their nontraumatic counterparts (Lakens, 2017; Taylor et al., 2022). These findings call the idea of incompletely encoded, incoherent traumatic memories into question.
Second, when it comes to therapies said to target coherence, only a handful of published studies have attempted to measure the change in coherence among adults in therapy. What is more, most of these studies feature small samples (Ns = 14–77; Bedard-Gilligan et al., 2017; Foa et al., 1995; Kindt et al., 2007; Mundorf & Paivio, 2011; van Minnen et al., 2002). Across these studies, there was no consistent change in coherence from pretreatment to posttreatment; three found no significant change in coherence (Bedard-Gilligan et al., 2017; Kindt et al., 2007; Mundorf & Paivio, 2011), and the remaining two found patterns that differed depending on the measure (Foa et al., 1995; van Minnen et al., 2002). The relation between coherence and posttraumatic stress was also inconsistent. Sometimes, increases in coherence were associated with reductions in symptoms of posttraumatic stress; other times, the opposite occurred; and still other times, there was no relation. In the most thorough peer-reviewed investigation of the issue to date, there was no evidence to suggest people’s traumatic memories were more coherent after 10 sessions of prolonged-exposure therapy than they were before, even among individuals who showed the greatest reduction in PTSD symptoms (Bedard-Gilligan et al., 2017). These studies support the idea that the well-documented benefits of therapies for PTSD do not arise from greater coherence of people’s traumatic memories. Instead, it seems likely that other mechanisms, such as habituation or changes in people’s beliefs about trauma, are responsible for those benefits (de Kleine et al., 2015; Foa et al., 2006; Nacasch et al., 2015; for a review, see Cooper et al., 2017).
It is tempting to conclude that repeated-recall techniques, a hallmark of approaches such as prolonged-exposure therapy, do not increase coherence. But such a conclusion is premature: Any statement of causality requires a carefully controlled experiment (for a similar point, see Nielsen et al., 2020; Taylor et al., 2022). More specifically, that experiment must permit two key comparisons. First, one needs to compare the coherence of memories that are repeatedly recalled with the coherence of memories that are not (Ebbinghaus, 1885/1913; Meeter et al., 2005). This comparison would account for the “ordinary” loss of information over time, which might mask any coherence-building effects of repeated recall and make it seem as though recall has no effect on coherence. To our knowledge, this comparison is missing from the literature. Second, one needs to compare traumatic memories with a range of nontraumatic memories, controlling for emotional intensity, valence, and importance (Read & Lindsay, 2000; Rubin, Boals, & Berntsen, 2008; Taylor et al., 2020, 2022). These comparisons would determine the extent to which any effects observed in traumatic memories are unique to trauma. Again, to our knowledge, these comparisons are missing from the literature.
Finally, the literature does not consistently define what it means to say traumatic memories are fragmented, jumbled, and recalled in bits and pieces (for reviews, see Crespo & Fernández-Lansac, 2016; O’Kearney & Perrott, 2006). This inconsistency gives rise to different ways of talking about, let alone measuring, incoherence—all of which make it difficult to reconcile disparate findings. For instance, “incoherence” sometimes refers to different self-reported properties of the memory, such as the degree to which the rememberer thinks the memory is (a) complete, (b) recalled in a temporal sequence, or (c) a set of disconnected fragments (Halligan et al., 2003; Rubin et al., 2016; Sachschal et al., 2019). Other times, incoherence refers to different “externally judged” properties gathered from a narrative account of the memory (Foa et al., 1995; Halligan et al., 2003; Rubin et al., 2016); regardless of whether these external properties are assessed by a trained judge or linguistic software, there tends to be significant agreement both within and across measures (Rubin et al., 2016). It therefore seems prudent to combine both approaches, as Rubin et al. (2016) did, to capture myriad aspects of coherence. That is the approach we take here. Given that there is no good evidence for the idea that encoding mechanisms operate differently on traumatic versus nontraumatic memories, it seems reasonable to speculate that whatever effects we see for traumatic memories we would also see for nontraumatic memories (Rubin, Berntsen, & Bohni, 2008).
Here, then, we answer two crucial questions:
Research Question 1: To what extent do people’s traumatic memories become more coherent when repeatedly recalled (with instructions containing many of the key elements of those provided during therapy)—compared with memories that are not repeatedly recalled?
Research Question 2: To what extent is the change in coherence of repeatedly recalled traumatic memories equivalent to that of their nontraumatic counterparts?
The answers to these questions would have implications for how the field is to understand the efficacy of techniques in widespread use for the treatment of PTSD. We conducted a carefully designed experiment to answer these questions with enough precision to detect statistical equivalence in the coherence of traumatic and nontraumatic memories over repeated recall.
To do so, we drew on the well-established trauma-film paradigm (for reviews, see Holmes & Bourne, 2008; James et al., 2016). On the first day of the experiment, each subject watched two films portraying events of the same valence: traumatic, negative, positive, or neutral. These films were drawn from published work in which they were normed on factors that might plausibly affect coherence (Taylor et al., 2022). Two days later, we asked subjects to begin repeatedly recalling one of those events. Instructions for this repeated recall contained many of the key elements of those provided during therapy.² Incorporating these elements plausibly boosted the strength of our repeated-recall manipulation, giving repeated recall the biggest chance of resolving incoherence. Repeated recall took place over the course of three sessions spanning 5 days. We then examined both self-reported and computer-scored coherence, gathering a range of measures to capture a variety of aspects of coherence. We focused on both the overall coherence of the memory and the organization of different parts in the memory (Brewin, 2016; Sachschal et al., 2019). We examined self-reported coherence by asking subjects to report the coherence of their memories for those events at both the beginning and end of the experiment, drawing on four different aspects of coherence. We examined computer-scored coherence to address the possibility that people’s self-reported coherence might be influenced by attributions they make in the moment and therefore might not accurately reflect how coherent the memory actually is (Taylor et al., 2020). To gather these computer-scored measures, we used two computer programs to analyze the memory descriptions subjects provided. Across both the self-report and computer-scored measures, we examined both coherence and incoherence. That is, for some measures, higher scores indicated a more coherent memory, and for others, higher scores indicated a more incoherent memory (Brewin & Field, 2024). We also asked subjects to report the analogue symptoms of posttraumatic stress they experienced about each event over the final 2 days of the experiment to establish the extent to which increased coherence of traumatic memories was associated with fewer analogue symptoms.
Transparency and Openness
Preregistration
Our preregistration is available on the project’s ResearchBox page: https://researchbox.org/966.
Data, materials, code, and online resources
Our data, code, and materials are available on the project’s ResearchBox page: https://researchbox.org/966.
Reporting
We report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study.
Ethical approval
The experiment was approved by the University of Otago Psychology Departmental Ethics Committee, with reciprocal notification to the University of Waikato. The experiment was conducted in accordance with the ethical provisions of the World Medical Association Declaration of Helsinki. Before subjects provided their consent, we warned them they might be asked to watch films containing emotional content, such as a road-traffic accident or stillbirth, and should not participate if they anticipated being adversely affected.
Method
We took an equivalence-testing approach (Lakens, 2017) to assess support for two null hypotheses: (a) People’s memories for the traumatic and various nontraumatic films would be equivalent on coherence, and (b) people’s memories for the traumatic films would show changes in coherence over the course of repeated recall equivalent to those for the various nontraumatic films. In line with this approach, we preregistered (Transparent Changes) a minimum effect size we considered large enough to be meaningful (our smallest effect size of interest: d = 0.40, a shift of approximately half a point on a 7-point Likert-type scale). We determined this effect size by considering (a) the effect sizes typically considered meaningful in the related literature (e.g., Rubin et al., 2016; Taylor et al., 2020), (b) that any upstream difference (say, coherence) would be greater than any downstream difference (e.g., psychopathology; D. Lakens, personal communication, June 27, 2019), and (c) resource constraints. A significant equivalence test supports the conclusion that the observed effect is too small to be meaningful, and observations can therefore be considered equivalent.
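To make the logic of an equivalence test concrete, the following is a minimal sketch of the two-one-sided-tests (TOST) procedure for two independent groups. It is an illustration under stated assumptions, not the analysis code for this study: it uses a normal approximation in place of a t distribution, the function name `tost_two_groups` is hypothetical, and the means and standard deviations in the usage lines are invented for demonstration.

```python
from math import sqrt
from statistics import NormalDist


def tost_two_groups(m1, sd1, n1, m2, sd2, n2, d_bound=0.40, alpha=0.05):
    """Two one-sided tests for equivalence (normal approximation; an assumption).

    Returns (p_lower, p_upper, equivalent). Equivalence is declared only when
    BOTH one-sided tests reject, that is, when the larger p value is below alpha.
    """
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    bound = d_bound * pooled_sd               # equivalence bound in raw units
    se = pooled_sd * sqrt(1 / n1 + 1 / n2)    # standard error of the difference
    diff = m1 - m2
    z = NormalDist()
    p_lower = 1 - z.cdf((diff + bound) / se)  # H0: diff <= -bound
    p_upper = z.cdf((diff - bound) / se)      # H0: diff >= +bound
    return p_lower, p_upper, max(p_lower, p_upper) < alpha


# Hypothetical ratings on a 7-point scale (illustrative values only).
small_gap = tost_two_groups(5.0, 1.25, 136, 5.1, 1.25, 136)
large_gap = tost_two_groups(5.6, 1.25, 136, 5.0, 1.25, 136)
print(small_gap[2], large_gap[2])
```

In the first hypothetical case the observed difference lies well inside the bounds, so both one-sided tests reject and the groups are treated as equivalent; in the second, the difference exceeds the upper bound and equivalence cannot be claimed.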
Subjects
We recruited American and British subjects on Amazon’s Mechanical Turk (MTurk; https://www.mturk.com/) using the CloudResearch platform (Litman et al., 2017). We conducted a pilot study to determine our target sample size, and as a result, we set our target sample size at 544 useable responses (136 per valence of film) after exclusions. This sample size allowed us to detect a smallest effect size of interest between subjects of d = 0.40 with 90% power and α = .05 (Lakens, 2017).
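As a rough check on the figure of 136 per film type, the sketch below applies a common normal-approximation formula for the per-group sample size of an equivalence test, assuming the true effect is zero. It is an illustration rather than the exact calculation behind the reported target, and the function name `n_per_group_equivalence` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_equivalence(d_bound=0.40, alpha=0.05, power=0.90):
    """Per-group n for a two-group TOST, assuming a true effect of zero.

    Normal approximation: n = 2 * (z_{1-alpha} + z_{1-beta/2})^2 / d^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(1 - (1 - power) / 2)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d_bound**2)


n = n_per_group_equivalence()
print(n, 4 * n)  # 136 544
```

Under these assumptions, the formula gives 136 per group, or 544 across the four film types.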
For the experiment proper, we collected data in batches. A total of 1,076 MTurk workers completed the first session of the experiment. We progressively excluded subjects after each session of the experiment, ultimately retaining 539 for analyses (age: M = 40.08 years, SD = 13.09, Mdn = 37; 34% men, 65% women, 1% gender diverse). Of these subjects, 93% reported English was their first language, and 92% reported English was their primary language. For a complete breakdown of attrition and exclusions, see the Supplemental Material available online. Subjects received a total of $3 for participating in all sessions of the study, and payment was split across the sessions.
Design
We used a 4 × 2 mixed design. We manipulated the type of event (traumatic, negative, positive, neutral) between subjects and whether a memory was repeatedly recalled (repeatedly recalled, control) within subjects.
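The structure of this mixed design can be sketched as an assignment routine. This is a simplified illustration, not the survey logic used in the experiment: the film labels `film_a` and `film_b` and the function `assign_subject` are hypothetical, and the sketch draws orders at random rather than enforcing strict counterbalancing.

```python
import random

EVENT_TYPES = ["traumatic", "negative", "positive", "neutral"]
FILMS = ["film_a", "film_b"]  # placeholder labels for the two films of a type


def assign_subject(rng):
    """Assign one subject to a cell of the 4 x 2 mixed design."""
    event_type = rng.choice(EVENT_TYPES)    # between-subjects factor
    viewing_order = rng.sample(FILMS, k=2)  # order of the two films at viewing
    repeated = rng.choice(FILMS)            # within-subjects: repeatedly recalled
    control = FILMS[1 - FILMS.index(repeated)]  # the other film is the control
    return {
        "event_type": event_type,
        "viewing_order": viewing_order,
        "repeated": repeated,
        "control": control,
    }


assignment = assign_subject(random.Random(1))
print(assignment)
```

Every subject contributes one repeatedly recalled memory and one control memory, which is what allows the within-subjects comparison inside each type of event.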
Procedure
As Figure 1 shows, the experiment consisted of four experimental sessions over 7 days. The experiment was run entirely online, and subjects completed each session in their own time using the Qualtrics survey platform. In addition to these four sessions, subjects were invited to complete optional check-in sessions on Days 4 and 6. These check-in sessions were designed to serve as experimental analogues to the homework clients do between sessions of prolonged-exposure therapy (Foa et al., 2019; for a full mapping of aspects of our method to aspects of exposure therapy, see the Supplemental Material).³

Figure 1. Procedure by day and key action.
Day 1
The first session of the experiment comprised three phases.
Phase 1
After subjects provided their consent to participate in the experiment, we asked them to satisfy several environmental conditions while taking part in the study, such as completing the experiment in a quiet place with no distractions. Next, we verified that subjects’ audio was working by asking them to listen to an audio clip directing them to “select the number 9 below” from a variety of numbered options.
Phase 2
Once subjects had completed the audio check, they were randomly assigned to watch films that portrayed either traumatic, negative, positive, or neutral content (Taylor et al., 2022). The negative films were included as counterparts to the traumatic films that also induce negative affect but do not depict death, serious injury, or sexual violence (American Psychiatric Association, 2022). The positive films were included as counterparts to the traumatic films that are also emotionally intense but induce positive affect instead of negative affect (Rubin, Boals, & Berntsen, 2008; Rubin et al., 2016). The neutral films were included as counterparts to the traumatic films that are neither emotionally intense nor valenced (Schaefer et al., 2010). A review of the trauma-film paradigm is beyond the scope of this article—but data gathered using the paradigm suggest that although watching a traumatic film does not meet the Criterion A definition of a “traumatic event,” as defined in the fifth edition, text revision of the Diagnostic and Statistical Manual of Mental Disorders, it does produce a titrated “dose” of symptoms without a high degree of stress (American Psychiatric Association, 2022; for reviews of the trauma-film paradigm, see Holmes & Bourne, 2008; James et al., 2016). Here, we used an existing set of materials that induce positive and negative affect as expected, are equivalent on the extent to which they portray story-like events, and are unfamiliar to the majority of people (Taylor et al., 2021, 2022). 
The traumatic films depicted (a) a graphic car accident and (b) a stillbirth; the negative films depicted (a) a young boy with a facial disfigurement starting at a new school and (b) a teenage boy learning his mother has terminal cancer; the positive films depicted (a) a gold-medal-winning Olympic ice-skating performance and (b) travelers receiving gifts on Christmas at a luggage carousel; and the neutral films depicted (a) a documentary about parquet flooring and (b) instructions about how to use chopsticks. Each film was approximately 3 min long (M = 2 min 56 s, range = 2 min 26 s–3 min 15 s).
We counterbalanced the order in which subjects watched each of the two films of their assigned type. Immediately after watching each film, subjects were shown a screenshot from the film and asked to give the film “a short title (between 3 and 5 words).”
After providing that title, subjects then read a short filler passage of unrelated educational prose before repeating the process for the other film (Butler et al., 2007).
Phase 3
At the end of this session, subjects completed a basic demographics questionnaire, and for exclusion purposes, they (a) answered questions about their compliance with instructions and (b) described the task they were asked to complete during the session. Finally, we reminded everyone we would contact them 2 days later with a link to the second session, which they should complete within 24 hr.
Day 3
Approximately 48 hr later (M = 50.17, SD = 7.19, Mdn = 48.03), subjects completed the second session. This session consisted of three phases.
Phase 1
First, we reminded subjects that 2 days ago they had watched two films. In a counterbalanced order, we then showed them a still image of each film accompanied by their nominated title for the film. Then, in a counterbalanced order, we returned to each combination of “still image and title” and asked subjects to rate the coherence of their memories for each film. We drew on existing literature and used four items to capture a broad range of aspects of coherence (Rubin et al., 2003, 2004; Taylor et al., 2022): (a) the story-like nature of the memory (the story item; “As I remember the event, my memory comes to me as a coherent story”), (b) the incompleteness of the memory (the pieces item; “As I remember the event, my memory comes to me in pieces, with bits missing”), (c) the disconnections between different parts of the memory (the fragments item; “As I remember the event, my memory comes to me as a collection of disconnected fragments”), and (d) the lack of a chronological sequence in the memory (the jumbled item; “As I remember the event, my memory feels jumbled, with parts of the event coming to me out of order”). The items were presented in a random order, and each was rated on a 7-point Likert-type scale from 1 (not at all) to 7 (completely). We created a coherence sum variable by reverse-coding the pieces, fragments, and jumbled items and calculating subjects’ mean rating of the four items (Taylor et al., 2022). These four items produced Cronbach’s αs ranging from .88 to .90.
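The scoring described above can be sketched as follows. This is a minimal illustration, assuming reverse-coding on a 1-to-7 scale is computed as 8 minus the rating; the function names and example ratings are hypothetical, and the Cronbach's α function uses the standard formula rather than any particular statistics package.

```python
from statistics import mean, variance

REVERSED = {"pieces", "fragments", "jumbled"}  # items worded as incoherence


def coherence_score(ratings):
    """Mean of the four items after reverse-coding (8 - rating on a 1-7 scale)."""
    return mean((8 - v) if item in REVERSED else v for item, v in ratings.items())


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of already reverse-coded item scores."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]  # variance of each item
    total_var = variance([sum(r) for r in rows])       # variance of the totals
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings for one memory (illustrative values only).
example = {"story": 6, "pieces": 2, "fragments": 3, "jumbled": 1}
score = coherence_score(example)
print(score)  # 6
```

Higher scores on the resulting variable indicate a more coherent memory, because the three incoherence items are flipped before averaging.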
Phase 2
We then randomly selected one of each subject’s assigned films to be repeatedly recalled and the other to serve as the control memory. Subjects saw the still image and their nominated title for the film they would subsequently repeatedly recall and read the following instructions adapted from guidelines for imaginal-exposure sessions in prolonged-exposure therapy (Foa et al., 2019). These instructions capture the focus on chronological order, present tense, vividness, and detail present in the instructions given to clients during their first session of imaginal exposure: Please take your time and report everything that you can remember about the event portrayed in this video. Start your description at the very beginning of the event, and give a complete description of the entire event. We would like you to describe the event in the present tense, as if it were happening now, right here. It is important that you recall the memory of this event as vividly as possible, and picture the details of the event in your mind’s eye. Your description should contain as much detail as you can manage so that someone else who hasn’t watched the video would be able to imagine the event accurately.
Each sentence of the instruction appeared on a separate page; a recap was included on the final page above a text box in which subjects were asked to write the description of their memory.
We analyzed these descriptions and all descriptions subsequently provided during the experiment using Linguistic Inquiry and Word Count (LIWC; Pennebaker et al., 2015) and Coh-Metrix (Graesser et al., 2004). These computer-scored measures targeted many aspects of coherence, including the nature of links between different parts of the memory, fluency of the description, filler words, concreteness of the language, and continuity of ideas across the narrative. These analyses provided data on 18 computer-scored measures of memory coherence: five from LIWC and 13 from Coh-Metrix.⁴ These measures have been well validated and used extensively to measure coherence in other contexts and recently in the PTSD literature (Rubin et al., 2016). With the exception of two LIWC measures (nonfluencies and filler words), higher scores suggest a more coherent memory. For more information about each of these measures, see the Supplemental Material.
Phase 3
Finally, subjects answered questions about their compliance with instructions and were reminded we would contact them 2 days later with a link to the third session, which they should complete within 24 hr.
Day 5
Approximately 48 hr after finishing the second session (M = 49.42, SD = 5.64, Mdn = 48.10), subjects completed the third session. This session consisted of two phases.
Phase 1
First, subjects saw the still image and their nominated title of their repeatedly recalled memory and again described their memory in a text box. To elicit these descriptions, we used the same instructions used on Day 3 with two small additions in line with added instructions about focusing on sensory details and feelings and thoughts in the guidelines for the second session of imaginal exposure in prolonged-exposure therapy (Foa et al., 2019): “Please focus in detail on what you are seeing, hearing, and feeling as you recall the memory” and “Please include what happened, and what you were feeling and thinking as you watched the video.”
Phase 2
Finally, subjects answered questions about their compliance with instructions and were reminded we would contact them 2 days later with a link to the fourth and final session, which they should complete within 24 hr.
Day 7
Approximately 48 hr after finishing the third session (M = 49.99, SD = 6.87, Mdn = 48.39), subjects completed the fourth and final session. This session consisted of six phases.
Phase 1
First, we reminded subjects that 7 days ago they had watched two different films. We then, in a counterbalanced order, showed them the stills and their nominated titles for each of those films. In the subsequent phases of this session, subjects answered a number of questions about their memories for each of their assigned films. Subjects answered a given set of questions about each film in turn before moving on to the next set of questions. We randomly assigned half of the subjects to answer any given set of questions about their repeatedly recalled memory first and the other half to answer about their control memory first. There was one exception to this counterbalanced rating order, which we explain in Phase 4 below.
Phase 2
Next, we asked subjects to provide a description of their memory for each of their assigned films. To elicit these descriptions, we asked people to “Please take your time and report everything that you can remember about this event, starting from the beginning.” We simplified the instructions in this session, compared with those in the previous sessions, to ensure subjects did not experience too much difficulty providing the description, which might in turn disrupt their subsequent coherence ratings (Taylor et al., 2020).
This description comprised subjects’ first time recalling their control memory but their third time recalling their repeatedly recalled memory. We asked them to describe both memories in line with the possibility that providing the description might temporarily inflate their subsequent coherence ratings, making any increase in coherence difficult to interpret (Taylor et al., 2020). That is, if we had asked subjects to describe only their repeatedly recalled memory in this session, they might have then rated that memory as more coherent than their control memory because of momentary attributions about coherence rather than systematic changes in the actual coherence of the memory.
Phase 3
Subjects then completed a word search puzzle for 5 min before rating the coherence of their memories for each of their two assigned films using the same four coherence items as on Day 3.
Phase 4
We then asked subjects to report the symptoms of posttraumatic stress they had experienced over the last 2 days about each film. To provide these symptom reports, subjects completed the revised Impact of Event Scale (IES-R; Weiss & Marmar, 1997) with instructions adapted for use (a) with nontraumatic events (e.g., “emotional life events” instead of “stressful life events”) and (b) over a 2-day time frame rather than the typical 7 days. We asked all subjects to complete the adapted IES-R for both of their assigned films, even if these were nontraumatic counterpart films, to ensure the procedure was identical for everyone. In past work, the IES-R has demonstrated high internal consistency when used for symptom reports about both traumatic and negative but nontraumatic events (traumatic: α = .96, Creamer et al., 2003; nontraumatic: αs = .93–.94, Sato et al., 2018; Seki et al., 2021). Likewise, here, the IES-R demonstrated high internal consistency for symptom reports about both traumatic and nontraumatic films (traumatic: αs = .88–.92; nontraumatic: αs = .86–.91).
We fixed the order of these symptom reports: Subjects reported the symptoms they had experienced about their repeatedly recalled memory first, followed by their control memory. We did this for two reasons. First, regardless of order, any comparison of symptoms about repeatedly recalled versus control memories would have been difficult to interpret; the Day 3 and 7 descriptions (and check-in sessions) would have served as reminders of the repeatedly recalled memory, potentially affecting the symptoms subjects experienced about that memory. Second, we collected some pilot data that suggested subjects’ reports of symptoms about the first memory they rate anchor their reports about a subsequent memory. Because we were primarily interested in subjects’ symptoms about their repeatedly recalled memory, we fixed the order so that subjects always rated those symptoms first.
Phase 5
Next, to examine the extent to which subjects believed their memories had become more or less coherent over the course of the experiment, we asked them to complete the following statement: “Over the course of this 7-day study, my memory for this event has become . . . (1 = more coherent, 7 = less coherent).” Subjects then rated each memory on three phenomenological aspects: reliving, emotions, and rehearsal. To measure these aspects, we asked subjects to rate each memory on eight items from the Autobiographical Memory Questionnaire (AMQ; Rubin et al., 2003, 2004). These ratings showed no particular consistent pattern with respect to the type of film people viewed. We report the AMQ data in the Supplemental Material.
Phase 6
Finally, subjects answered questions about their compliance with instructions and told us what they thought the study was about before being debriefed.
Results
Of the 617 subjects who completed all four main experimental sessions, we excluded 75 subjects (12%) based on preregistered (Transparent Changes) exclusion criteria. Of these, 56 subjects failed two or more attention checks, 16 reported they completed at least one of the sessions of the study in more than one sitting, and three provided a description of their control memory on Day 7 that either did not make sense or suggested they could not remember the film. In addition to these preregistered (Transparent Changes) criteria, we excluded one subject who described the same memory twice on Day 7, one who did not describe the repeatedly recalled memory on Day 5, and one who did not describe the repeatedly recalled memory on Day 7. We did not preregister these additional criteria because we did not anticipate the scenarios would happen. In total, then, we excluded the data from 78 subjects (13%), retaining 539 for analyses (traumatic: n = 136; negative: n = 135; positive: n = 139; neutral: n = 129).
Coherence over repeated recall
We now turn to our primary research questions: To what extent do people’s traumatic memories become more coherent when repeatedly recalled compared with memories that are not repeatedly recalled? And to what extent is the change in coherence of repeatedly recalled traumatic memories equivalent to that of their nontraumatic counterparts? We addressed these questions using two approaches.
Reported coherence
Day 3 and Day 7 ratings of coherence
Our first approach addressed these questions with subjects’ own ratings of coherence. We created a coherence sum variable using subjects’ responses to the four coherence items on Day 3; we did the same for Day 7. We then classified these ratings according to whether the ratings were about a control or repeatedly recalled memory and again by type of event. We display these data in Figure 2.

Figure 2. Mean coherence ratings of repeatedly recalled and control traumatic and nontraumatic memories. Day 3 ratings of coherence were made before the first session of recall. Error bars represent 95% Cousineau-Morey confidence intervals.
Figure 2 shows at least four interesting patterns. First, focusing on the traumatic memories, we see that people rated their control and repeatedly recalled memories as similarly coherent on Day 3. When we compare the traumatic memories with their nontraumatic counterparts, we see this pattern is consistent across the different types of memories. Put another way, there was no significant interaction between the type of event (traumatic, negative, positive, neutral) and whether that memory was subsequently repeatedly recalled or served as a control on Day 3 coherence ratings, F(3, 535) = 0.94, p = .423. As expected, there was no significant effect of whether a memory was subsequently repeatedly recalled or served as a control, F(1, 537) = 2.37, p = .124.
Second, if we look again at the traumatic memories, we see that on Day 7, people rated their repeatedly recalled memories as more coherent than their control memories. In addition, control memories became less coherent, and repeatedly recalled memories remained largely unchanged. That is, repeatedly recalling a traumatic memory arrested the loss of coherence that would otherwise have occurred over time. This loss of coherence occurred across the board for the control memories. Put another way, there was no significant interaction between the type of event (traumatic, negative, positive, neutral) and whether a memory was repeatedly recalled or served as a control on Day 7 coherence ratings, F(3, 535) = 1.57, p = .196. There was an effect of repeated recall such that people rated their control memory as less coherent on Day 7 than their repeatedly recalled memory; difference: M = −0.92, 95% confidence interval [CI] = [−1.07, −0.78], F(1, 537) = 156.90, p < .001. Moreover, an unplanned analysis shows that the same pattern emerges for each of the four self-reported measures of coherence in our coherence sum variable (see the Supplemental Material).
Third, in absolute terms, Figure 2 shows that traumatic memories—even control memories—were relatively coherent, as were their nontraumatic counterparts. When we conducted equivalence tests on people’s Day 3 coherence ratings (collapsing across the repeatedly recalled and control memories), we found the traumatic and negative memories were equivalent on coherence (difference: M = −0.05, 95% CI = [−0.46, 0.35], lower bound: p = .002, upper bound: p < .001), but the traumatic memories were more coherent than both neutral (difference: M = 1.00, 95% CI = [0.59, 1.41], lower bound: p < .001, upper bound: p = .999) and positive films (difference: M = 0.35, 95% CI = [−0.05, 0.75], lower bound: p < .001, upper bound: p = .165).
For Day 7, we focus first on the repeatedly recalled memories: Traumatic memories were less coherent than negative memories, equivalent to the positive memories, and more coherent than the neutral memories. This pattern was similar for the control memories. More specifically, for repeatedly recalled memories, the mean difference for Traumatic – Negative is −0.45 (95% CI = [−0.87, −0.02], lower bound: p = .374, upper bound: p < .001); mean difference for Traumatic – Positive is 0.10 (95% CI = [−0.32, 0.52], lower bound: p < .001, upper bound: p = .008); and mean difference for Traumatic – Neutral is 0.60 (95% CI = [0.17, 1.04], lower bound: p < .001, upper bound: p = .730). For control memories, the mean difference for Traumatic – Negative is −0.39 (95% CI = [−0.88, 0.11], lower bound: p = .276, upper bound: p < .001); mean difference for Traumatic – Positive is −0.25 (95% CI = [−0.74, 0.24], lower bound: p = .097, upper bound: p < .001); and mean difference for Traumatic – Neutral is 0.47 (95% CI = [−0.03, 0.98], lower bound: p < .001, upper bound: p = .444).
Finally, when we examined the difference between Day 3 and Day 7 coherence ratings, we found that traumatic memories lost more coherence than did negative memories (difference: M = −0.37, 95% CI = [−0.74, 0.01]), positive memories (difference: M = −0.43, 95% CI = [−0.80, −0.05]), and neutral memories (difference: M = −0.46, 95% CI = [−0.85, −0.08]), F(3, 535) = 4.41, p = .004. It is hard to interpret these findings as consistent with the idea that traumatic memories were incoherent given that they were more coherent on Day 7 than their neutral counterparts and statistically equivalent to their positive counterparts.
Reports of memories becoming more coherent or less coherent
Recall that on Day 7, we also asked people to rate the extent to which their memory became more or less coherent over the course of the experiment. To what extent, then, did people think their traumatic memories gained or lost coherence? Figure 3 displays their responses. As Figure 3 shows, ratings of the repeatedly recalled memories clustered around the midpoint of the scale, suggesting people thought the coherence of these memories remained stable. But ratings of the control memories suggested people thought the coherence of memories declined, a pattern consistent with the actual changes in coherence people reported. Put another way, there was no significant interaction between the type of event (traumatic, negative, positive, neutral) and whether that memory was repeatedly recalled or served as a control, F(3, 535) = 0.43, p = .732. People reported their control memories lost more coherence than did their repeatedly recalled memories (difference: M = −1.15, 95% CI = [−1.30, −0.99]), F(1, 537) = 198.81, p < .001. The neutral memories lost more coherence than the negative memories (difference: M = −0.45, 95% CI = [−0.86, −0.04]) and positive memories (difference: M = −0.47, 95% CI = [−0.88, −0.06]) but not the traumatic memories (difference: M = −0.24, 95% CI = [−0.65, 0.17]), F(3, 535) = 4.04, p = .007. People’s ratings of whether their memories became more or less coherent from Day 3 to Day 7, then, were fairly consistent with the changes in coherence they actually reported.

Figure 3. Mean ratings of the extent to which memories became more coherent or less coherent. Ratings were reverse-coded for ease of interpretation. Error bars represent 95% Cousineau-Morey confidence intervals.
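The error bars in these figures are described as Cousineau-Morey confidence intervals, which are intended for within-subject comparisons (here, repeatedly recalled versus control memories rated by the same person). The sketch below illustrates the general technique under stated assumptions: one row per subject, one column per within-subject condition, and a normal quantile standing in for the t quantile, which is a reasonable approximation only for large samples. The function name and data layout are hypothetical and do not come from the study's analysis code.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def cousineau_morey_ci(scores, level=0.95):
    """Return one (mean, lower, upper) tuple per within-subject condition.

    scores: list of rows, one per subject, one value per condition.
    Assumption: a normal quantile approximates the t quantile (large n).
    """
    n, j = len(scores), len(scores[0])
    subject_means = [mean(row) for row in scores]
    grand_mean = mean(subject_means)
    # Cousineau step: remove each subject's own mean, add back the grand mean.
    normed = [
        [x - m + grand_mean for x in row]
        for row, m in zip(scores, subject_means)
    ]
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    correction = j / (j - 1)  # Morey's correction for the number of conditions
    intervals = []
    for col in zip(*normed):
        half_width = crit * sqrt(correction * variance(col) / n)
        centre = mean(col)
        intervals.append((centre, centre - half_width, centre + half_width))
    return intervals
```

Because stable between-subject differences are removed before the variance is computed, the resulting intervals reflect variability in the within-subject effect rather than overall differences between people.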
Computer-scored measures of coherence
For our second approach to addressing our primary research questions, we gathered computer-scored measures of coherence by analyzing subjects’ descriptions of their memories (Rubin et al., 2016). Recall that subjects provided three descriptions of their repeatedly recalled memory—one each on Days 3, 5, and 7—and one description of their control memory on Day 7. To determine the coherence of these memory descriptions, we then analyzed them using LIWC and Coh-Metrix (Graesser et al., 2004; Pennebaker et al., 2015). These analyses resulted in 72 data points for each subject (18 computer-scored measures for each of their four descriptions). We then classified the data according to both the type of event and the description it was gathered from and calculated mean coherence for each of the 18 measures (see Table 1; Tables S4–S6 in the Supplemental Material).
Table 1. Change in Computer-Scored Measures of Coherence in Subjects’ Repeatedly Recalled Memories From Day 3 to Day 7
Note: M difference = traumatic – comparison memory; p represents p value from t test comparing change in measure to zero; maximum p represents highest p value in equivalence test comparing traumatic and comparison memory. We did not correct for multiple comparisons because these equivalence tests were preregistered (Transparent Changes), but applying a correction does not affect the pattern of results in any specific way: Fewer comparisons are equivalent, but the pattern of coherence remains mixed (traumatic memories sometimes gain more, gain less, lose more, or lose less coherence than the nontraumatic memories).
Change significantly different from zero.
Significant equivalence test; change of traumatic and comparison memory can be considered equivalent on the measure of coherence.
Before we calculated the change from Day 3 to Day 7 in each of these computer-scored measures, we first determined the extent to which the traumatic and nontraumatic memories were equivalent on coherence on Day 3. These analyses revealed no consistent pattern of coherence across the 18 computer-scored measures of coherence: Traumatic memories were sometimes more coherent, sometimes less coherent, and sometimes equivalent on coherence to their nontraumatic counterparts. For a full report on these results, see Table S4 in the Supplemental Material.
We then calculated the change in each computer-scored measure of coherence by taking the difference between the scores gathered from the Day 3 and Day 7 descriptions of the repeatedly recalled memories (see Table 1). Positive change scores suggest a description was more coherent on Day 7 than Day 3, and negative change scores suggest the opposite (again except for the nonfluencies and filler-words measures from LIWC). As Table 1 shows, for most measures, the magnitude of change in coherence was not statistically different from zero. More specifically, traumatic memories were essentially unchanged on all 18 measures. Negative memories became more coherent on one and were essentially unchanged on the remaining 17. The positive memories became more coherent on one and less coherent on one and were essentially unchanged on the remaining 16. The neutral memories became more coherent on one and were essentially unchanged on the remaining 17. There was little evidence, then, to support the idea that people’s memories became more coherent over repeated recall. Nonetheless, it is possible that although the changes in coherence of people’s memories tended to be indistinguishable from zero, the degree or direction of changes might have differed between traumatic and nontraumatic memories. To what extent, then, could the change in coherence of the traumatic memories be considered equivalent to that of the nontraumatic memories?
To address this question, we conducted a series of equivalence tests on the change in each computer-scored measure of coherence using a smallest effect size of interest of d = 0.4. We display these data in Table 1. As Table 1 shows, for most measures, the change in coherence of traumatic memories was equivalent to that of the nontraumatic counterparts. First comparing the traumatic and negative memories, we found the change in these memories was equivalent on 14 of the 18 measures. Of the four that were not equivalent, the traumatic memories gained more coherence than the negative counterparts on three and lost more coherence on one. Then, comparing the traumatic and positive memories, we found the change in memories was equivalent on 15 of the 18 measures. Of the three measures that were not equivalent, the traumatic memories gained more coherence than the positive counterparts on two and lost more coherence on one. Finally, comparing the traumatic and neutral memories, we found the change in these memories was equivalent on 15 of the 18 measures. Of the three measures that were not equivalent, the traumatic memories gained more coherence than the neutral counterparts on two and gained less coherence on one.
Taken together, these data suggest repeatedly recalling the memory had little effect on these computer-scored measures of coherence regardless of the type of event recalled. Furthermore, the traumatic and nontraumatic memories showed similar change (or lack thereof) over repeated recall. In the few instances of differences, the traumatic memories tended to gain more coherence than did their nontraumatic counterparts.
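The equivalence tests described above report a p value for a lower bound and for an upper bound, with the larger of the two deciding whether two groups can be considered equivalent. This is the logic of the two one-sided tests (TOST) procedure. The following is a minimal sketch for two independent groups with a smallest effect size of interest of d = 0.4, converted to raw units with the pooled standard deviation. It assumes a large-sample normal approximation in place of the t distribution, and the function name and return format are hypothetical rather than the authors' implementation.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_equivalence(x, y, sesoi_d=0.4, alpha=0.05):
    """Two one-sided tests for equivalence of two independent group means.

    Assumption: normal approximation to the t distribution (large samples).
    """
    n1, n2 = len(x), len(y)
    pooled_sd = sqrt(
        ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    )
    bound = sesoi_d * pooled_sd          # equivalence bound in raw units
    diff = mean(x) - mean(y)
    se = pooled_sd * sqrt(1 / n1 + 1 / n2)
    std_normal = NormalDist()
    # H0 (lower): the true difference is at or below -bound.
    p_lower = 1 - std_normal.cdf((diff + bound) / se)
    # H0 (upper): the true difference is at or above +bound.
    p_upper = std_normal.cdf((diff - bound) / se)
    max_p = max(p_lower, p_upper)
    return {
        "diff": diff,
        "p_lower": p_lower,
        "p_upper": p_upper,
        "max_p": max_p,
        "equivalent": max_p < alpha,
    }


# Made-up ratings: two groups of 140 scores on a 1-7 scale.
group_a = [v for _ in range(20) for v in range(1, 8)]
group_b = [v + 0.05 for v in group_a]
result = tost_equivalence(group_a, group_b)
```

Equivalence is declared only when both one-sided tests reject, which is why a single large p value on either bound (as for several comparisons in the text) is enough to rule it out.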
Symptoms of posttraumatic stress
We also had a secondary interest: to establish the extent to which people who reported more coherent traumatic memories tended to experience less current posttraumatic stress.
Recall the suggestion in the literature that the traumatic memories of people with PTSD are particularly incoherent but that this incoherence resolves over successful therapy (Foa et al., 1995, 2006). To the extent this suggestion is true, symptoms and coherence should vary together such that the greater the increase in coherence, the fewer symptoms of posttraumatic stress would be reported on Day 7. To address this possibility, we turned to the subset of subjects who watched traumatic films (n = 136) and measured their posttraumatic stress by calculating their mean total IES-R score (Weiss & Marmar, 1997) for their repeatedly recalled memory.
Subjects reported a nontrivial degree of posttraumatic stress about their repeatedly recalled memory (M = 12.35, 95% CI = [10.53, 14.18], Mdn = 9). This finding replicates a finding from previous work using these materials (Taylor et al., 2022). People reported a diverse range of symptoms across all subscales to varying degrees (intrusions: M = 0.56, 95% CI = [0.46, 0.66]; avoidance: M = 0.81, 95% CI = [0.69, 0.93]; hyperarousal: M = 0.23, 95% CI = [0.16, 0.30]). There was no evidence to suggest the symptoms people reported about a traumatic film on Day 7 were associated with the change in coherence they reported (r = −.03, 95% CI = [−.20, .14], p = .740). Having said that, there was a restricted range in the change in coherence ratings; therefore, we also examined the relation between coherence ratings on Day 7 and symptoms of posttraumatic stress and found a similar pattern (r = .00, 95% CI = [−.17, .17], p = .984). There was also little evidence to suggest the symptoms people reported about a traumatic film on Day 7 were associated with computer-scored measures of coherence. More specifically, we detected only two significant relations between reported symptoms and computer-scored measures of coherence, both for the nonfluencies measure: A reduction in the use of nonfluencies from Day 3 to Day 7 and fewer nonfluencies in Day 7 descriptions were associated with fewer symptoms of posttraumatic stress (respectively, r = .22, 95% CI = [.05, .38], p = .010; r = .18, 95% CI = [.01, .34], p = .034; for a full report of all correlations, see Table S7 in the Supplemental Material). Note that these analyses were not preregistered (Transparent Changes). We report these correlations with the caveat that correlations tend to stabilize at roughly 260 people (Schönbrodt & Perugini, 2013, 2018). Taken together, these findings do not fit with the idea that the coherence of people’s traumatic memories is associated with their symptoms of posttraumatic stress.
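To give a sense of how wide a confidence interval around a correlation is at this sample size, the sketch below computes an interval with the Fisher z transformation. Whether the authors used this particular method is an assumption, as is the premise that all 136 subjects who watched traumatic films contributed to the correlation. The function name is hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Confidence interval for a Pearson correlation via Fisher's z."""
    z = atanh(r)                 # transform r to an approximately normal scale
    se = 1 / sqrt(n - 3)         # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return tanh(z - crit * se), tanh(z + crit * se)


# Correlation between change in coherence and symptoms, assuming n = 136.
lower, upper = pearson_ci(-0.03, 136)
```

Under these assumptions the interval rounds to [−.20, .14], which is consistent with the interval reported for r = −.03. The width of roughly ±.17 around a near-zero correlation also illustrates the stated caveat about estimating correlations in samples well below 260.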
Discussion
Across 539 people, we found a pattern of data consistent with the idea that when people repeatedly recalled a memory, it prevented the subjective loss of coherence they reported for their control memories. This pattern emerged regardless of whether that repeatedly recalled memory was traumatic or nontraumatic. Moreover, there was no evidence that traumatic memories were, in absolute terms, incoherent. In additional analyses, computer-scored measures of coherence produced a similar pattern. Across 18 computer-scored measures, traumatic memories were a mix: sometimes more coherent, sometimes less coherent, and sometimes equivalent to their nontraumatic counterparts. We found little evidence that traumatic memories behaved differently over repeated recall compared with nontraumatic memories. Finally, there was no evidence of a link between the coherence of people’s traumatic memories and their symptoms of posttraumatic stress. It is nonetheless worth pointing out that coherence is sometimes operationalized in different ways throughout the literature. Although here we have focused on the coherence of a memory itself, coherence can sometimes refer to the extent to which a memory is integrated into one’s life story (see e.g., Berntsen et al., 2003). Without a clear understanding of what the term “coherence” refers to, researchers will have difficulty making theoretical advances and reaching a genuine understanding of coherence’s relation with symptoms of distress (for a similar point, see Crespo & Fernández-Lansac, 2016; O’Kearney & Perrott, 2006; Taylor et al., 2020).
Taken together, our findings are inconsistent with the idea that traumatic memories are incompletely encoded and as a result, incoherent (cf. Brewin et al., 2010; Ehlers & Clark, 2000; Foa et al., 2006). More specifically, when we looked at the traumatic memories in isolation, we found that people rated these memories above the scale midpoint on coherence regardless of whether they were making the ratings on Day 3 or 7. Even the memories we did not prompt people to recall were still subjectively coherent on Day 7. Furthermore, when we compared the traumatic and nontraumatic memories, there was no particular pattern—and certainly no evidence that the traumatic memories were consistently less coherent than their nontraumatic counterparts.
There was also no consistent evidence to suggest the traumatic memories demonstrated a different trajectory of coherence over repeated recall from the nontraumatic memories. More specifically, when we looked at the difference between reported coherence on Days 3 and 7, we found that people who watched traumatic films reported their memories lost more coherence than did people who watched nontraumatic films. When we looked at the extent to which people reported their memories became more or less coherent over the experiment, we found that their reports about traumatic memories did not differ from those about the nontraumatic counterparts. Furthermore, when we examined the computer-scored measures of coherence, we found few differences; when these differences existed, more often than not, they reflected a directionally greater gain in coherence for traumatic memories. Taken together, these findings—in combination with the finding that on Day 7 the traumatic memories were still equivalent to the positive memories and more coherent than the neutral memories—mean we are not inclined to make too much of any one pattern.
Recall that we distinguished between traumatic and negative events by drawing on Criterion A in the diagnostic criteria for PTSD (American Psychiatric Association, 2022); more specifically, the traumatic films depicted both sudden and unexpected death (a car crash and a stillbirth). But still, we found many similarities between the traumatic and negative (and in fact, traumatic and positive) memories. These findings, then, fit a “basic mechanisms” explanation of traumatic memories, in line with prior work (Rubin, Berntsen, & Bohni, 2008; Rubin, Boals, & Berntsen, 2008). That is, our data are consistent with the idea that the coherence of traumatic memories can be explained by basic psychological mechanisms common to all memories, such as availability, rehearsal, emotion, and reconstruction, as opposed to hypothetical mechanisms allegedly unique to traumatic events (see also Rubin et al., 2011; cf. Brewin et al., 1996, 2010; Ehlers & Clark, 2000; Foa et al., 2006). Some might wonder about the extent to which these findings contribute to the ongoing discussions about the utility of Criterion A (Bryant, 2023; Georgescu et al., 2024; Kilpatrick et al., 2009; McNally, 2003, 2009). But even though our data are consistent with a basic-mechanisms explanation of traumatic memories, that should be considered separately from any diagnostic implications regarding Criterion A. Put another way, just because different types of memories are similar on basic psychological principles does not mean they will be similar on subsequent symptoms. Although there are reasons to think Criterion A has shortcomings, we do not think our findings specifically speak to this issue.
We found people’s symptoms of posttraumatic stress did not vary with either coherence on Day 7 or the change in coherence. This finding is puzzling given the emphasis in clinical literature that developing a coherent narrative is associated with a reduction in these kinds of symptoms (Ehlers et al., 2005; Foa et al., 2019; Foa & Rothbaum, 1998). Of course, a critic might argue this experiment was conducted under laboratory settings, which gives rise to at least three differences that might limit generalizability. First, we did not employ trained therapists following therapeutic protocols, although we incorporated these techniques into our experiment as far as possible. Second, the experiment consisted of only three sessions of repeated recall over 5 days compared with the typical six to 16 weekly sessions of repeated recall recommended in guides for prolonged-exposure therapy (Foa et al., 2019). Third, the memories repeatedly recalled were of analogue events that stand apart from the life story as opposed to personally experienced events with implications for the person’s life. Perhaps, then, under these “real-world” conditions, we might have observed an association between coherence and symptoms. A critic might also argue we did not collect data on people’s trauma histories or their responses to past trauma, which limits our ability to generalize to clinical populations. Future research should collect these data. These caveats notwithstanding, our findings nonetheless fit with prior real-world work on clinical populations suggesting that when people with PTSD repeatedly recall their own memories in therapy, coherence does not increase, even among individuals whose symptoms lessen (Bedard-Gilligan et al., 2017; Kindt et al., 2007; Mundorf & Paivio, 2011). In addition, by taking instructions similar to those in therapy and building them into laboratory methods, we now have experimental control. Here, therefore, we can draw conclusions about what happens to different types of memories, otherwise matched on important factors, when they are, or are not, repeatedly recalled.
We did not collect data on subjects’ racial/ethnic identification or their income, education, or socioeconomic status. The absence of these data may limit our ability to generalize the findings. We do, however, know that MTurk provides greater diversity than do other convenience samples, such as college students (Behrend et al., 2011; Weigold & Weigold, 2022). Furthermore, the race, ethnicity, and household income of MTurk workers are fairly representative of the population of the United States; MTurk does, however, overrepresent young males and underrepresent Black Americans and top income earners (Litman & Robinson, 2020; Moss et al., 2020; Nadler et al., 2021). Taken together, then, if we assume our sample is representative of MTurk workers, the demographic makeup of our sample is likely in step with that of the U.S. population. Nonetheless, we cannot verify this proposition and would urge caution in generalizing these findings to populations that vary markedly from those in the United States and the United Kingdom.
What might be happening to the memory over repeated recalls that prevents the sense of lost coherence the memory might otherwise incur? Every session of repeated recall provides an opportunity for the so-called “double-edged sword of memory retrieval” to take hold (Roediger & Abel, 2022). On one “edge,” then, repeated recall might serve to prevent or slow decay of the memory or even facilitate later retrieval (Carrier & Pashler, 1992; Ebbinghaus, 1885/1913; Meeter et al., 2005; Roediger & Karpicke, 2006). But on the other edge, repeated recall also leaves the memory vulnerable to reconstruction, making it possible for erroneous details to become consolidated into the memory (Alberini & LeDoux, 2013; Loftus et al., 1978; Roediger et al., 1996). Repeated recall, then, might simultaneously prevent certain details from decaying while promoting reconstruction of other details.
This tendency toward reconstruction is specifically exploited in imagery-rescripting therapy, a treatment in which clients are encouraged to reconstruct the memory of their trauma but with a more positive outcome (Hackmann, 1998). Incorporating this additional, more positive material into the memory is seen as a way to update intrusive traumatic images, in turn helping to resolve symptoms of posttraumatic stress. But might this “rescripting” be happening unintentionally in therapies in which clients repeatedly recall their memories? After all, when people repeatedly imagine an event, they can develop the illusion of remembering more (Garry & Wade, 2005; Mazzoni & Memon, 2003). Moreover, recent evidence suggests people who complete lateral eye movements while recalling traumatic memories (as happens in EMDR) are more prone to incorporating inaccurate information into their memories than people who do not complete these eye movements (Houben et al., 2018, 2020; but see also, Calvillo & Emami, 2019; Kenchel et al., 2022; van Schie & Leer, 2019). It seems possible, then, that people might display “imagination inflation” over the course of repeated recall in therapy, manufacturing thoughts, images, and feelings that make sense (Garry et al., 1996; Garry & Wade, 2005; Li et al., 2020; Mazzoni & Memon, 2003). It also seems plausible that when people come to therapy for the first time, their memories are not particularly incoherent, but they are asked a number of questions that then produce a feeling of incoherence, which is then resolved by repeated recall (Taylor et al., 2020). Further research should address these important issues.
Our findings, in conjunction with related research, have implications for the development and delivery of therapies for PTSD. More specifically, in some forms of therapy, therapists are encouraged to explain the benefits of repeated recall by telling clients it will boost the coherence of their memory and, in turn, reduce symptoms (Ehlers et al., 2005; Foa et al., 2019; Foa & Rothbaum, 1998; see also Hooyer et al., 2024). But the recent literature suggests traumatic memories are coherent from the beginning and stay coherent over repeated recalls and that their coherence is unrelated to recovery (Bedard-Gilligan et al., 2017; Rubin et al., 2016; Taylor et al., 2020, 2022). Although we do not contest the efficacy of imaginal exposure in treating PTSD, it seems prudent to revisit explanations that tie its efficacy to coherence.
Supplemental Material
Supplemental material, sj-docx-1-cpx-10.1177_21677026241306309 for The Coherence of Analogue Traumatic and Nontraumatic Memories Over Repeated Recall by Andrea Taylor, Melanie K. T. Takarangi, Rachel Zajac and Maryanne Garry in Clinical Psychological Science
Footnotes
Transparency
Action Editor: Kelsie T. Forbush
Editor: Jennifer L. Tackett
