Abstract
Wilson, Mickes, Stolarz-Fantino, Evrard, and Fantino (2015) presented data from three well-powered experiments suggesting that a brief mindfulness induction can increase false-memory susceptibility. However, we had concerns about some of the methodology, including whether mind wandering is the best control condition for brief mindfulness inductions. Here, we report the findings from a preregistered double-blind randomized controlled trial designed to replicate and extend Wilson et al.’s findings. Participants (
Wilson, Mickes, Stolarz-Fantino, Evrard, and Fantino (2015) reported that a brief mindfulness induction increased false-memory susceptibility. Media reports soon circulated of “how mindfulness plays havoc with memory” (Knapton, 2015). However, close reading of the research and of two subsequently published research articles—one supporting Wilson et al. and one contradicting them—suggests that this conclusion may be premature.
The Deese-Roediger-McDermott (DRM) paradigm (Deese, 1959; Roediger & McDermott, 1995) is an effective method for eliciting false memories in the laboratory. Participants are presented with lists of words (e.g.,
Wilson et al. (2015) used the DRM paradigm to explore the effect of mindfulness and mind-wandering inductions on false recall. In Experiment 1, they used the relevant mind induction before showing participants words related to the nonpresented critical lure
In Experiment 2, Wilson et al. (2015) explored the possibility that mindfulness increases false recall rather than that mind wandering reduces it. They presented participants with DRM lists before and after the mind inductions and compared pre- and postinduction recall performance. In the mind-wandering condition, false recall was the same before and after induction, but in the mindfulness condition, false recall increased after induction. However, there was again no baseline control condition, so it is difficult to know whether mindfulness and mind wandering increase or reduce false memories relative to no induction, especially given two other minor methodological concerns: (a) The pre- and postmanipulation lists were not counterbalanced, and (b) the backward associative strength (BAS) of the sets was not matched. BAS—defined as “the average tendency for words in the study list to elicit the critical item on a free association test” (Roediger, Watson, McDermott, & Gallo, 2001; p. 387)—is a key predictor of false recall. In Wilson et al.’s Experiment 2, the BAS of the preinduction lists was higher (range = 0.100–0.353,
Using mind wandering as a control condition for mindfulness would be less problematic if we knew what the effect of mind wandering on false recall should be. To our knowledge, no previous work has addressed this. Mrazek, Franklin, Phillips, Baird, and Schooler (2013) define mind wandering as “a shift of attention from a task to unrelated concerns” (p. 776). Most theories of false memory posit that some encoding of the list items needs to take place for false memories to be facilitated—for example, via spreading activation (activation–monitoring account; e.g., Roediger et al., 2001) or gist extraction (fuzzy-trace theory; Reyna & Brainerd, 1995). According to these theories—and previous findings—the fewer items to be encoded, the fewer false memories created (e.g., Robinson & Roediger, 1997). If mind wandering is effectively induced so that participants are shifting their attention from the task (encoding the list items) to unrelated concerns, these theories might reasonably predict a drop in false memories rather than an increase. Alternatively, response bias—for example, when participants believe that they need to provide a certain number of answers but are unable to accurately recall or recognize enough of them—might result in increased false memories.
It is not clear that mind wandering was successfully induced in Wilson et al.’s (2015) study. If participants had shifted their attention away from the task, why was there no difference in their performance on correct recall of presented items between the mind-wandering and mindfulness conditions in either experiment? Risko, Anderson, Sarwal, Engelhardt and Kingstone (2012) provided evidence that correct memory should be impaired by mind wandering by observing that as mind wandering increased during a lecture, memory for the lecture material decreased.
A final methodological observation is that there was no measure of whether the mindfulness and mind-wandering inductions actually induced different mental states in participants. Several scales have been developed to measure either state or dispositional mindfulness, and although it would be nice to believe that an experimental induction works, one cannot assume that it does. Indeed, Brown and Ryan (2003) observed that both dispositional and state mindfulness vary across participants.
Since 2015, two further articles have been published exploring the impact of a brief mindfulness manipulation using the DRM paradigm. Experiment 2 by Rosenstreich (2016) used brief (30-min) mindfulness and mind-wandering manipulations and found that both correct and false recognitions increased in the mindfulness condition, supporting Wilson et al.’s (2015) findings. However, only 40 participants took part in this between-participants study, there was no baseline condition, and the effectiveness of the mindfulness manipulation was not measured. Baranski and Was (2017) explored the effects of 15-min mindfulness and mind-wandering manipulations and warning versus no-warning instructions on false memories. Following Wilson et al.’s method in Experiment 2, Baranski and Was had participants in their second experiment (a) study six DRM lists, each followed by free recall; (b) receive the induction; and (c) then study six more lists with free recall. They used three inductions: mindfulness, mind wandering, and puzzle completion. There was no difference in the amount of false recall between the conditions, and in all conditions, false recall declined after the manipulation. Exploratory analyses suggested that this decline was greatest in the mindfulness condition. These findings thus conflict with Wilson et al.’s; however, Baranski and Was also did not measure whether mindfulness was induced.
In light of the methodological issues identified in Wilson et al. (2015), we preregistered a study to replicate and extend Wilson et al.’s Experiment 1. Specifically, we (a) evaluated participant mindfulness before and after induction; (b) evaluated mind wandering after induction; (c) included mindfulness, mind-wandering, and join-the-dot conditions; (d) measured participants’ performance on 12 DRM word lists—counterbalanced for BAS—rather than on a single list; and (e) measured both free-recall and recognition performance.
This experiment was conducted as a double-blind randomized controlled trial (Gilder & Heerey, 2018). Our hypotheses were as follows:
In the free-recall task, correct recall should be highest in the mindfulness condition and lowest in the mind-wandering condition. False recall should be highest in the mind-wandering condition and lowest in the mindfulness condition.
In the recognition task, (a) correct recognition should be highest in the mindfulness condition and lowest in the mind-wandering condition, (b) false recognition should be highest in the mind-wandering condition and lowest in the mindfulness condition, and (c) filler recognition should be highest in the mind-wandering condition and lowest in the mindfulness condition.
On the recognition task, (a) “remember” responses for correct recognition should be highest in the mindfulness condition and lowest in the mind-wandering condition, (b) “remember” responses for false recognition should be highest in the no-manipulation condition and lowest in the mindfulness condition, and (c) “remember” responses for filler recognition should be highest in the no-manipulation condition and lowest in the mindfulness condition. Further, (a) “know” responses for correct recognition should be highest in the mindfulness condition and lowest in the mind-wandering condition, (b) “know” responses for false recognition should be highest in the mind-wandering condition and lowest in the mindfulness condition, and (c) “know” responses for filler recognition should be highest in the mind-wandering condition and lowest in the mindfulness condition.
Method
Participants
Our target sample size was informed by the effect size from Wilson et al. (2015). The effect size (Cohen’s
A total of 302 participants were recruited through Keele University’s School of Psychology research-participation scheme, through social media, and through fliers posted on the Keele University campus. Participants either received course credit or were paid £7 for participating. Participants were at least 18 years of age (
Following our preregistered exclusion plan, we removed data from 15 participants: 6 were nonnative English speakers (violating one of our eligibility criteria); 2 overran the 1-hr time slot that the participants were booked for, so their participation had to be terminated early; 4 did not complete the experiment; 1 did not provide any free-recall data; and 2 completed free recall only after some of the lists, thus violating the stopping rule that required complete sets of data. This left us with 287 participants. A sensitivity analysis (see the Supplemental Material available online) showed that our final sample size gave us 80% power to detect an effect size (
Materials and procedure
The study received ethical approval from the Keele University Ethical Review Panel on May 10, 2017 (Document Number ERP1331). Participants were tested individually in a lab with an experimenter present to ensure full participation (e.g., that the participants were not using their mobile phones). The experimenter was blind to the experimental condition of each participant, as random assignment to condition was done using Qualtrics software (https://www.qualtrics.com/). After providing informed consent, participants completed the State Mindfulness Scale (SMS; Tanay & Bernstein, 2013), which consists of two subscales—a 15-item state mindfulness-of-mind scale (SMS mind) and a 6-item state mindfulness-of-body scale (SMS body). The 21 items were presented in random order. Participants then completed the relevant mindfulness, mind-wandering, or control-condition activity.
Wilson et al. (2015) used 15-min mindfulness and mind-wandering inductions recorded by Marilee Bresciani Ludvik at the Rushing to Yoga Foundation. When we requested these recordings from Wilson, he informed us that since the principal investigator, Edmund Fantino, had recently passed away, the precise recordings were not available (Wilson, personal communication, March 3, 2016). He provided similar recordings by the same person, so we used these instead, with participants listening to them via headphones. In the control condition, participants were asked to complete paper-based join-the-dot puzzles for 15 min (this task was identified by Friese, Messner, & Schaffner, 2012, as being “neither boring nor resource demanding”; p. 1019).
Participants then completed the SMS items in a different random order and the Retrospective Mind-Wandering Scale (the Thinking Content component of the Dundee Stress State Questionnaire; Matthews et al., 2013), which consists of an eight-item task-related-interference (TRI) scale and an eight-item task-unrelated-thought (TUT) scale.
Eighteen lists of 15 words were selected from Roediger et al. (2001). Each participant saw 12 of the 18 lists, and the lists were counterbalanced by dividing them into three sets. The lists were chosen and counterbalanced on the basis of the two factors that predict false recall—BAS and veridical recall—and also on the norms for false recall and false recognition for each list. The 15 words per list were presented individually in the Qualtrics default black font, size 36, for 1.5 s in the middle of the screen. After each list was presented, participants were given 3 min to type as many words as they could remember from the list. Once this was repeated for all 12 lists, participants completed a remember/know/guess recognition test. The recognition test consisted of 72 items: 36 presented items (3 from each list), 12 critical lure items (1 from each list), and 24 filler items (3 list items and 1 lure item from the 6 nonpresented lists, which were counterbalanced across participants). For each item, participants had to identify whether it was old or new, and then for those items identified as old, they had to select from among “remember,” “know,” or “guess” responses. A “remember” response indicated that a participant was able to consciously recollect that the item appeared in the original list, whereas a “know” response indicated that the recognition was based on a feeling of familiarity for the item in that context. The definitions (adapted from Dewhurst & Anderson, 1999) were provided in the instructions and again every time participants had to make a selection.
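As an illustration only, the composition of the recognition test described above can be sketched in Python. The list numbering (0 to 17), the way lists are grouped into three sets of six, and all function and label names are hypothetical; the actual assignment of lists to sets followed the BAS and recall norms and is not reproduced here.

```python
N_LISTS = 18          # DRM lists selected from Roediger et al. (2001)
N_SETS = 3            # lists were divided into three counterbalancing sets
STUDIED_PER_LIST = 3  # presented items tested from each studied list
FILLERS_PER_LIST = 3  # list items tested from each nonpresented list


def build_recognition_test(unstudied_set):
    """Return (item_type, list_id) pairs for one participant.

    Assumption: lists are grouped into consecutive sets of six, and the
    set indexed by `unstudied_set` is the one the participant did not study.
    """
    per_set = N_LISTS // N_SETS
    sets = [list(range(s * per_set, (s + 1) * per_set)) for s in range(N_SETS)]
    items = []
    for set_index, lists in enumerate(sets):
        for list_id in lists:
            if set_index == unstudied_set:
                # Nonpresented list: 3 list items plus its lure serve as fillers.
                items += [("filler_item", list_id)] * FILLERS_PER_LIST
                items.append(("filler_lure", list_id))
            else:
                # Studied list: 3 presented items plus the critical lure.
                items += [("studied_item", list_id)] * STUDIED_PER_LIST
                items.append(("critical_lure", list_id))
    return items
```

Under these assumptions, each participant's test contains 12 × 3 = 36 presented items, 12 critical lures, and 6 × (3 + 1) = 24 fillers, for 72 items in total.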
Results
We report standard null-hypothesis significance tests (NHSTs) together with Bayesian analyses (in the form of Bayes factors). The analysis consisted of a series of one-way between-participants ANOVAs. For all tests, the independent categorical variable was state of mind (three levels: mindfulness vs. mind wandering vs. join-the-dots).
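For readers unfamiliar with the omnibus test, a minimal Python sketch of a one-way between-participants ANOVA follows. It is not the analysis code used for this study; the function name and return layout are assumptions, and it computes only the F ratio, its degrees of freedom, and η², leaving the p value to a statistics package.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of score lists.

    Each inner list holds the scores of one condition (e.g., mindfulness,
    mind wandering, join-the-dots).
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of condition means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability of individual scores around their own condition mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared
```

With three conditions, the between-groups degrees of freedom are 2 and the within-groups degrees of freedom are the total sample size minus 3.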
For the NHSTs, omnibus ANOVAs were followed by Tukey’s honestly significant difference (HSD) pairwise comparisons with the criterion for significance (α) set at .05. Although we present all of these tests for completeness, we interpret the HSD tests only when there was a significant omnibus ANOVA. For the Bayesian analyses, we used default Bayes factor tests for ANOVA designs (Rouder, Morey, Verhagen, Swagman, & Wagenmakers, 2017) using the
We conducted three model comparisons using Bayes factors. The first model comparison was a null model (i.e., all three levels of the design are equal) against a full model (i.e., all three levels are not equal), denoted by BFnull–full. This comparison allowed quantification of the degree of support for a model showing some effect versus a model showing no effect. Another model—the order-restricted model—was then constructed. In contrast to the full (unrestricted) model, in which all levels of the design are assumed to be different, order-restricted models test whether the data fit a predicted ordering of the factor-level effects (e.g., the mindfulness score is greater than the join-the-dots score, which in turn is greater than the mind-wandering score). In a second model comparison, then, this order-restricted model was compared with the full (unrestricted) model; this comparison—denoted by BFrestricted–full—allowed quantification of the degree of support for a model showing a specifically ordered (and predicted) effect versus a model showing some (unrestricted) effect. Thus, in the presence of an effect, this model comparison allowed us to test whether the ordering of the factor levels matched our preregistered hypotheses. The third model comparison—BFrestricted–null—compared the order-restricted model with the null model. This model comparison allowed us to compare a null model with a model capturing our preregistered hypotheses. We followed the recommendations set out by Morey (2015) for testing the order-restricted models using the Bayes factor package.
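The logic of the three comparisons can be sketched in Python. This is an illustration of one common way to evaluate order restrictions (comparing the posterior and prior probabilities of the predicted ordering under the full model), which is assumed here to be the approach the cited recommendations describe; it is not the Bayes factor package's own code, and the function names and sample format are hypothetical.

```python
from math import factorial


def order_restricted_bf(posterior_samples, predicted_order):
    """Bayes factor for the order-restricted model against the full model.

    posterior_samples: list of dicts mapping condition name -> sampled effect,
        drawn from the posterior of the full (unrestricted) model.
    predicted_order: condition names from largest to smallest predicted effect.
    """
    pairs = list(zip(predicted_order, predicted_order[1:]))
    consistent = sum(
        all(sample[a] > sample[b] for a, b in pairs)
        for sample in posterior_samples
    )
    posterior_prob = consistent / len(posterior_samples)
    # Assumption: the prior treats conditions as exchangeable, so each of the
    # k! orderings has prior probability 1 / k! (1/6 for three conditions).
    n_orderings = factorial(len(predicted_order))
    return posterior_prob * n_orderings


def restricted_vs_null(bf_restricted_full, bf_null_full):
    """Chain two Bayes factors that share the full model as the reference."""
    return bf_restricted_full / bf_null_full
```

Because both BFrestricted–full and BFnull–full are expressed against the same full model, dividing the first by the second gives BFrestricted–null, and its reciprocal gives BFnull–restricted.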
Note that with a Bayes factor for model comparison between model
Manipulation checks
Before turning to the main analysis, we wanted to ascertain that our manipulations of mindfulness and mind wandering worked by assessing their impact on SMS, TRI, and TUT scores. For the SMS scale and its components, we used difference scores (postmanipulation score minus premanipulation score) as the dependent variable. The descriptive statistics for the manipulation checks are in Table 1. The top left panel of Figure 1 shows standardized effect sizes of between-condition comparisons for all scales.
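The two quantities used in the manipulation checks can be sketched in Python as follows. The function names are hypothetical, and the pooled-standard-deviation form of Cohen's d is an assumption, since the exact variant used for Figure 1 is not stated in this section.

```python
from statistics import mean, variance


def difference_scores(pre, post):
    """Postmanipulation minus premanipulation score for each participant."""
    return [after - before for before, after in zip(pre, post)]


def cohens_d(group_a, group_b):
    """Standardized mean difference between two independent conditions.

    Assumption: the standardizer is the pooled sample standard deviation.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5
```

A positive difference score indicates that the scale score rose after the induction, and a positive d indicates that the first condition scored higher than the second.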
Table 1. Mean Values for the Manipulation Checks in All Three Conditions
Note: Standard errors of the mean are given in parentheses. The total State Mindfulness Scale (SMS) consists of 21 items scored on a 5-point scale; scores were also calculated for the two subscales: state mindfulness of mind (15 items) and state mindfulness of body (6 items). Means for the SMS measures are difference scores (postmanipulation score minus premanipulation score). The task-related-interference scale consists of 8 items scored on a 5-point scale, and the task-unrelated-thought scale consists of 8 items scored on a 5-point scale.

Standardized effect-size estimates (Cohen’s
For SMS total score (i.e., SMS mind and SMS body scores combined), there was an effect of state of mind,
A similar pattern of results was found for the SMS mind scale,
For the TRI questionnaire, there was a significant effect of state of mind,
Effects on memory
In this section, we present the results on the effects of state of mind on correct and false memory for both recognition and recall data. The descriptive statistics for all tests are shown in Table 2. For ease of exposition, we present the ANOVA results for all tests in Table 3 and the results of the Bayesian model comparisons in Table 4. Plots of standardized effect sizes for all between-condition comparisons can be seen in Figure 1.
Table 2. Mean Proportions for All Dependent Variables in All Three Conditions
Note: Standard errors are given in parentheses. Correct recognition and correct recall refer to the correct recognition and recall of list items. False recognition and false recall refer to the false recognition and recall of critical lures.
Table 3. Results of Null-Hypothesis Significance Tests of the Effect of State of Mind on Measures of Memory
Note: Values in brackets are 95% confidence intervals for the η2 effect-size estimates. Post hoc comparisons are interpretable only in the event of a significant omnibus analysis-of-variance (ANOVA) result. Correct recognition refers to the correct recognition of list items. False recognition refers to the false recognition of critical lures. CI = confidence interval.
Numbers in these columns are Tukey’s honestly significant difference (HSD)
Table 4. Bayes Factors for All Model Comparisons for the Different Memory Measures
Note: In a model comparison of model
For the recognition data, the NHST analysis showed no significant effects of state of mind on any of the measures of correct or false memories (lowest
For the free-recall data, a similar picture emerged. The NHST analysis showed no significant effect of state of mind for either correct or false recall. The Bayesian model comparison BFnull–full provided moderate support for the null model in both cases. For the order-restricted tests, the correct-recall data were better predicted by the null model but only at anecdotal levels, BFnull–restricted = 2.34. For the false-recall data, the null model was a much better predictor than the restricted model, BFnull–restricted = 15.62, which is strong evidence in favor of the null model.
Discussion
In summary, the state-of-mind inductions worked: The mindfulness induction induced mindfulness, and the mind-wandering induction induced mind wandering. However, there was no evidence of a difference in the levels of either correct or false memory for recall or recognition among the mindfulness, mind-wandering, or join-the-dots conditions. Thus, none of our hypotheses were supported; furthermore, neither were the previous findings by Wilson et al. (2015, Experiment 1) or Rosenstreich (2016, Experiment 2), who found that mindfulness increased false memories. Instead, our findings are consistent with those of Baranski and Was (2017), who also found no evidence for a difference in either true or false recall or recognition-memory performance between mindfulness and mind-wandering conditions in their first experiment or in false recall between the conditions in their second experiment, which included a join-the-dots condition. One explanation for the discrepant findings across the five experiments is that the brief inductions used (all 15 min in length, except for Rosenstreich’s Experiment 2, which was 30 min) were not sufficient to consistently induce the relevant state of mind to last throughout the subsequent tasks. Possibly the inductions are not long enough, or one-off brief mindfulness manipulations—unless used with experienced meditators—may just not increase mindfulness. Furthermore, we used a double-blind procedure in which the experimenters did not know which condition participants were assigned to. Gilder and Heerey (2018) demonstrated the impact of a non-double-blind procedure on performance. It is unclear whether Wilson et al. used such a procedure in their Experiment 1 (they did in Experiment 2), whether Baranski and Was (2017) or Rosenstreich (2016) did, and whether such variance in procedures had any impact.
One key methodological difference between our study and previous research was our use of manipulation checks to ensure that the manipulations induced the states of mind they were claimed to induce. Despite the effectiveness of the inductions, there are additional findings to consider. First, not only did the mindfulness manipulation induce mindfulness, but so did the mind-wandering manipulation, with higher postinduction scores for the total SMS and the SMS mind subscale. The join-the-dots condition, by comparison, did not induce mindfulness on any of the scales. Second, the mindfulness induction also induced mind wandering, leading to higher postinduction scores for both TRI and TUT. This is perhaps not surprising given that this brief mindfulness induction was likely the first exposure to mindfulness for many participants, and mind wandering is more prevalent in novice meditators (Lutz, Slagter, Dunne, & Davidson, 2008). The join-the-dots condition also induced mind wandering, although only on the TRI and not on the TUT scale. Third, there was no significant difference between the mindfulness and mind-wandering inductions on the total SMS, the SMS mind subscale, or the TRI scale. Previous research contrasting mindfulness and mind wandering (e.g., Mrazek, Smallwood, & Schooler, 2012) has focused on dispositional mindfulness, whereas brief mindfulness inductions induce state mindfulness. Mind wandering might therefore not be the appropriate control condition for state-mindfulness studies given the likely use of novice meditators. Join the dots, or a similar activity, might instead differentiate more clearly between the components at play during state mindfulness induced through a brief induction, because join the dots increased mind wandering but not mindfulness.
There are two potential problems with measuring states of mind: First, it is possible that demand characteristics distort the measurements; second, it is possible that the manipulation may wear off by the time the questionnaires are completed and the DRM lists are presented. However, as outlined in the introduction, it is not very satisfactory to assume that brief manipulations are sufficient to induce mindfulness or mind wandering, and we would further posit that there is not yet sufficient evidence to indicate that brief mindfulness and mind-wandering instructions do activate different states of mind. Further research is needed to address the longevity and nature of the states of mind induced by brief manipulations.
To conclude, more research is needed into the best control condition to use for state-mindfulness research. Our results are consistent with those of Baranski and Was (2017), showing no evidence for a difference in false-memory susceptibility among mindfulness, mind-wandering, and join-the-dots conditions. This suggests that it is too soon to say that “mindfulness plays havoc with memory” (Knapton, 2015).
Supplemental Material
Supplemental material, Sherman_OpenPracticesDisclosure_rev for Exploring the Impact of Mindfulness on False-Memory Susceptibility by Susan M. Sherman and James A. Grange in Psychological Science
Supplemental material, Sherman_Supplemental_Material_rev for Exploring the Impact of Mindfulness on False-Memory Susceptibility by Susan M. Sherman and James A. Grange in Psychological Science
Footnotes
Acknowledgements
Testing and data collection were performed by our research assistants L. James, H. Gilman, and C. Bagnall.
Transparency
Both authors contributed to the study design and preregistration protocol. S. M. Sherman prepared the data for analysis. J. A. Grange analyzed the data, and S. M. Sherman and J. A. Grange interpreted the data. Both authors drafted the manuscript and approved the final version for submission.
