Abstract
Remembered experienced events are associated with memory qualities, such as clarity and visual details. Recent findings by Gander and Lowe suggest that reading about negative fictional, as opposed to factual, events results in memories with higher clarity. Here, we attempted to replicate these previous findings. We consider the fiction-simulation hypothesis as an explanation: engaging with fiction involves a higher degree of mental simulation and through increased imagery leads to memories with higher clarity. In two preregistered studies (N = 131 and N = 254) we labelled stories as either fact or fiction and measured participants’ experienced memory qualities and mental simulation. The results indicated that the earlier finding was not replicated and no differences in memory qualities or mental simulation were found comparing fact and fiction. The results align with previous research which has not found differences in memory qualities between fact and fiction, and we conclude that the finding from the original study does not hold.
Do memories of events created while reading differ in their experienced qualities, such as clarity, sensory details or emotional intensity, depending on whether the events are factual or fictional? An example is remembering a political protest by an environmental activist from a novel compared to remembering a similar protest from a news account–given that these events differ only on the fictional status, as believed by the one who remembers. If such differences in memory qualities between factual and fictional events exist, how could they be explained?
Phenomenological memory qualities, such as visual imagery of remembered events, have been studied extensively in relation to autobiographical memory (D'Argembeau et al., 2003; Gehrt et al., 2022) and event memory in general (Marsh & Yang, 2020; Rubin & Umanath, 2015). The reality monitoring framework (Johnson & Raye, 1981) and the source monitoring framework (Johnson et al., 1993) propose that memory qualities in the form of, for example, perceptual and emotional detail in the memory trace play an operative role in distinguishing real from imagined events. For instance, a memory of imagining turning off the stove versus actually turning off the stove would differ in that the latter would include higher degrees of visual and kinaesthetic elements. Further, in episodic future thought (Atance & O’Neill, 2001), memories of thoughts about the future have been found to also possess memory qualities, but less vivid compared to memory for autobiographical events (Morton & MacLeod, 2023).
In contrast, the factor of fictionality, whether information is real or fictional, has not been much studied in relation to memory qualities. Gander et al. (2023a) referred to fictionality in memory as: memories formed as a result of a creative act of imagining some states of affairs, events, places, characters, or objects, with either internal or external origin, which are believed by the person remembering to be decoupled from the real world. Decoupling means that a piece of information is not intended to be evaluated against the real world and that real-world truth conditions and claims of existence are irrelevant (pp. 2–3, emphasis in original).
Following this definition, memory of fictional information is different from false memories (Loftus & Pickrell, 1995), memory of false information, and memory of disinformation (Pennycook & Rand, 2021) as these are all coupled with real-world truth conditions.
Furthermore, the relevance of fictionality extends beyond literature and reading research. Fictionality is a significant area of study since it is ubiquitous in many activities and situations in people's lives (Abraham, 2022; Gander et al., 2023b). Contexts that shape memories of fictional information cut across types of media, such as novels, films, and computer games, as well as non-mediated activities such as pretend play (Weisberg, 2015) and role playing (Kapitány et al., 2022; Seppänen et al., 2021). Considering memory of fictional information has theoretical implications for the conceptualization of human memory (Gander et al., 2023a; Marsh & Yang, 2020; Rubin & Umanath, 2015; Rubin, 2022; Yang et al., 2022).
Considering the central role of memory qualities for the cognitive system in reality monitoring, that is, when differentiating memories of internal versus external events, one question is whether memory qualities play a similar role differentiating memories of factual versus fictional events.
A few studies have investigated fiction and memory, with differing approaches and isolating the factor of fictionality to different extents. Gordon et al. (2009) compared memory qualities from text-based media, screen-based media, imagination, and autobiographical memories. They found that memories created from text are more similar to imagination, while memories created from watching a screen are more similar to autobiographical memories. However, the authors did not control for the fictionality of the memories and thus did not have enough data to compare memories of factual and fictional events. Yang et al. (2022) compared autobiographical memories to memories of fiction from novels, movies, and television series. Compared to autobiographical memories, the qualities of memories of fiction were found to be similar, although generally less intense. However, fictionality was not controlled and was confounded with media, so that all memories of fiction came from mediated sources (text- and screen-based), while the real events were from lived experience, that is, perceived directly without any media. For this reason, it is not clear whether the resulting memory qualities are a consequence of the memories being of fictional events or of the memories being formed from mediated sources.
One paradigm that addresses these possible confounds is the manipulation of fictionality by labelling otherwise identical stimuli as either factual or fictional. Altmann et al. (2014) labelled several short stories as either factual or fictional and studied the processing of these stories using neuroimaging methods. They found that when people believed they were reading fact, there was activation in areas such as the anterior cingulate cortex, which they interpreted as action-based reconstruction of the events depicted in a story. In contrast, believing the events were fictional resulted in activation in areas suggesting a constructive simulation of what might have happened in the story. These results also support the labelling paradigm as a viable method because the labelling gave rise to differential processing. Another study by Abraham et al. (2008) found differences when real and fictional faces were processed. Simply put, real faces activated brain areas associated with episodic memory, while fictional faces activated areas associated with semantic memory. Even though both these studies suggest differential neural processing of material a person believes to be real versus fictional, the studies did not include long-term memory, and thus did not address memory qualities.
Some studies have used the labelling paradigm while investigating memory qualities. Hartung et al. (2017) found no differences in visual details of the memories of stories believed to be real versus fictional, even using a very large sample size (N = 1742). Similarly, Gander and Gander (2022) investigated differences in visual perspective in memories of real and fictional events. No differences between fact and fiction were found for visual perspective or for associated memory qualities clarity, visual details, or black/white-colour. Also using the labelling paradigm, Gander and Lowe (2023) addressed a wider range of memory qualities and used four stories differing in emotional valence and found that most memory qualities were similar between real and fictional stories; however, one finding stood out–an interaction effect between fictionality and story emotional valence so that negative stories labelled as fictional were remembered with higher clarity than negative stories labelled as factual (p = .007, η2p = .14; 90% CI for the effect size [.02, .28]). Memory qualities were measured at three time points: immediately after reading, after a ten-minute delay, and after five weeks, but the interaction did not involve the time factor. To explain the interaction between fictionality and story emotional valence, it was suggested that fictional stories may involve a higher degree of mental simulation of the events, characters, and actions in the story. The idea that fiction involves mental simulation has previously been suggested by several researchers (Altmann et al., 2014; Mar & Oatley, 2008; Nissel & Woolley, 2022; Oatley, 1999, 2016). According to the fiction-simulation hypothesis, the explanation of the clarity difference is that mental simulation, through the use of more imagery processing, leads to increased memory clarity.
This may also extend to other memory qualities, such as visual details or emotional intensity, but the study may not have been sufficiently powered to detect those differences.
We note that the study by Gander and Lowe (2023) had several limitations. First, the small sample size provided sufficient power to detect only large effects. Second, the emotional valence was not counterbalanced between stories because the positive and negative stories differed not only in emotional valence but also in content. Therefore, the particular content of the story rather than the emotional valence could have produced the higher clarity. Third, the explanatory role of mental simulation could not be supported since it was not measured in the study. Given these shortcomings, it would be pertinent to replicate the study's results while trying to mitigate these limitations. No such replication attempt has yet been made. Thus, it is still unclear whether systematic differences in memory qualities between real and fictional events exist, and if so, why.
The aim of the current study is to enhance the understanding of differences of memory qualities between real and fictional events. The current study makes several contributions. First, it attempts to replicate the finding of Gander and Lowe (2023) that negative fictional stories are remembered with higher clarity. It does so by using a larger sample size, yielding a study with greater power. Second, it attempts to offer an explanation for why these differences occur, by measuring mental simulation. Third, it controls story emotional valence by counterbalancing so that pairs of stories share the same content while differing on emotional valence. Fourth, since the long retention interval of five weeks in Gander and Lowe (2023) casts doubt on how well the participants remembered the stories, the current study uses a shorter interval of one day. Finally, the current study extends the external validity by including different, and a larger number of, stories.
We set up four hypotheses:
Hypothesis 1: ‘Negative fictional stories and clarity’. Hypothesis 1 is a replication of Gander and Lowe (2023) and predicts an interaction between story emotional valence and fictionality, so that memories of negative stories labelled as fictional are rated with higher clarity compared to negative stories labelled as factual and positive stories labelled either as factual or fictional. For Hypothesis 1, we do not expect main effects or interactions for the other memory qualities measured in the present study (colour, visual detail, other senses, emotional intensity, or bodily reaction). We expect participants to rate negative stories with more negative emotional valence than positive stories.
Hypothesis 2: ‘The role of mental simulation for Hypothesis 1’. Hypothesis 2 concerns mental simulation as an explanation for the difference expressed by Hypothesis 1–a higher degree of mental simulation may create memories with higher clarity. We anticipate a positive correlation between clarity and simulation that can further be explained by valence and fictionality in a multilevel model. We hypothesise that, in the best fitting model, negative and fictional stories will show stronger clarity-simulation correlation. If so, the higher clarity of negative fictional stories (Hypothesis 1) can be explained by simulation.
Hypothesis 3: ‘Fictional stories and intensity of all memory qualities’. This hypothesis is an alternative to Hypothesis 1, expressing a stronger claim. Reasoning that reading fiction involves mental simulation to a larger degree than reading fact, we predict a main effect of fictionality, so that ratings of memory qualities in general are higher in the fictional condition (apart from rated emotional valence of the memory, which should be associated with the factor story emotional valence).
Hypothesis 4: ‘The role of mental simulation for Hypothesis 3’. Hypothesis 4 concerns mental simulation as an explanation for the differences expressed in Hypothesis 3. We anticipate relationships between the six dependent variables (the different memory qualities) and simulation. We further hypothesise that the magnitude of these relationships can depend on the fictional status of the story. If so, the higher ratings for each variable can be explained by simulation.
Study 1
Method
Preregistration
The study's desired sample size, as well as variables, hypotheses, and planned analyses were preregistered on Open Science Framework (https://doi.org/10.17605/OSF.IO/8SW5D) prior to data collection. Materials, data, and analysis scripts are available on OSF at https://doi.org/10.17605/OSF.IO/9S5YU.
Participants
We estimated the required sample size using MorePower v. 6.0.4 (Campbell & Thompson, 2012) with power = .8 and α = .05. Although Gander and Lowe (2023) found an interaction between fictionality and story emotional valence with an effect size of η2p = .14, we opted to carry out a replication with increased power. We considered effect sizes of η2p = .04 (medium to small), following recommendations by Brysbaert (2019). Considering the effects of interest requiring the largest sample size resulted in N = 191.
Using the online platform Prolific, 292 participants were recruited, allowing for participant drop-out and data exclusion. Participants were compensated through Prolific using an hourly rate of £7.56. A pre-screening was made to include only those who reported being fluent in English. During data collection for Part 1, 70 participants requested to interrupt their participation, and four participants timed out (although Prolific automatically filled their places with new participants in both these cases). Data was missing for six participants due to technical issues. Based on the collected data, participants were excluded from Part 1 according to criteria specified in the preregistration: four spent less than one minute in total, and seven failed the attention check. We also added an exclusion criterion for Part 1 not in the preregistration; we excluded three participants who spent less than ten seconds on reading the story (our own reading tests indicated that spending less time than that could not be considered a serious attempt at completing the study). Two hundred seventy-two participants completed Part 1 successfully. For Part 2, 17 participants did not return to the study, twelve failed the attention check, and 112 failed the manipulation check (42 ‘don't remember’ answers and 70 incorrect answers). Thus, 131 participants remained and completed all parts of the study successfully, aged 19 to 71 (M = 30.49, SD = 9.47) years, 84 females and 46 males (one missing value). Further demographic information concerning reading and engaging with fiction and nonfiction is shown in Appendix, Table A4; no differences were found comparing the fact and fiction conditions, as tested with Mann-Whitney U tests, used due to non-normal data distributions, nfact = 45, nfiction = 86, 1761 < Us < 2071.5, zs < 0.86, ps > .39. 
Countries of residence were reported as South Africa (44%), Portugal (8%), Poland (8%), United Kingdom (5%), Spain (5%), France (5%), Netherlands (4%), Italy (3%), Hungary (2%), Greece (2%), Germany (2%), Czech Republic (2%), Sweden (2%), Finland (2%), Australia (2%), and 8 other countries (4%, <= 1% each).
Design
The experiment used a 2×2 random groups design with the factors fictionality (fact/fiction) and story emotional valence (positive/negative). Both factors were manipulated between subjects. The reason that each participant took part in only one of the four conditions was to avoid participants comparing the conditions and possibly artificially enhancing the contrast between conditions in their ratings (with or without awareness). Sample sizes for the four conditions were: fact-positive (n = 22), fiction-positive (n = 41), fact-negative (n = 23), fiction-negative (n = 45). Dependent variables were simulation (the degree to which participants mentally simulated the events of the story while reading it) and a number of self-rated memory qualities.
Materials
The stimuli consisted of eight stories in English of around 100 words each (see Appendix, Table A1). We used the OpenAI tool ChatGPT 3.5 to generate the stories using various prompts, for example, ‘write a 100 word positive news story involving a protagonist and a conflict and a concrete scene’. The style of the stories was such that they could be considered either news stories or fictional stories. Four stories were generated as positive, then prompted to be rewritten as negative, and the opposite for the other four stories (e.g., using the prompt ‘rewrite as negative’) to counterbalance the content of positive and negative stories. All stories were written in third person and involved a protagonist (four male and four female characters). Stories involved a conflict and a concrete scene to prompt visual imagery. To assess the emotional valence and semantic similarity between positive and negative stories, we carried out automatic sentiment and semantic similarity analyses. For semantic similarity, we calculated the cosine similarity score using spaCy (3.7.2, model: en_core_web_sm). For sentiment, we calculated the compound score using VADER for NLTK (3.7) (see Appendix, Table A1). The participants (N = 131) rated the stories using 7-point scales as easy to understand, M = 5.95 (SD = 1.36) and fairly interesting, M = 4.96 (SD = 1.59). Also, the larger sample who completed the first part of the study (N = 272) gave similar ratings, M = 5.95 (SD = 1.33) and M = 4.94 (SD = 1.58), respectively. Since the ratings of the larger sample are not lower than for the smaller sample, we conclude that these ratings were not an artefact of systematic subject loss when exclusion criteria were applied in the second part of the study.
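To illustrate what a cosine similarity score expresses, the following is a minimal sketch that compares two texts using plain word counts. It is an assumption-laden stand-in: the study used spaCy's vector-based similarity, whereas this sketch uses a bag-of-words representation from the Python standard library, and the two sentences are hypothetical and not taken from the study's stories.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count word tokens (a crude stand-in for word vectors)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between the word-count vectors of two texts."""
    a, b = bag_of_words(text_a), bag_of_words(text_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical positive/negative pair sharing the same content.
positive = "The activist persuaded the council and the forest was saved."
negative = "The activist failed to persuade the council and the forest was lost."
similarity = cosine_similarity(positive, negative)
print(round(similarity, 2))
```

A score near 1 indicates that two versions of a story share most of their content, which is the property the counterbalancing of positive and negative versions relies on.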
The online tool PsyToolkit (Stoet, 2010; 2017) was used for stimuli presentation and data collection.
Measures
We used the same memory qualities (shown in Table 1) as in Gander and Lowe (2023), but removed the measure of whether the events remembered had a simple or complex storyline, as it was judged not to be as relevant to the idea of mental simulation of events. The measures and items were originally based on Johnson et al. (1988) and Berntsen and Rubin (2006). The qualities clarity, colour, visual detail, other senses, emotional intensity, and bodily reactions all relate to the fiction-simulation hypothesis; higher ratings would indicate a more lively, realistic experience in line with simulating the events. The quality emotional valence was included as a check to see that the positive stories were indeed remembered as more positive compared to the negative stories.
Measures of Memory Qualities.
Mental simulation was measured through a novel scale consisting of the mean of five items (see Appendix, Table A2). We constructed the items to reflect perspective taking and reflection about the events in the story, based on accounts of mental simulation in fiction (Oatley, 1999; Mar & Oatley, 2008; Altmann et al., 2014; Oatley, 2016). Scale properties are reported in the Results section.
Procedure
The study consisted of two parts. In Part 1, participants encoded a story in memory. In Part 2, one day later, they recalled the story and rated their experienced memory qualities. See Figure 1 for a flowchart.

Flowchart of experiment procedure for Study 1.
First, participants read the study information on Prolific, where the study was described as a study on the psychology of reading. They were informed that the study is anonymous, that data will be used for research purposes only and will be stored in a public repository, and that they may interrupt the study and revoke their data at any time (using Prolific's ‘return’ function or by anonymously contacting the researcher). Participants gave their informed consent by starting the study. Then, participants were informed that they would read an excerpt from a story and that they would be asked some questions later. The story was presented as either ‘a news story’ or ‘a fictional story’ (see Appendix, Table A3, for instructions). Thereafter, the story was presented on a separate page. After selecting ‘continue’, participants were asked how easy they thought the story was to understand and how interesting it was. This was followed by five items to measure the degree of mental simulation. After that, participants answered demographic questions about attitudes and habits of engaging with fiction and non-fiction (taken from Hartung et al., 2017, with some minor wording changes) (see Appendix, Table A5). An attention check was included here, instructing participants to rate one question with the lowest value (i.e., 1). Part 1 was concluded with a possibility to give optional comments. Participants were informed that they would be contacted the next day for Part 2.
Part 2 was announced to the remaining participants from Part 1 24 h later and was carried out by participants between 24 and 47 h after Part 1. First, participants were instructed to recall the previous story and inspect their memory while answering the questions. Participants rated seven qualities of their memory with the order of items randomised for each participant. An attention check was included among the questions, instructing the participants to select the highest value (i.e., 7) on that question. Then, a manipulation check asked participants if the story they read previously was a news story or a fictional story, or if they did not remember (see Appendix, Table A8). Part 2 concluded with a debriefing and an opportunity to give optional comments.
The median completion time for Part 1 and Part 2 together was 4 min and 48 s.
Statistical Analysis
We used IBM SPSS Statistics (version 29) for ANOVAs, Mann-Whitney U tests, Cronbach's α, Pearson correlation, and factor analyses.
Confidence intervals for η2p were calculated using R (R Core Team, 2024) with the function ci.pvaf from the MBESS package version 4.9.3 (Kelley, 2023). We used 90% CI instead of 95% CI for the effect size for ANOVAs since they are more appropriate for α = .05 (Steiger, 2004).
The level of statistical significance was set to .05 for all tests. We did not make corrections for multiple comparisons because of the specificity of our hypotheses.
Results
Memory Qualities
The mean ratings of memory qualities for the four conditions are shown in Table 2 while Figure 2 shows a comparison between fact and fiction. A two-way ANOVA showed no support for an interaction between fictionality and story emotional valence for clarity, F(1, 127) = 0.16, p = .69, η2p = .001, 90% CIη2p [0, .062]. Thus, Hypothesis 1 was not supported. Two statistically significant effects were found. For colour, there was an interaction between fictionality and story emotional valence, F(1, 127) = 4.21, p = .042, η2p = .032, 90% CIη2p [.00058, .095]. For fiction, negative stories were rated as being more in colour than positive stories, p = .042, Mdiff = .78, 95% CIdiff [0.03, 1.52]. Additionally, for emotional valence, there was a main effect of story emotional valence as we expected, so that negative stories were remembered with more negative emotions than positive stories, F(1, 127) = 22.58, p < .001, η2p = .15, 90% CIη2p [.065, .24], Mdiff = 1.16, CIdiff [.68, 1.64]. There were no other statistically significant effects on any of the memory qualities. Comparisons of memory qualities between fact and fiction gave ps > .10, η2p < .02. Hence, Hypothesis 3 (more intense memory qualities for fiction) did not receive support.
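As a cross-check on the effect sizes above, partial eta squared can be recovered from an F statistic and its degrees of freedom with the standard identity η2p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the colour interaction and the emotional valence main effect; the function name is our own and is not part of the study's analysis scripts.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values as reported in the text: F(1, 127) = 4.21 and F(1, 127) = 22.58.
colour = partial_eta_squared(4.21, 1, 127)
valence = partial_eta_squared(22.58, 1, 127)
print(f"colour: {colour:.3f}, emotional valence: {valence:.2f}")
```

Both values round to the effect sizes reported for these two effects (.032 and .15).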

Mean ratings of memory qualities and simulation for fact and fiction, Study 1.
Mean Ratings of Memory Qualities for Study 1.
Note. The Total column shows the mean of the combined ratings for both positive and negative stories. Ratings were made on 7-point scales. Standard deviations within parentheses.
Simulation
The simulation scale, consisting of five items, was considered for a sample of N = 272 that completed Part 1 of the study. The scale showed reliability in terms of internal consistency, Cronbach's α = .77. In terms of correlations of items, the determinant was .23, and the highest correlation was r = .62, indicating non-collinearity. Further, a factor analysis showed that only the first factor eigenvalue 2.65 was above 1, accounting for 53.06% of the variance, suggesting that the scale is unidimensional. Factor loadings for the items can be seen in the Appendix, Table A2.
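For readers unfamiliar with the internal consistency statistic, the following is a minimal sketch of how Cronbach's α is computed from item-level ratings: α = k/(k − 1) × (1 − Σ item variances / variance of the summed score). The ratings below are hypothetical and serve only to make the example runnable; the study's own values were computed in SPSS on the actual data.

```python
from statistics import variance


def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha for a participants-by-items table of ratings."""
    k = len(rows[0])  # number of items in the scale
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 7-point ratings: six participants, five items.
ratings = [
    [5, 4, 5, 6, 5],
    [2, 3, 2, 3, 2],
    [6, 6, 7, 6, 5],
    [4, 3, 4, 4, 5],
    [3, 4, 3, 2, 3],
    [7, 6, 6, 7, 6],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

The more the items rise and fall together across participants, the larger the variance of the summed score relative to the item variances, and the closer α gets to 1.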
To check whether the excluded part of the sample that did not pass the manipulation check (N = 112) differed in terms of mental simulation from the sample that did (N = 131), a comparison between simulation scores was made. An independent t test revealed no statistical difference between those who failed (M = 4.00, SD = 1.34) and those who passed (M = 4.11, SD = 1.30), p = .55, d = −0.077, 95% CI d [–.33, .18].
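The size of this difference can be approximated from the summary statistics alone. The sketch below computes Cohen's d with a pooled standard deviation, assuming equal-variance pooling (the text does not state which variant SPSS reported). Because the means and standard deviations are rounded to two decimals, the result can differ slightly from a value computed on the raw data.

```python
import math


def cohens_d(mean_1: float, sd_1: float, n_1: int,
             mean_2: float, sd_2: float, n_2: int) -> float:
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_variance = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_variance)


# Failed manipulation check (n = 112) versus passed (n = 131), as reported.
d = cohens_d(4.00, 1.34, 112, 4.11, 1.30, 131)
print(round(d, 2))
```

The result is a standardised difference of roughly −0.08, which falls inside the reported confidence interval and indicates a negligible difference between the two groups.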
The simulation scores (using the final N = 131) are shown in Table 3 and the mean of the simulation scale is shown in Figure 2. A two-way ANOVA with the factors fictionality and story emotional valence did not reveal a statistically significant difference in simulation for fiction compared to fact, F(1, 127) = 1.20, p = .28, η2p = .009, 90% CIη2p [0, .054]. Although ratings of all individual items and the scale were numerically higher for fiction than for fact, no differences were statistically significant, with ps > .086, η2p < .023. The interaction between fictionality and story emotional valence was not statistically significant, F(1, 127) = 0.20, p = .66, η2p = .002, 90% CIη2p [0, .030].
Mean Ratings of Individual Items of Simulation and Simulation Scale for Study 1, N = 131.
Note. The Total column shows the mean of the combined ratings for both positive and negative stories. Ratings were made on 7-point scales. The items are presented in shortened form. Simulation scale is calculated as the mean of items 1 to 5. Standard deviations within parentheses.
Correlations were calculated using Pearson r between participants’ simulation score and rated memory qualities. Simulation correlated moderately with emotional intensity (.43), and weakly with clarity (.34), colour (.34), visual details (.35), other senses (.24), and bodily reaction (.34), ps <= .006. Simulation did not correlate with emotional valence (–.047), p = .59.
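To give a sense of the precision of these correlations, an approximate confidence interval for a Pearson r can be obtained with the Fisher z transformation. This is a sketch only: the interval is not reported in the study, and the function name is our own. It is applied here to the weakest of the significant correlations (other senses, r = .24, N = 131).

```python
import math
from statistics import NormalDist


def pearson_ci(r: float, n: int, level: float = 0.95) -> tuple[float, float]:
    """Approximate confidence interval for Pearson r via the Fisher z transformation."""
    z = math.atanh(r)                      # Fisher z of the observed correlation
    standard_error = 1 / math.sqrt(n - 3)  # standard error on the z scale
    critical = NormalDist().inv_cdf(0.5 + level / 2)
    return (math.tanh(z - critical * standard_error),
            math.tanh(z + critical * standard_error))


low, high = pearson_ci(0.24, 131)
print(f"[{low:.2f}, {high:.2f}]")
```

The interval excludes zero, in line with the reported significance, but it is wide enough that the true correlation could be anywhere from very weak to moderate.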
Because Hypotheses 2 and 4 are dependent on evidence in favour of Hypotheses 1 and 3, which were not found, it is not necessary to test them. However, since the analyses were preregistered, we still carried them out to present the results openly (see Appendix B). The results show no support for Hypotheses 2 and 4.
Discussion
In the current study, participants rated memory qualities when remembering stories labelled as either a news story or a fictional story. The results showed no effects of fictionality or story emotional valence, with the exception of an interaction between the two factors for colour (to what extent the participant experienced the memory in black and white versus colour): For fiction, negative stories were rated as being more in colour than positive stories. The results also showed that participants rated the memory valence of negative stories lower compared to positive stories, as we expected. Additionally, in the study, participants answered questions after reading intended to measure the degree of mental simulation during reading. The simulation scale correlated weakly or moderately with the memory qualities, suggesting that, although there is some relation to memory qualities, it measures a distinct construct, as we intended. However, no differences were found in the degree of mental simulation for fact and fiction.
We had preregistered four hypotheses. Hypothesis 1 did not receive support, and we could therefore not replicate the interaction result from Gander and Lowe (2023) that memories of negative fictional stories are rated with higher clarity than negative factual stories. Neither did we find support for the stronger version of the fiction-simulation hypothesis, formulated in Hypothesis 3, namely that all memory qualities would be greater for fictional compared to factual stories. Since Hypotheses 1 and 3 were not supported, Hypotheses 2 and 4 were not supported either, because they depend on evidence in favour of Hypotheses 1 and 3.
We could not find any explanation for why there was an uneven exclusion between the fact and fiction conditions so that participants in the fact condition were excluded more. An analysis showed that the excluded participants did not differ on measured mental simulation, which gives us some confidence that this does not affect the outcome of the study.
There were some indications that many participants did not remember the stories well. Seventy participants answered incorrectly when asked whether the story they read was a news story or a fictional story. Forty-two participants answered ‘don't remember’ on the same question. Additionally, there was feedback from some participants that they did not remember the story at all. This reduced the remaining sample (N = 131) to below the estimated size calculated before the study (N = 191). A sensitivity power calculation showed that the power was .63 given N = 131 with α = .05 and effect size η2p = .04. But even for participants who passed the control question, the quality of the ratings may have been influenced by poor memory of the story. This may in turn have masked effects of the experimental factors.
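The sensitivity figure can be approximated without specialised software. The sketch below uses a large-sample approximation for an effect with one numerator degree of freedom: η2p is converted to Cohen's f², the noncentrality parameter is taken as f² × N (an assumed convention, which may differ from the one MorePower uses), and the power is read off the normal distribution. An exact calculation based on the noncentral F distribution should give slightly lower values, so this is a rough check rather than a reproduction of the reported figure.

```python
import math
from statistics import NormalDist


def approx_power(eta_p2: float, n_total: int, alpha: float = 0.05) -> float:
    """Large-sample power approximation for a 1-df ANOVA effect."""
    f_squared = eta_p2 / (1 - eta_p2)    # Cohen's f squared
    noncentrality = f_squared * n_total  # assumed convention: lambda = f^2 * N
    root = math.sqrt(noncentrality)
    normal = NormalDist()
    z_critical = normal.inv_cdf(1 - alpha / 2)
    # Probability that a 1-df noncentral chi-square exceeds the critical value.
    return normal.cdf(root - z_critical) + normal.cdf(-root - z_critical)


power = approx_power(0.04, 131)
print(round(power, 2))
```

The approximation gives a value in the mid .60s, close to the reported power of .63 and well short of the conventional .80 target.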
Study 2
In Study 2, our intention was to replicate Study 1 while ruling out the possibility that the results were a consequence of participants not remembering the stories well. To this end, we controlled and strengthened the encoding of the story and the story label.
Method
The design and materials of Study 2 were the same as for Study 1, with additional questions described under Procedure. The participants in Study 2 rated the stories using 7-point scales as easy to understand, M = 6.27 (SD = 1.08) and fairly interesting, M = 4.78 (SD = 1.59).
Preregistration
The study's desired sample size, as well as variables, hypotheses, and planned analyses were preregistered on Open Science Framework (https://doi.org/10.17605/OSF.IO/8N3QA) prior to data being collected. Materials, data, and analysis scripts are available on OSF at https://doi.org/10.17605/OSF.IO/ZGYE8.
Participants
Three hundred participants who had not participated in Study 1 and who reported being fluent in English were recruited through the online service Prolific and compensated in the same way as for Study 1. During the data collection, 90 participants returned their submission and 11 timed out (although Prolific replaced these participants with new ones). Data was excluded from participants according to criteria in the preregistration: no data stored (1), failed attention check (7), failed manipulation check (18) (incorrect answer on ‘news story/fictional story’ [4], confidence under 50% [11]), and failed story content questions at the end (1). Thus, data from 254 participants was used in the study. A sensitivity power calculation revealed a power of .90 given this sample size with α = .05 and effect size η2p = .04. Participants were aged from 18 to 71 (M = 34.98, SD = 12.03), 133 males and 120 females (1 preferred not to say). Further demographic information is shown in the Appendix, Table A5. No differences in demographics were found comparing the fact and fiction conditions, as tested with Mann-Whitney U tests, used due to non-normal data distributions, nfact = 123, nfiction = 131, 8254 < Us < 8819, zs < 1.35, ps > .18. Countries of residence were reported as United Kingdom (28%), South Africa (16%), Portugal (11%), Poland (6%), Mexico (5%), Italy (4%), Greece (4%), Germany (4%), Spain (3%), Canada (3%), Hungary (2%), France (2%), Chile (2%), Ireland (2%), and 13 other countries (6%, <= 1% each).
Sample sizes for the four conditions were: Fact-positive (n = 53), fiction-positive (n = 62), fact-negative (n = 70), fiction-negative (n = 69).
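As a rough cross-check of the Mann-Whitney comparisons reported above, the z statistic can be approximated from U and the two group sizes. The sketch below assumes the large-sample normal approximation without a tie correction; the software and settings actually used are not stated here, so reported values may differ slightly. The function name is illustrative.

```python
from math import sqrt


def mann_whitney_z(u: float, n1: int, n2: int) -> float:
    """Normal-approximation z for a Mann-Whitney U statistic (no tie correction)."""
    mean_u = n1 * n2 / 2                        # expected U under the null hypothesis
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard deviation of U under the null
    return (u - mean_u) / sd_u


# Group sizes from Study 2 and the largest U reported in the text.
z_max = mann_whitney_z(8819, n1=123, n2=131)
print(round(z_max, 2))
```

With the largest reported U, this approximation gives a z of about 1.30, in line with the reported zs < 1.35.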
Procedure
The procedure was similar to that of Study 1, with some modifications (see Figure 3). First, the retention interval was reduced to around five minutes and kept within the same session. This also has the benefit that no participants are lost between study sessions. Next, in order to strengthen the memory of the story, participants were instructed to read it carefully, and were not able to continue in less than 30 s, making them spend longer time reading the story (see Appendix, Table A3). To further strengthen the encoding of the story, participants could not proceed until they answered questions correctly about the story content (see Appendix, Table A7); otherwise they would need to re-read the story (we did not record which participants re-read the story since our primary interest was that participants knew the story well enough, and we believe re-reading would not introduce a systematic effect given our design). Moreover, a filler task was inserted between the presentation of the story and the rating of memory qualities. The purpose of the filler task was to clear working memory, so that the story would need to be retrieved from long-term memory. In the filler task, participants were instructed to write texts consisting of three to five sentences in relation to three photographs, unrelated to the story. Participants had to stay on each writing task for at least 30 s. Since participants also answered questions, the total time between encoding and retrieval of the initial story was around five minutes. At the end of the study, participants answered questions on story content again (the same as after reading the story), as well as on manipulation (news story or fictional story) and their confidence in their answer on the manipulation (see Appendix, Table A8). 
One minor improvement over Study 1 was that the order of the response options was randomised for the manipulation check (whether the story they read was a news story or a fictional story, or if they did not remember) to eliminate bias. The median completion time of Study 2 was 10 min and 11 s.

Figure 3. Flowchart of experiment procedure for Study 2.
Results
Memory Qualities
Mean ratings of memory qualities for the four conditions can be seen in Table 4 while Figure 2 shows a comparison between fact and fiction. For the memory quality clarity, a two-way ANOVA with fictionality and story emotional valence showed an interaction effect, F(1, 250) = 5.37, p = .021, η2p = .021, 90% CIη2p [.0016, .058]. A simple main effect showed that, for factual stories, negative ones were rated with higher clarity than positive ones, p = .022, Mdiff = 0.54, 95% CIdiff [.08, .99]. The expected simple main effect for fictional stories was not statistically significant, p = .35, Mdiff = −0.21, 95% CIdiff [−.65, .23]. For the remaining factors, there were no statistically significant effects on any of the memory qualities, with ps > .059, η2p < .014. The interaction between fictionality and story emotional valence for colour from Study 1 was not replicated in Study 2. Further, no effect was found of story emotional valence on rated memory emotional valence, F(1, 250) = .20, p = .66, η2p = .001, 90% CIη2p [0, .023], Mdiff = 0.089, 95% CIdiff [−.3, .48]. Thus, memories of negative stories were not rated as more negative. Hence, neither Hypothesis 1 nor Hypothesis 3 was supported.
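The partial eta squared values in the paragraph above can be recovered from the F statistics and their degrees of freedom. The following is a minimal sketch, assuming the standard identity η2p = (F × df_effect) / (F × df_effect + df_error); the function name is illustrative and not taken from the analysis scripts.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Interaction effect on clarity and the valence effect on memory emotional valence.
interaction = partial_eta_squared(5.37, 1, 250)
valence = partial_eta_squared(0.20, 1, 250)
print(round(interaction, 3), round(valence, 3))
```

Rounded to three decimals, these reproduce the reported values of .021 and .001.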
Table 4. Mean Ratings of Memory Qualities for Study 2.
Note. The Total column shows the mean of the combined ratings for both positive and negative stories. Ratings were made on 7-point scales. Standard deviations within parentheses.
As with Study 1, testing Hypotheses 2 and 4 after failing to reject the null hypothesis for Hypotheses 1 and 3 is redundant, but since the analyses were preregistered, we report the results for Study 2 in Appendix B.
Simulation
The simulation scale (N = 254) showed reliability in terms of internal consistency, Cronbach's α = .77. In terms of correlations of items, the determinant was .25, and the highest correlation was r = .65, indicating non-collinearity. Further, a factor analysis showed that only the first factor eigenvalue 2.63 was above 1, accounting for 52.65% of the variance, suggesting that the scale is unidimensional. Factor loadings for the items can be seen in Appendix, Table A2.
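The two scale statistics above can be illustrated with a short sketch. It assumes the usual formula for Cronbach's alpha and assumes that the eigenvalue comes from the correlation matrix of the five items, so that total variance equals the number of items. The toy ratings are hypothetical and are not study data; function names are illustrative.

```python
from statistics import variance


def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha for a participants-by-items matrix of ratings."""
    k = len(rows[0])                                    # number of items
    item_vars = [variance(col) for col in zip(*rows)]   # sample variance per item
    total_var = variance([sum(r) for r in rows])        # variance of summed scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def variance_explained(eigenvalue: float, n_items: int) -> float:
    """Share of total variance for one factor of an item correlation matrix."""
    return eigenvalue / n_items


# Hypothetical 7-point ratings for four participants on five items (not study data).
toy = [[5, 4, 5, 4, 3], [2, 2, 3, 2, 1], [6, 7, 6, 5, 6], [4, 3, 4, 4, 2]]
print(round(cronbach_alpha(toy), 2))
print(round(100 * variance_explained(2.63, 5), 1))
```

For the reported first eigenvalue of 2.63 across five items, this gives about 52.6%, close to the reported 52.65%; the small gap is presumably due to rounding of the eigenvalue.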
The item ratings and the simulation scale are shown in Table 5, and the simulation scale mean is shown in Figure 4. A two-way ANOVA with fictionality and story emotional valence revealed no difference in simulation score for factual and fictional stories or for positive and negative ones, Fs < 1.65, ps > .2, η2p < .007. The interaction between fictionality and story emotional valence was not statistically significant, F(1, 250) = 0.40, p = .54, η2p = .001, 90% CIη2p [0, .019].

Figure 4. Mean Ratings of Memory Qualities and Simulation for Fact and Fiction, Study 2.
Table 5. Mean Ratings of Individual Items of Simulation and Simulation Scale for Study 2.
Note. The Total column shows the mean of the combined ratings for both positive and negative stories. Ratings were made on 7-point scales. The items are presented in shortened form. Simulation scale is calculated as the mean of items 1 to 5. Standard deviations within parentheses.
The pattern of correlations between the simulation score and memory qualities was similar to that of Study 1: emotional intensity (r = .55), clarity (r = .27), colour (r = .22), visual details (r = .35), other senses (r = .38), and bodily reaction (r = .39) were all correlated with simulation (ps < .001), whereas emotional valence was not (r = .059, p = .35).
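The p value for the emotional valence correlation can be approximated from r and the sample size. This is a sketch under two assumptions: that the coefficients are Pearson correlations, and that a normal approximation to the t distribution is acceptable at df = 252. Function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def correlation_t(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def approx_two_sided_p(t: float) -> float:
    """Two-sided p value using a normal approximation to the t distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Emotional valence correlation with the simulation score in Study 2.
t_valence = correlation_t(0.059, 254)
print(round(t_valence, 2), round(approx_two_sided_p(t_valence), 2))
```

Under these assumptions the approximation yields a p value of about .35, matching the value reported for emotional valence.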
Discussion
In Study 2, as in the previous study, participants rated memory qualities when remembering stories labelled as either factual or fictional. We were confident that participants had encoded the story and could remember it, as well as the label they were presented with (news story or fictional story). The results showed no differences between fact and fiction for the memory qualities. However, an interaction was found so that for fact, negative stories were remembered with higher clarity. The finding from Study 1, that for fiction, negative stories were experienced more in colour, was not replicated. Since neither of these two interactions was found in both studies, we disregarded them as unreliable. We were surprised that participants did not rate their memories of the negative stories as more negative compared to the positive stories. One speculation is that in Study 1, in which there was such a rated difference, there is some evidence that participants read the stories hastily, which may have resulted in them remembering only occasional keywords, perhaps the negative ones, which set the emotional tone when remembering. In contrast, the participants in Study 2 read the stories more carefully, which may have resulted in a more nuanced memory that was not as clearly negative. More importantly, as in Study 1, we could not replicate the result of Gander and Lowe (2023) that fictional negative stories resulted in higher clarity, and thus Hypothesis 1 was not supported. Further, the stronger version of the fiction-simulation hypothesis, Hypothesis 3, did not receive support either.
General Discussion
We set out to replicate the results found in Gander and Lowe (2023) that the fictional status of stories influences memory clarity. More specifically, we tried to replicate the finding that memories of negative fictional stories have higher clarity than memories of negative factual stories. Several shortcomings of the original study were addressed. One improvement concerned the stimuli. We used a different and larger set of stories, which were counterbalanced concerning story emotional valence. Additionally, we measured mental simulation during reading to test the fiction-simulation hypothesis as an explanation for the finding in Gander and Lowe (2023). Increased mental simulation when engaging with fiction compared to fact would involve more extensive imagery processing, which in turn would lead to increased clarity of the memory. We measured mental simulation using a novel scale, consisting of five items, which showed acceptable scale properties. In two studies, with the second study also controlling the encoding of the material and manipulation instruction, we could not replicate the result of Gander and Lowe (2023). One explanation for why the finding did not replicate may lie in the stories used in the original study. Since story emotional valence was not counterbalanced there, the positive and negative stories also differed in content (i.e., positive and negative stories were about different things). However, since the effect also included the factor of fictionality, the specific story content cannot fully explain the effect. Another possibility, which we consider more likely, is that the finding was due to a Type I error. In either case, based on the lack of replicated results in our two studies, we conclude that there is little or no difference in memory qualities as a consequence of labelling a story as fact or fiction. This result is also in alignment with the studies of Hartung et al. (2017) and Gander and Gander (2022), which did not find any differences in memory qualities for factual and fictional stories.
We did not find support for the fiction-simulation hypothesis in either of the two studies. The fiction-simulation hypothesis is based on ideas on mental simulation and fiction reading by Oatley (1999), Mar and Oatley (2008), Altmann et al. (2014), Oatley (2016), and Nissel and Woolley (2022). The idea is that reading fiction, as opposed to fact, involves mental simulation of the events, characters, and actions to a higher degree. Our simulation measure showed acceptable scale properties, and it correlated to various degrees with the experienced memory qualities when remembering the stories. It was just not specifically tied to reading fiction but was similarly present when reading fact. A possible reason for the lack of difference is that the participants did not read the stories in the same way they read longer, more skilfully crafted pieces of fiction (more on this issue below). Finally, there is the possibility that the fiction-simulation hypothesis is false. Indeed, to our knowledge, there is no empirical support for the claim that reading about events that one believes are fictional involves mental simulation to a higher degree than reading about events one believes are factual.
The results have implications for accounts of human memory. If there are no systematic differences in memory qualities between factual and fictional events, memory qualities cannot play an operative role in the memory system to distinguish these two types of events, in a similar way as in reality monitoring (Johnson & Raye, 1981). Instead, the results are consistent with accounts that posit that memories of factual and fictional information are distinguished using other memory mechanisms, such as external source monitoring (Johnson et al., 1993) or the multiple-processes framework suggested by Gander et al. (2023a).
One way our studies were limited is that they included a particular set of stories. First, it is possible to raise the question of how representative the stories were. We attempted to solve this issue by involving eight different stories with various types of content, involving both male and female protagonists. Nonetheless, any study would be limited to a particular selection of stimuli. Further, there is a possibility that the stories were biassed in the direction of fact or fiction, so that participants perceived them as fact or fiction, regardless of how they were labelled. However, unless most stories were perceived as only fact or only fiction, this issue would not systematically impact the outcome. We did not include a check of whether participants really believed that the story was factual or fictional. The reason is that such an explicit question would arguably initiate other types of processing compared to the processing participants do when they read normally. For participants to answer such a question they would need to reason and make a judgement on the fictional status based on cues in the story as well as demand characteristics of the study. Thus, answering this question may not accurately reflect how the participants approached the stories initially. However, future studies could explore implicit measures of participants’ beliefs to clarify this issue. Another, potentially more serious limitation concerning the stories is that they may have been too short to activate the processing associated with reading fiction. It is possible that mental simulation is present to a higher degree when reading longer pieces of fiction, which may then lead to increased clarity and possibly higher intensity for other memory qualities. However, there are some reasons why this may not be an issue. The stories used in Gander and Lowe (2023) were even shorter, so this argument does not directly relate to the outcome of the replication attempt. It is also the case that Hartung et al. (2017) used longer stories, corresponding to around three book pages (884 words), but still did not find any differences in memory qualities. Taken together, there is currently little reason to believe that the length of the story would impact mental simulation, but future studies could explore this possibility. Another limitation, concerning the data analysis, is that while testing Hypotheses 2 and 4 may have been superfluous given the results for Hypotheses 1 and 3, the sample size may have hampered the more complex linear models’ ability to fit the data.
In conclusion, since we were unable to replicate the result of Gander and Lowe (2023) and based on the findings of the two studies presented here, we offer additional evidence supporting the view that there are no differences in experienced memory qualities as a function of the fictional status of the remembered events. Likewise, mental simulation, at least how we operationalised it here, is not particular to reading fiction, but is equally operative for reading about allegedly real events. These results contribute to a clearer understanding of memory qualities of fictional events, and their possible role in processes that distinguish fact and fiction in memory.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Magn. Bergvalls Stiftelse (grant number 2020-03917).
Appendix
Table A1. Stimuli Stories.
| Story emotional valence: Positive | Story emotional valence: Negative | Semantic similarity | Sentiment: Positive | Sentiment: Negative |
|---|---|---|---|---|
| In a remarkable turn of events, celebrated environmental activist, Alex Turner, stood resilient in the face of a challenging but peaceful protest against a contentious mining project. Turner and a united group of demonstrators came together to voice their concerns about the project's environmental impact, sparking a constructive dialogue with local authorities. Amid the gathering, Turner's eloquent calls for environmental preservation resonated with both the public and officials, resulting in a productive discussion about the project's potential consequences and viable alternatives. The scene underscored the power of advocacy and cooperation, bringing hope for a sustainable future for the region's precious ecosystem while respecting the rights and voices of its protectors. | In a shocking twist, prominent environmental activist, Alex Turner, faced a perilous clash with law enforcement during a protest against a controversial mining project. As Turner and fellow demonstrators gathered to oppose the destructive venture, a heated confrontation ensued. Riot police were deployed to disperse the crowd, leading to violent clashes with tear gas and arrests. Amid the chaos, Turner's passionate pleas for environmental preservation were drowned out by the chaos and forceful resistance. The conflict highlighted the escalating tensions between advocates and authorities, casting a shadow over the future of the region's fragile ecosystem and the rights of those who fight to protect it. | 0.95 | 0.97 | −0.99 |
| Amid a challenging legal process, Emily Parker, a resilient single mother of three, faced an inspiring turnaround in her housing situation. The conflict, initially driven by unpaid rent disputes with her landlord, took a positive turn when the court recognized her commitment to resolving the matter. In a courtroom scene filled with hope, Emily, accompanied by her children, received a fair and compassionate judgment that allowed her to retain her apartment. The weight of uncertainty lifted, and the future looked brighter for Emily and her children, underscoring the importance of equitable legal outcomes and support for individuals facing housing challenges. | Amid a contentious legal battle, Emily Parker, a single mother of three, faced eviction from her modest apartment. The conflict erupted when her landlord, citing unpaid rent, sought a court-ordered removal. In a stark courtroom scene, Emily, her children by her side, fought back tears as the judge ruled in favor of the eviction. The weight of uncertainty bore down on her, and her children's future hung in the balance as they faced the prospect of homelessness. The scene, defined by a dimly lit courtroom and the heavy gavel's thud, underscored the pressing issue of housing instability amid economic hardships. | 0.94 | 0.97 | −0.92 |
| In a heartwarming turn of events, John Mitchell, a devoted elementary school teacher, found himself at the forefront of a community-driven effort to preserve essential extracurricular programs. Mitchell, renowned for his unwavering dedication to his students, faced unexpected challenges when the local school board announced budget cuts. However, his passionate advocacy, coupled with the support of concerned parents, ignited a wave of support. The community rallied together, working collaboratively to find innovative solutions to secure the programs. As unity prevailed, a sense of hope and optimism permeated the community, bolstering the prospects of a brighter future for the cherished extracurricular activities. | In a somber turn of events, John Mitchell, a dedicated elementary school teacher, found himself embroiled in a bitter dispute with the local school board. Mitchell, known for his unwavering commitment to his students, faced unexpected opposition as the board announced drastic budget cuts, threatening the existence of vital extracurricular programs. The conflict escalated as Mitchell, along with concerned parents, fought against the decision, arguing that the cuts would deprive children of essential opportunities for growth and development. As tensions continue to rise, the community remains divided, with no resolution in sight, leaving the future of the affected programs hanging in the balance. | 0.97 | 0.99 | −0.95 |
| Jane Anderson, a resilient single mother of three, is experiencing an overwhelming wave of support from her community. Her family home was tragically lost in a late-night fire, along with cherished belongings and mementos. However, the local community has come together to assist her and her children in rebuilding their lives. Neighbors and friends have launched a relief fund, showcasing the strength of solidarity in times of adversity. This event, while challenging, demonstrates the immense capacity for kindness and support that exists within the community, leaving Jane with hope for a brighter future. | Jane Anderson, a resilient single mother of three, faces a devastating loss as her family home was consumed by a fierce fire. The blaze erupted in the dead of night, destroying all her possessions, including irreplaceable family mementos. Jane, who worked tirelessly to provide for her children, now confronts the daunting task of rebuilding their lives. She stands alone, without any help from neighbors or friends. This tragic event serves as a stark reminder of the fragility of life, leaving a determined mother to grapple with a future filled with uncertainty. | 0.92 | 0.96 | −0.97 |

Table A2. Simulation Scale Items With Factor Loadings for Study 1 and Study 2.
| When reading the text … 1 (Not at all/Never) … 7 (Very much/Daily) | Study 1 Factor Loadings (N = 272) | Study 2 Factor Loadings (N = 254) |
|---|---|---|
| I imaginatively placed myself in the person's situation | .78 | .78 |
| I thought about the emotions that the person experienced | .77 | .78 |
| I thought about other events that happened before or after | .73 | .76 |
| I thought about other possible events that could have happened, but did not | .75 | .67 |
| I related the events to my life | .60 | .62 |
Note. Factor loadings for Study 1 are given for the sample who completed Part 1.

Table A3. Instructions for the Manipulation in the Two Fictionality Conditions.
| Study | Fictionality: Fact | Fictionality: Fiction |
|---|---|---|
| Study 1 | You will now read a short excerpt from a news story. After reading the story, you will be asked some questions. | You will now read a short excerpt from a fictional story. After reading the story, you will be asked some questions. |
| Study 2 | You will now read a short excerpt from a news story. Please read carefully. You can not continue until after 30 s. After reading the news story, you will be asked some questions. | You will now read a short excerpt from a fictional story. Please read carefully. You can not continue until after 30 s. After reading the fictional story, you will be asked some questions. |

Table A4. Participants’ Demographic Information for Fiction and Nonfiction for Study 1.
| Item | Fact Condition | Fiction Condition | Total |
|---|---|---|---|
| Like reading fiction | 5.78 (1.58) | 5.90 (1.43) | 5.85 (1.48) |
| Like to engage with other fiction | 6.11 (1.13) | 6.01 (1.29) | 6.05 (1.23) |
| Frequency of engaging with fiction | 5.51 (1.42) | 5.41 (1.31) | 5.44 (1.35) |
| Like reading nonfiction | 5.36 (1.38) | 5.31 (1.47) | 5.33 (1.44) |
| Like to engage with other nonfiction | 5.20 (1.42) | 4.94 (1.54) | 5.03 (1.50) |
| Frequency of engaging with nonfiction | 4.93 (1.32) | 5.13 (1.46) | 5.06 (1.41) |
Note. Standard deviations within parentheses. Ratings were made on 7-point scales. The items are presented in shortened form.

Table A5. Participants’ Demographic Information for Fiction and Nonfiction for Study 2.
| Item | Fact Condition | Fiction Condition | Total |
|---|---|---|---|
| Like reading fiction | 5.37 (1.61) | 5.68 (1.39) | 5.53 (1.52) |
| Like to engage with other fiction | 5.49 (1.69) | 5.81 (1.33) | 5.65 (1.52) |
| Frequency of engaging with fiction | 5.09 (1.66) | 5.35 (1.51) | 5.22 (1.58) |
| Like reading nonfiction | 5.15 (1.58) | 5.29 (1.40) | 5.22 (1.49) |
| Like to engage with other nonfiction | 4.92 (1.56) | 4.98 (1.51) | 4.95 (1.53) |
| Frequency of engaging with nonfiction | 4.88 (1.33) | 5.03 (1.47) | 4.96 (1.40) |
Note. Standard deviations within parentheses. Ratings were made on 7-point scales. The items are presented in shortened form.

Table A6. Items for Attitudes and Habits of Fiction and Non-Fiction.
| Responses: 1 (Not at all/Never) … 7 (Very much/Daily) |
|---|
| Do you like reading fiction? |
| Do you engage with other types of fiction besides reading (e.g., movies or series, comic books, etc.)? |
| How often do you engage with fiction? |
| Do you like reading non-fiction (stories based on true events)? |
| Do you engage with other types of non-fiction media [e.g., journal articles, science reports, (auto-) biographies, etc.]? |
| How often do you engage with non-fiction? |

Table A7. Questions on Story Content for Study 2.
| Item | Response options (randomised order) |
|---|---|
| What was the gender of the central person in the story? | Male / Female / Don't remember |
| What was the role of the central person in the story? | Environmental activist / Victim facing eviction / School teacher / Victim of a fire / Don't remember |
| What happened in the story? | A protest / A housing conflict / Budget cuts / A fire / Don't remember |
| Who was the antagonist (opposing party or force) in the story? | Local authorities/law enforcement / A landlord / The school board / A fire accident / Don't remember |

Table A8. Manipulation Check.
| Study | Item | Response options |
|---|---|---|
| Study 1 | The events I read about were from … | a news story / a fictional story / don't remember |
| Study 2 | The events I read about were from … | a news story / a fictional story / don't remember (randomised order) |
| Study 2 | How confident are you of your answer to the previous question (what type of story you read)? | 7-point scale: Not at all; Very much |
