Abstract
Age-related decline in theory of mind (ToM) may be due to waning executive control, which is necessary for resolving conflict when reasoning about other individuals’ mental states. We assessed how older (n = 50) and younger (n = 50) adults were affected by three theoretically relevant sources of conflict within ToM: competing self-other perspectives, competing cued locations, and outcome knowledge. We examined which best accounted for age-related difficulty with ToM. Our data show unexpected similarity between age groups when people are representing a belief incongruent with their own. Individual differences in attention and response speed best explained the degree of conflict experienced through incompatible self-other perspectives. However, older adults were disproportionately affected by managing conflict between cued locations. Age and spatial working memory were most relevant for predicting the magnitude of conflict elicited by conflicting cued locations. We suggest that previous studies may have underestimated older adults’ ToM proficiency by including unnecessary conflict in ToM tasks.
Previous studies of normal aging and theory of mind (ToM), the ability to infer another person’s thoughts, beliefs, and desires, have produced contradictory findings regarding whether ToM competence declines in older adulthood (Henry et al., 2013; Love et al., 2015; Phillips et al., 2011). Methodological limitations, however, often in the form of tasks that make excess demands on executive functions, may explain this disparity (Love et al., 2015): Executive functions and processing speed typically deteriorate with healthy aging (Salthouse, 1996, 2010, 2012), but cognitive conflict may also be embedded in representing other people’s cognitive perspectives (Austin et al., 2014; Leslie et al., 2004), making it difficult to understand age-related differences in ToM proficiency.
Executive functions are important for an operational ToM (Austin et al., 2014; Vetter et al., 2013). ToM often involves reasoning about other individuals’ beliefs that may differ from our own. Sometimes there is more than one other person, and our own certainty about the right answer may vary. These factors are often confounded in the classic ToM literature. Working memory, attention, and inhibition have been suggested to support mental-state representation through managing conflict (Austin et al., 2014; Leslie et al., 2004). Such conflict could arise from competing cued information, which requires attentional resources to be disengaged from one information source to select another—a typical feature of false-belief paradigms that are used to assess ToM. Likewise, knowledge of an event’s outcome may also interfere with an individual’s ability to reason because of bias toward one’s own, salient self-knowledge—termed a “curse of knowledge” (Birch & Bloom, 2004, 2007; for a review, see Ghrear et al., 2016). To make predictions based on an agent’s false belief, one must inhibit one’s own perspective to adopt the other person’s perspective. Indeed, in healthy adults, false-belief reasoning (compared with true-belief reasoning) is associated with slower, more error-prone behavioral performance (Apperly et al., 2008, 2011). Manipulation of core parameters within the false-belief task demonstrates additional processing associated with “self-perspective inhibition,” suggesting that incongruent self-other cognitive perspectives may create conflict that is distinct from other sources of conflict within ToM tasks (Hartwright et al., 2015; Samson et al., 2005, 2015). However, it is not clear whether the basis of competition in false-belief reasoning is attributable to having a privileged knowledge of reality (KoR) or to the mismatch in the cognitive perspectives of self and other. 
Moreover, it is unclear whether the source of competition may explain conflicting findings in healthy aging.
In this study, we examined which factors are responsible for variation in the difficulty of reasoning about the beliefs of other people across age groups. We assessed three theoretically relevant potential sources of conflict in ToM: privileged outcome knowledge, congruence of self-other perspectives, and competing cued locations. Research assessing the curse of knowledge suggests that one’s own privileged knowledge can cause interference when judging what other people know (Birch & Bloom, 2004, 2007). Further, when that knowledge differs between self and other, the conflicting self-perspective must be inhibited (Hartwright et al., 2015; Samson et al., 2005, 2015). On this basis, we tested how a participant’s KoR and incongruence between the participant’s “self-perspective” and an agent’s “other perspective” contribute to processing difficulty. These two aspects are confounded in the classic object-transfer false-belief task; consequently, we aimed to disentangle these as candidate sources of conflict. Furthermore, giving the correct answer in false-belief paradigms often requires participants to shift attention between competing locations, typically cued by a participant’s representations of where the object is located and where the other person thinks it is located (see Friedman & Leslie, 2005). This is a feature of many false-belief paradigms, although, unlike differences between self- and other perspectives, it is not an essential feature of false-belief problems. In this study, we deconfounded this factor from effects of KoR in conditions in which alternative locations corresponded to the beliefs of two different agents. 
Building on research suggesting that incongruent self-other perspectives create conflict distinct from other sources of conflict within ToM (Hartwright et al., 2015; Samson et al., 2005, 2015), we hypothesized that there would be greater cognitive effort associated with holding in mind competing self- and other perspectives than with managing alternate cued locations.
Statement of Relevance
We use our ability to interpret what other people think on a daily basis. Termed theory of mind (ToM), this is an essential facilitator of social interaction. Previous studies suggest that ToM proficiency declines in later life, which is associated with poorer psychological well-being. However, these earlier studies often confounded changes in ToM with broader age-related changes in executive functioning, such as attention, memory, and inhibitory control. There is a consensus that these executive functions decline with age, but this is less clear for ToM because of how ToM and executive functions interact. In this research, we attempted to disentangle the effects of declining executive functions from any age-related changes in ToM. We show how prior studies may have overestimated the decline of ToM in healthy aging because of experimental demands that are not essential for a functioning ToM. This information could inform support for older adults’ psychological functioning.
Furthermore, we aimed to understand how these three sources of conflict are relevant to aging in ToM. When compared with younger adults, older adults demonstrate greater hindsight bias when informed with outcome knowledge (Bernstein, Erdfelder, et al., 2011), more difficulty with managing incongruence between beliefs, and larger biases toward cued locations in false-belief reasoning (Bernstein, Thornton, & Sommerville, 2011). Older adults might therefore experience difficulty with self-perspective inhibition, attending and managing conflict from multiple cued locations, and handling incongruence between beliefs—all aspects pertinent to false-belief representation but not all essential to ToM. Mental-state representation has consistently been shown to recruit different brain systems to nonmental representation (Saxe & Kanwisher, 2003; Saxe & Powell, 2006), but those neural systems for ToM interact with systems for executive functions (Hartwright et al., 2012, 2013, 2016; Mars et al., 2012). Dwindling underlying baseline connectivity of brain regions typically associated with ToM—particularly the temporoparietal junction—only partially predicts poorer performance in older adults (Hughes et al., 2019). Given that prior research shows differentiation between younger and older adults in ToM as a function of executive-function demands (Bailey & Henry, 2008; Bottiroli et al., 2016; German & Hehman, 2006), it is important to better understand how conflict affects ToM processing more broadly and in aging. We therefore assessed whether there is a psychologically relevant age decline in managing competing self-other perspectives or whether older adults are disproportionately affected by methodological confounds, such as the curse of knowledge and the need to manage competing cued locations. 
By manipulating psychologically relevant parameters within a single ToM task, we evaluated which cognitive components associated with belief reasoning explain age-specific deficits in performance. We also used standardized measures of executive functions to predict the magnitude of conflict elicited by our ToM manipulations to further explore the neuropsychological bases of our results.
Method
Participants
One hundred two adults with no self-reported neuropsychiatric history and normal or corrected-to-normal vision participated in the study. Two participants’ data were excluded: one younger adult because of a methodological issue and one older adult for scoring beyond the cutoff point in a dementia-screening measure. Thus, the final sample comprised 100 participants: 50 younger adults (21 male; age: range = 18–29 years, M = 20.2 years) and 50 older adults (18 male; age: range = 60–79 years, M = 67.9 years). Younger adults were recruited via the university’s research-participation scheme, university noticeboards, and email advertisements to staff; older adults were recruited from the university’s research panel, local interest and hobby groups, university noticeboards, and email advertisements to staff. Younger adults were compensated with either course credit or a small honorarium, and all older adults received a small honorarium for their participation. The majority of the older adults (57%) were educated to the undergraduate degree level or higher (see Section S2 in the Supplemental Material available online). The study was approved by Aston University’s Life & Health Sciences Ethics Committee. All participants gave written informed consent prior to participation.
Statistical power
The PANGEA tool (Westfall, 2015) was used to conduct a post hoc sensitivity analysis. Given our sample size of 50 per group, with 18 repetitions per observation in our primary task, we had approximately 90% power to detect three-way interactions with a small effect size (Cohen’s d = 0.2).
Design
The study design and analysis plan were preregistered on OSF at https://osf.io/dc8ce. All data and project materials can be accessed at https://osf.io/m6rgw.
ToM abilities: false-belief task
The current ToM paradigm was based on the work by Apperly et al. (2011) and Hartwright et al. (2012); it can be downloaded in E-Prime format from https://osf.io/m6rgw (E-Prime Version 2.0; Schneider et al., 2012). The current task was noninferential: Participants were explicitly made aware of other people’s belief states. For a discussion on the equivalence of inferential and noninferential ToM tasks, see Section S3 in the Supplemental Material.
The current ToM task consisted of a three-factor (2 × 2 × 2) design wherein each factor manipulated whether a theoretically based potential source of conflict was high or low (see Table 1). The first factor, termed knowledge of reality (KoR), varied the presence of the participant’s explicit knowledge about reality and was based on prior work suggesting that one’s own self-knowledge can cause interference when representing that of another. The KoR manipulation resulted in a reality-unknown and a reality-known condition. The second factor, termed other-other congruence (OOC), manipulated the congruence of two agents’ perspectives, resulting in a minimal-conflict (congruent) and maximal-conflict (incongruent) condition. The third factor, termed self-other congruence (SOC), concerned the congruence of the participant’s and the target agent’s perspectives, where the presence of conflict between those perspectives was manipulated. As with the OOC condition, this resulted in a minimal-conflict (congruent) and maximal-conflict (incongruent) condition. These latter two factors were based on work suggesting that ToM reasoning is supported by executive selection to resolve competition between salient cues.
Table 1. Summary of Experimental Factors and Levels
Note: The level of conflict (low vs. high) was theoretically driven. In the OOC and SOC conditions, the agent’s beliefs could be true or false because, unlike the participant, the agent had no KoR.
The three-factor design was formulated into a computer-based task in which participants were required to respond from a target agent’s perspective (ToM trial) or on the basis of what they, themselves, explicitly knew (an antistrategy trial, herein termed a filler). Each experimental trial consisted of a game in which a magician hid a ball in one of three cups and subsequently shuffled the cups away from view. Participants were required to indicate either (a) where a target agent believed the ball was hidden (ToM trial) or (b) where they themselves thought the ball was (filler trial). Each trial comprised a sequence in which each of the two agents indicated where they thought the ball was hidden, plus a clue—which the agents were not privy to—regarding what was inside one of the three cups. For each trial, a response probe was presented that indicated which of the two agents was the target agent and the nature of the response the participant should give (ToM or filler). The fillers were developed to confirm that participants were attending to the clue regarding where the ball really was by responding with the true location of the ball, because this clue permitted differentiation in beliefs between the participant and the agents. Note that the fillers can be solved without ToM reasoning and, therefore, were used only to identify and exclude participants who were not attending to the task appropriately.
Each trial comprised five static images, each of which was followed by a central fixation mark (see Fig. 1a). The first image always depicted a magician shuffling three cups. Three further images were then presented. The order of presentation of these three images was counterbalanced using a Latin square and randomized. One image showed the magician’s hands obscuring the contents of two of the three cups. In the unobscured cup, either a green ball or an X was shown to indicate the presence (green ball) or absence (green X) of the ball, respectively. The participant only ever knew the contents of one cup per trial; consequently, the participant either knew explicitly the true location of the ball (ball shown) or had to infer that it was under one of the two obscured cups (X shown). Each of the other two images depicted one of the agents in front of one of the three cups, indicating which cup that agent believed the ball was located in (both agents’ beliefs were indicated in every trial). Following presentation of the three images, a response probe was shown. This depicted either an image of one of the two agents, requiring the participant to respond with where that agent thought the ball was (ToM trials), or an image of a hand with a finger pointed toward the participant, requiring the participant, on the basis of the earlier clue, to indicate where they themselves thought the ball was (filler trials). Participants used 1, 2, and 3 on the number pad of the computer keyboard to indicate their selected cup (left to right; Cup 1 was coded as 1 on the number pad). The eight experimental conditions were created by manipulating whether the participant knew where the ball was (KoR: reality known vs. reality unknown), whether the two agents’ beliefs about the location of the ball matched or not (OOC: congruent vs. incongruent), and whether the participant’s and target agent’s beliefs about the location of the ball were congruent (true belief) or incongruent (false belief), as outlined in Figure 1b. Crossing these three factors yielded the following eight conditions: (a) KoR reality unknown, OOC congruent, SOC congruent; (b) KoR reality known, OOC congruent, SOC congruent; (c) KoR reality unknown, OOC incongruent, SOC congruent; (d) KoR reality unknown, OOC congruent, SOC incongruent; (e) KoR reality known, OOC incongruent, SOC congruent; (f) KoR reality known, OOC congruent, SOC incongruent; (g) KoR reality unknown, OOC incongruent, SOC incongruent; (h) KoR reality known, OOC incongruent, SOC incongruent. Each condition described the state of affairs in relation to the target agent, as indicated by the response probe.
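The crossing of the three two-level factors can be illustrated with a short Python sketch. This is only an illustration of the 2 × 2 × 2 design described above: the dictionary layout and the function name `build_conditions` are hypothetical and are not taken from the authors' E-Prime materials.

```python
from itertools import product

# Factor names and level labels follow the text; the data structure itself is
# an illustrative assumption, not the authors' implementation.
FACTORS = {
    "KoR": ("reality unknown", "reality known"),
    "OOC": ("congruent", "incongruent"),
    "SOC": ("congruent", "incongruent"),
}


def build_conditions():
    """Cross the three two-level factors into the eight design cells."""
    names = list(FACTORS)
    return [dict(zip(names, levels)) for levels in product(*FACTORS.values())]


conditions = build_conditions()
for cell in conditions:
    # One line per design cell, e.g. "KoR reality known, OOC congruent, SOC incongruent"
    print(", ".join(f"{name} {level}" for name, level in cell.items()))
```

The order in which the cells are printed follows `itertools.product` and does not correspond to the (a) to (h) lettering used in the text.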

Fig. 1. Sample trial sequence (a) and experimental design (b). Each trial comprised five static images, each of which was followed by a 500-ms central fixation mark (indicated here by red stars). The first image always depicted a magician shuffling three cups. The presentation order of the following three images (marked with asterisks) was counterbalanced and randomized. One image showed the magician’s hands obscuring the contents of two of the three cups. In the unobscured cup, either a green ball or an X was shown to indicate the presence (green ball) or absence (green X) of the ball, respectively. Two further images depicted one of the agents in front of one of the three cups, indicating which cup that agent believed the ball was located in. Following the three images, a response probe depicted either an image of one of the two agents, requiring the participant to respond with where that agent thought the ball was (theory-of-mind trials), or an image of a hand with a finger pointed toward the participant, requiring the participant to indicate where they themselves thought the ball was (filler trials). The schematic shows the three experimental factors and how these were achieved. In the knowledge of reality (KoR) condition, if reality was unknown, an X was shown to indicate the absence of the ball from that location; if reality was known, the ball was shown to highlight its true location. The other-other congruence (OOC) condition manipulated whether the two agents’ beliefs about the location of the ball were congruent or incongruent with one another. The self-other congruence (SOC) condition manipulated whether the target agent’s belief about the location of the ball was congruent or incongruent with the participant’s belief about the location of the ball.
The study comprised 18 repetitions of each experimental condition for trials and nine repetitions of each condition for fillers. This resulted in reaction time (RT) and accuracy data for 144 trials of interest and 72 fillers. The number of repetitions of each condition, the location of the ball, and the target agent were counterbalanced across the experiment. Participants completed four counterbalanced experimental blocks, each containing 54 trials. RTs indexed the time taken to respond following the onset of the response probe. Accuracy was indexed as identification of the correct cup, as required by the response probe. Omissions were treated as errors. Overall, the ToM experiment comprised 216 trials, equally split across four blocks (54 per block; block duration = 9 min; each trial = 10 s). Each block comprised 36 ToM trials and 18 fillers.
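The trial counts reported above follow directly from the design parameters. The sketch below reproduces that arithmetic; the variable names are illustrative, and every input value is taken from the text.

```python
# Design parameters as reported in the text.
N_CONDITIONS = 8      # 2 x 2 x 2 design cells
TOM_REPS = 18         # repetitions of each condition for ToM trials
FILLER_REPS = 9       # repetitions of each condition for filler trials
N_BLOCKS = 4
TRIAL_SECONDS = 10    # duration of one trial

tom_trials = N_CONDITIONS * TOM_REPS            # 144 trials of interest
filler_trials = N_CONDITIONS * FILLER_REPS      # 72 fillers
total_trials = tom_trials + filler_trials       # 216 trials overall

trials_per_block = total_trials // N_BLOCKS     # 54 trials per block
tom_per_block = tom_trials // N_BLOCKS          # 36 ToM trials per block
filler_per_block = filler_trials // N_BLOCKS    # 18 fillers per block
block_minutes = trials_per_block * TRIAL_SECONDS / 60  # 9.0 min per block

print(tom_trials, filler_trials, total_trials)
print(trials_per_block, tom_per_block, filler_per_block, block_minutes)
```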
ToM abilities: self-reported perspective-taking capacity
The Interpersonal Reactivity Index (IRI; Davis, 1980) comprises four self-report subscales: Perspective Taking, Fantasy, Empathic Concern, and Personal Distress. Although we administered the full scale to ensure reliability of the measure, we were primarily interested in data from the Perspective Taking subscale because this is said to be indicative of a participant’s (self-reported) proficiency with taking other people’s cognitive perspectives. This measure did not form part of any preregistered hypotheses but was used for exploratory analyses.
Neuropsychological testing
Participants’ executive functioning was evaluated using the Cambridge Neuropsychological Test Automated Battery (CANTAB Eclipse, Version 6; Cambridge Cognition; Fray et al., 1996). The test battery comprised the Motor Screening Task, used to familiarize participants with the CANTAB system; the Choice Reaction Time (CRT) task, a simple two-choice RT measure encompassing uncertainty; the Stop Signal Task (SST), used to measure response inhibition; the Attention Switching Task (AST), used to assess attention and cognitive flexibility; and the Spatial Working Memory (SWM) task, which measures retention and manipulation of visuospatial information. Data from these tasks did not inform any preregistered hypotheses but were collected for exploratory analyses and to describe the sample characteristics.
Screening
We asked all participants to complete the 10-item Autism Spectrum Quotient (AQ-10; Allison et al., 2012) to screen for suspected autism; a cutoff score of 7 was used to exclude participant data. In addition, older adults also completed a dementia screening using the Mini-Addenbrooke’s Cognitive Examination (Hsieh et al., 2015). Participants scoring 25 or less on the Mini-Addenbrooke’s Cognitive Examination were excluded from the final sample; this resulted in the exclusion of one older adult.
Procedure
First, participants completed a short training session (detailed in Section S4 in the Supplemental Material) followed by two 9-min blocks of the ToM task with a self-paced break between each. After completing the second block, participants completed the IRI followed by a 15-min enforced break. Next, executive functioning was evaluated using the CANTAB. The order of CANTAB testing was as follows: Motor Screening Task, Stop Signal Task, Attention Switching Task, Spatial Working Memory, and Choice Reaction Time task. The Stop Signal Task, Attention Switching Task, and Choice Reaction Time task required the use of a left/right response button box, whereas the Motor Screening Task and Spatial Working Memory task were completed using the CANTAB touch screen. After a further 15-min enforced break, participants completed Blocks 3 and 4 of the ToM task. Lastly, the AQ-10 was administered, and then older adults completed the Mini-Addenbrooke’s Cognitive Examination. The total duration of the session was approximately 3 hr, which allowed for numerous breaks. This time also permitted casual interaction and refreshments to reduce participants’ fatigue and increase engagement (for explicit tests showing no significant cross-group fatigue effects, see Section S5 in the Supplemental Material).
Statistical analysis
All confirmatory analyses were conducted in SPSS (Version 24) and JASP (Version 0.12.2; JASP Team, 2020). The raw data, summary data, and novel test materials used in this study can be downloaded from https://osf.io/m6rgw.
Our primary hypotheses were assessed by running a four-way mixed analysis of variance (ANOVA) on the ToM task data. Age group was entered as a between-subjects factor, and there were three within-subjects factors: KoR, OOC, and SOC. Table 2 outlines the preregistered hypotheses and statistical tests used to assess these.
Table 2. Preregistered Hypotheses, Predictions, and Associated Tests
Note: The table presents a summary of statements taken from the preregistration. The knowledge of reality (KoR) condition varied the presence of the participant’s explicit knowledge about reality (known vs. unknown). The other-other congruence (OOC) condition manipulated the congruence of two agents’ perspectives, resulting in a minimal-conflict (congruent) and maximal-conflict (incongruent) condition. The self-other congruence (SOC) condition manipulated the congruence between the participant’s and the target agent’s perspectives, resulting in a minimal-conflict (congruent) and maximal-conflict (incongruent) condition. ANOVA = analysis of variance.
Processing costs are inferred on the basis of increased reaction times and error rates.
Results
Sample characteristics
Older adults showed poorer performance across all neuropsychological measures; however, there was no statistically significant difference in self-reported perspective taking in the IRI (see Table 3). Because of equipment failure, no CRT data were acquired for one older adult.
Table 3. Descriptive Statistics for Younger and Older Adults
Note: Perspective Taking is a self-report measure from the Interpersonal Reactivity Index—the maximum score is 28 (a higher score indicates higher perspective-taking proficiency). The other measures are from the Cambridge Neuropsychological Test Automated Battery. For the Stop Signal Task, reaction time is shown. For the Attention Switching Task, the congruency cost is shown (higher values indicate greater difficulty with managing attentional conflict). For Spatial Working Memory, the strategy score is shown (higher scores represent less strategic performance). For Choice Reaction Time, mean motor-response latency is shown for correct responses.
**p ≤ .01. ***p ≤ .001.
ToM task analyses
False-belief task data preprocessing
Prior to statistical analysis, the data were preprocessed as described in the study preregistration (for a breakdown of trials removed, see Section S6 in the Supplemental Material). No participants scored below chance in the filler trials (< 31 based on a binomial probability distribution, p < .05), indicating that all participants were attending to the task and could therefore be included in the subsequent analyses. Next, only the ToM trials (not the filler trials) on which a correct response was given were retained for the RT analysis. Trials with a response latency of less than or equal to 5 ms were removed, which resulted in two trials being excluded (both from older adults in the KoR-reality-known/OOC-congruent/SOC-congruent condition). Finally, RTs that were beyond 2 standard deviations from each participant’s condition mean were removed (322 for younger adults, 309 for older adults; 631 in total). Trials with incorrect responses, including null responses (387 for younger adults, 678 for older adults; total = 1,065 trials, 7.4% of the overall data set), were excluded from the RT analysis and analyzed separately as the number of errors per condition. Altogether, 1,698 trials (11.8%) were removed prior to analysis: 709 for younger adults and 989 for older adults.
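Two of the preprocessing steps described above can be sketched in Python: the binomial criterion for above-chance filler performance and the 2-standard-deviation RT trim. This is a minimal sketch under stated assumptions, not the authors' procedure. The function names are hypothetical, the one-tailed form of the binomial test is an assumption, and the use of the sample standard deviation is an assumption; whether the first function returns exactly the reported cutoff depends on such details, which the text does not specify.

```python
from math import comb
from statistics import mean, stdev


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def above_chance_cutoff(n, p, alpha=0.05):
    """Smallest score whose one-tailed probability under guessing is below alpha."""
    for k in range(n + 1):
        if upper_tail(k, n, p) < alpha:
            return k
    return None


def trim_rts(rts, n_sd=2.0):
    """Keep RTs within n_sd sample standard deviations of the cell mean.

    `rts` holds the correct-trial RTs for one participant in one condition.
    """
    if len(rts) < 2:
        return list(rts)
    cell_mean, cell_sd = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - cell_mean) <= n_sd * cell_sd]


# 72 filler trials with three response options, so guessing succeeds with p = 1/3.
cutoff = above_chance_cutoff(72, 1 / 3)
print("above-chance cutoff:", cutoff)

# Hypothetical RTs (ms) for one participant-by-condition cell, with one slow outlier.
example_rts = [900, 920, 940, 960, 980, 1000, 1020, 1040, 1060, 1080, 5000]
print(trim_rts(example_rts))
```

Trimming within each participant-by-condition cell, rather than across the whole sample, keeps a generally slower participant or a generally harder condition from being penalized as a set of outliers.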
False-belief task RT and error-rate analyses
Our primary hypotheses were tested using a series of factorial analyses conducted on the RT and accuracy data. The condition-mean RTs (see Fig. 2a) were entered into a four-way mixed ANOVA with age (younger vs. older) as a between-subjects factor and three within-subjects factors: KoR (reality unknown vs. reality known); OOC (agents’ beliefs congruent vs. agents’ beliefs incongruent); and SOC (agent’s-participant’s beliefs congruent vs. agent’s-participant’s beliefs incongruent). Similarly, a four-way mixed ANOVA was conducted on the error-rate data (see Fig. 2b). To be consistent with our preregistered design and analysis protocol, we focus on the three repeated measures main effects (KoR, OOC, SOC), the interaction between KoR and SOC, and the relationship between age group and our primary within-subjects manipulations. However, for transparency, all results are reported in Tables 4 and 5.

Fig. 2. Mean reaction time (RT) latency (a) and mean percentage of incorrect responses (accuracy; b), separately for each condition. The eight conditions were created from three within-subjects factors: KoR (reality unknown [KoR 0] vs. reality known [KoR 1]); OOC (agents’ beliefs congruent [OOC 0] vs. agents’ beliefs incongruent [OOC 1]); and SOC (agent’s-participant’s beliefs congruent [SOC 0] vs. agent’s-participant’s beliefs incongruent [SOC 1]). Error bars represent ±2 SEM.
Table 4. Mixed Analysis of Variance (ANOVA) Results for Reaction Time (RT) and Error Rate
Note: KoR = knowledge of reality; SOC = self-other congruence; OOC = other-other congruence.
Table 5. Significant Post Hoc Comparisons for Analyses of Reaction Time (RT) and Error Rate
Note: The direction of significant effects is shown only for simple main effects where the pairwise difference was statistically significant (p < .05, Bonferroni corrected). Statistically significant three-way interactions were probed using further repeated measures analyses of variance as outlined in the text. KoR = knowledge of reality; SOC = self-other congruence; OOC = other-other congruence.
*p < .05. **p < .01. ***p < .001.
Which factors are responsible for variation in the difficulty of reasoning about the beliefs of other individuals generally?
Knowledge of reality
On the basis of the theory that one’s own knowledge can cause interference—a curse of knowledge (Birch & Bloom, 2004, 2007)—we predicted longer latencies and more errors when reality was known than when it was unknown (Hypothesis 1). To assess this, we tested the within-subjects main effect of KoR (prediction: reality known > reality unknown). As detailed in Tables 4 and 5, the effect of KoR was statistically significant; however, contrary to our preregistered hypothesis, results showed that participants were slower when they did not know reality, although there was no statistically significant effect of KoR on accuracy.
Incongruent self-other perspectives
Prior research suggests that, when self-other perspectives differ, one’s own perspective may interfere and would thus need to be inhibited (Hartwright et al., 2015; Samson et al., 2005, 2015). On this basis, we predicted that ToM reasoning would be more effortful when self and other knowledge states were incongruent (Hypothesis 2). To assess this, we tested the within-subjects effect of SOC (prediction: SOC incongruent > SOC congruent). In line with our preregistered hypothesis, results showed that participants were slower and made more errors when self-other perspectives were incongruent than when they were congruent (see Tables 4 and 5).
Salient, conflicting knowledge
Prior research shows that false-belief reasoning is more effortful than true-belief reasoning (Apperly et al., 2008, 2011), although the basis of this competition is unclear: Is it because of a privileged KoR—a curse of knowledge—or the mismatch between self-other perspectives? We tested the prediction that one’s own KoR would interfere when self-other perspectives differed by assessing the two-way interaction between SOC and KoR (Hypothesis 3). We predicted that a main effect of SOC would be qualified by an interaction with KoR; specifically, error rates and response latencies would be increased when self-other knowledge states were incongruent and reality was known.
Contrary to the curse-of-knowledge theory, the expected two-way interaction was not supported in RT. There was, however, a significant interaction in accuracy between KoR and SOC (see Tables 4 and 5). When reality was unknown, participants made more errors when representing incongruent than congruent beliefs, and when self-other knowledge states were incongruent, participants made more errors when reality was unknown than when it was known. These two-way interactions were qualified by a three-way interaction between KoR, SOC, and age group in RT and accuracy, which lends support to the curse of knowledge, albeit in a more complex way (see the “KoR, False-Belief Reasoning, and Curse of Knowledge” section).
How does attentional cuing contribute to performance costs on false-belief tasks?
Responding correctly in a false-belief paradigm typically requires a participant to shift attention between competing locations while remembering which outcome maps onto which location. This effortful shifting between competing cued locations is a typical feature of false-belief paradigms, although, unlike incongruence between self- and other perspectives, it is not an essential feature of false-belief problems. We therefore developed a condition that required the participant to keep in mind, and shift between, two locations without the need to represent the target agent’s false belief (OOC-incongruent condition). On the basis of the selection-processing theory of false-belief reasoning (Friedman & Leslie, 2005) and work showing that incongruent self-other perspectives create conflict distinct from other sources of conflict within ToM tasks (Hartwright et al., 2015; Samson et al., 2005, 2015), we hypothesized that there would be greater cognitive effort associated with holding in mind a competing self-other perspective (a false belief; SOC-incongruent condition) than with managing alternate cued locations (OOC-incongruent condition; Hypothesis 4). To assess this hypothesis, we compared performance in these two specific conditions within our ToM task (see Fig. 3). In line with our predictions, results of a paired-samples t test revealed that a greater RT cost, around 32 ms, was observed when participants represented a false belief (KoR-reality-known/OOC-congruent/SOC-incongruent condition; M = 978.53 ms) compared with when there was conflict from managing alternate locations (KoR-reality-known/OOC-incongruent/SOC-congruent condition; M = 946.88 ms), t(99) = 2.61, p = .010, d = 0.083. 
However, almost 5% more errors were made when participants managed alternate locations (KoR-reality-known/OOC-incongruent/SOC-congruent condition; M = 9.72%) than when they managed a false belief (KoR-reality-known/OOC-congruent/SOC-incongruent condition; M = 5.00%), t(99) = 4.23, p < .001, d = 0.415.
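For readers who wish to run this kind of comparison on their own data, a paired-samples t test can be sketched in a few lines of Python. This is an illustration only: the RT values below are hypothetical and are not the study data, and the function names are invented for the example. Two standardized effect sizes are shown because the text does not state which formula produced the reported d values.

```python
import math
import statistics


def paired_t(x, y):
    """Return (t, df) for a paired-samples t test on two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


def cohens_d_z(x, y):
    """Mean paired difference divided by the SD of the paired differences."""
    diffs = [a - b for a, b in zip(x, y)]
    return statistics.fmean(diffs) / statistics.stdev(diffs)


def cohens_d_av(x, y):
    """Mean difference divided by the root mean of the two condition variances."""
    pooled_sd = math.sqrt((statistics.variance(x) + statistics.variance(y)) / 2)
    return (statistics.fmean(x) - statistics.fmean(y)) / pooled_sd


# Hypothetical per-participant mean RTs (ms) in two within-subjects conditions.
false_belief_rt = [980, 1010, 950, 1005, 990]
location_rt = [950, 1000, 930, 960, 985]

t, df = paired_t(false_belief_rt, location_rt)
print(f"t({df}) = {t:.2f}")
print(f"d_z = {cohens_d_z(false_belief_rt, location_rt):.3f}")
print(f"d_av = {cohens_d_av(false_belief_rt, location_rt):.3f}")
```

The two effect sizes can differ considerably in repeated measures designs, because d_z depends on how strongly the two conditions correlate within participants, whereas d_av does not.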

Fig. 3. Schematic illustration of (a) the key events in an example false-belief trial, (b) the key events in an example conflicting other-other perspectives trial, and (c) which locations were cued in each step of the event sequence. In the classic false-belief condition (self-other congruence [SOC] incongruent), two locations are cued (where the ball really is and where the agent falsely believes the ball is), creating incongruence in self-other perspectives. In the alternative condition (other-other congruence [OOC] incongruent), two competing locations are cued by two agents with competing perspectives, but unlike the original false belief, the target agent’s and participant’s beliefs are congruent. A typical event sequence might proceed as follows: (1) Cups are shuffled, (2) the reality status of one cup is indicated, (3) the red agent indicates belief status regarding one cup, (4) the blue agent indicates belief status regarding one cup, and (5) the target for participant representation is indicated as the blue agent. (Note that the order of presentation of Event IDs 2, 3, and 4 was counterbalanced and randomized across the full experiment.) The gray shaded area highlights the critical components of a trial that were manipulated to generate either a false belief (a) or a conflicting other-other perspective (b). The locations cued at each step of the event sequence are shown in (c) to illustrate the number of unique locations cued.
Which factors are responsible for variation in the difficulty of reasoning about the beliefs in healthy aging?
Age-related differences in response speed and accuracy
We predicted that older adults would generally have longer RTs and increased error rates compared with younger adults (Hypothesis 5), which we tested via the between-subjects main effect of age group. This hypothesis was supported; older adults generally provided slower, less accurate responses than younger adults (see Tables 4 and 5).
How do different sources of conflict differentially affect older-adult performance in false-belief tasks?
We predicted that older adults would show greater difficulty than younger adults with self-perspective inhibition, handling incongruence between beliefs, and managing conflict from multiple cued locations (Hypothesis 6). These predictions were tested by assessing the two-way interactions between age group and each of the within-subjects factors: KoR, SOC, and OOC. Contrary to our expectations, however, there were no statistically significant two-way interactions between KoR and age group in RT or error rate or between SOC and age group in RT. Age did, nonetheless, interact with SOC in accuracy and with OOC in both RT and accuracy. As detailed in Tables 4 and 5, simple-effects analyses demonstrated that accuracy was affected in both groups by manipulating other-other perspectives. Notably, the effect was almost doubled in older adults, suggesting greater cognitive burden of OOC on older adults (mean difference between OOC-congruent and OOC-incongruent conditions: younger adults = 3.81%; older adults = 6.61%). Indeed, older adults made 5.44% more errors than younger adults when responding to a conflicting OOC. Further, only older adults showed an effect of self-other perspectives on accuracy (SOC congruent < SOC incongruent; mean difference = 4.33%), suggesting that the main effect of SOC on accuracy was driven by older-adult performance. Just as with OOC, older adults made more errors than younger adults when responding to incongruent SOC perspectives (mean difference between SOC-incongruent and SOC-congruent conditions: younger adults = 6.03%). Regarding RT, both age groups slowed significantly to resolve an incongruent versus congruent other-other perspective (OOC congruent < OOC incongruent); however, contrary to the effect of OOC on accuracy, the effect of OOC on RT was more marked in younger adults than older adults (mean difference between OOC-congruent and OOC-incongruent conditions: younger adults = 71.61 ms; older adults = 24.83 ms).
There were several interaction effects with age group that we had not predicted and that should therefore be considered exploratory. As detailed in Tables 4 and 5, there was a significant three-way interaction between age, OOC, and SOC in RT. Two separate repeated measures ANOVAs, one per age group, indicated that the interaction between OOC and SOC was statistically significant in younger adults, F(1, 49) = 41.38, p < .001, ηp² = .458, but not older adults, F(1, 49) = 1.95, p = .169. In both groups, incongruent self-other perspectives and other-other perspectives were completed more slowly than congruent perspectives. However, pairwise comparisons indicated that younger adults’ RTs were more affected by SOC when two agents’ perspectives were congruent than when they were incongruent and by the congruency of other-other perspectives when self-other perspectives were congruent than when they were incongruent (see Table 6).
Table 6. Pairwise Comparisons of Reaction Times in Younger Adults
Note: Reaction times are given in milliseconds. All p values are Bonferroni corrected.
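The within-group interaction tests reported above each have one numerator degree of freedom. In a 2 × 2 fully within-subjects design, such an interaction F equals the square of a one-sample t statistic computed on each participant's difference of differences. The following Python sketch illustrates that equivalence under stated assumptions: the cell means are hypothetical, the tuple layout and function names are invented for the example, and this is not the authors' analysis code.

```python
import math
import statistics


def interaction_contrast(cells):
    """Difference of differences for one participant.

    cells = (ooc_con_soc_con, ooc_con_soc_inc, ooc_inc_soc_con, ooc_inc_soc_inc),
    i.e. the SOC effect under OOC-congruent minus the SOC effect under
    OOC-incongruent.
    """
    cc, ci, ic, ii = cells
    return (ci - cc) - (ii - ic)


def interaction_f(participants):
    """Return (F, df1, df2) for the 2 x 2 within-subjects interaction."""
    contrasts = [interaction_contrast(p) for p in participants]
    n = len(contrasts)
    if n < 2:
        raise ValueError("need at least two participants")
    se = statistics.stdev(contrasts) / math.sqrt(n)
    t = statistics.fmean(contrasts) / se  # one-sample t against zero
    return t ** 2, 1, n - 1


# Hypothetical mean RTs (ms) for four participants.
rt_cells = [
    (900, 950, 960, 1000),
    (800, 870, 880, 930),
    (1000, 1080, 1090, 1140),
    (950, 1000, 1010, 1040),
]

f_value, df1, df2 = interaction_f(rt_cells)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

A positive mean contrast in this layout indicates a larger SOC effect when other-other perspectives are congruent, which is the direction described for younger adults.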
There was also a three-way interaction between KoR, SOC, and age group in RT and accuracy. These interactions were interrogated using four more two-way repeated measures ANOVAs and pairwise comparisons (see Fig. 4), one for each age group, separately for RT and accuracy. The interaction between KoR and SOC in younger adults was statistically significant in RT, F(1, 49) = 15.45, p < .001, ηp² = .240, but not in younger adults’ accuracy, F(1, 49) = 0.04, p = .836, ηp² = .001. Conversely, the interaction between KoR and SOC was nonsignificant in older adults’ RTs, F(1, 49) = 0.00, p = .949, ηp² = .000, but significant in older adults’ accuracy, F(1, 49) = 9.07, p = .004, ηp² = .156. The interaction effect in younger adults’ RT was due to a more marked slowing when managing a curse of knowledge (mean difference between SOC-incongruent and SOC-congruent conditions: KoR-reality-unknown condition = 73.87 ms; KoR-reality-known condition = 124.52 ms; Fig. 4a). Interestingly, both age groups slowed to manage the curse of knowledge (KoR-reality-known condition: difference between SOC-incongruent and SOC-congruent conditions; see Figs. 4a and 4b), and both maintained within-group accuracy levels between true belief and false belief when reality was known (see Figs. 4c and 4d). However, specific to older adults, a substantial cost to accuracy was associated with representing an incongruent as opposed to a congruent belief when reality was unknown (mean difference between SOC-congruent and SOC-incongruent conditions = 6.67%; see Fig. 4d), suggesting that the two-way interaction between SOC and age was driven by older adults’ poor performance in the reality-unknown/false-belief condition. Indeed, both age groups took longest to resolve this reality-unknown false belief, indicating substantial cognitive demand.

Fig. 4. Estimated marginal mean reaction time (RT; a and b) and percentage of incorrect responses (accuracy; c and d) in the self-other and other-other congruence conditions, separately for younger and older adults. Asterisks indicate statistically significant pairwise comparisons (*p < .05, ***p ≤ .001). Error bars represent ±2 SE.
Additional exploratory analyses
KoR, false-belief reasoning, and curse of knowledge
The three-way interaction between KoR, SOC, and age group suggested that our initial interpretation of a curse of knowledge should be revised. When they knew the ball’s location, participants were slower to respond if the agent held a false belief (SOC-incongruent condition) rather than a true belief (SOC-congruent condition), which is consistent with a curse of knowledge. However, as shown in Figure 4, participants were slowest and most error prone overall when reasoning about an agent with a false belief when reality was unknown. Participants experienced the greatest difficulty when representing an agent who falsely believed the ball was at a location it was clearly not, which seems counterintuitive to the curse-of-knowledge hypothesis. To further explore this, we theorized that belief representation in the KoR-reality-unknown condition could pose additional difficulty because the empty location—the cup labeled with an X—should be avoided, which would require additional selection and control processes. Leslie and colleagues (2005) proposed that in the classic object-transfer false-belief task, it is implicit that the target agent wants to find the object. With the present task, regardless of whether the target agent had no awareness of the contents of any of the three locations—as in a typical false-belief task—here, too, it was implicit that the target agent would want to avoid the empty location. When told of an empty location, the participants were bestowed with privileged knowledge of where the ball definitely was not. With the current paradigm, we therefore effectively had two false-belief conditions: the classic false belief, as seen in the original object-transfer task (KoR-reality-known/OOC-congruent/SOC-incongruent condition), and a novel, reality-unknown false belief, because the target agent thought that the ball was somewhere the participant knew for certain it was not (KoR-reality-unknown/OOC-congruent/SOC-incongruent condition). 
Should the participant need to inhibit their knowledge of the location to be avoided, it would be reasonable to expect greater cognitive costs associated with the novel, reality-unknown false-belief condition (KoR-reality-unknown/OOC-congruent/SOC-incongruent condition) than with the classic false-belief condition (KoR-reality-known/OOC-congruent/SOC-incongruent condition). To test this, we conducted two further paired-samples t tests, taking the data from all participants. Our assertion was supported in RTs (classic false belief: M = 978.53 ms; novel false belief: M = 1,033.99 ms; mean difference = 55.46 ms), t(99) = 5.20, p < .001, d = 0.141. Although the difference in accuracy was not statistically significant, its direction was consistent with the RT data, ruling out a speed/accuracy trade-off (classic false belief: M = 5.00% errors; novel false belief: M = 6.45% errors; mean difference = 1.45%), t(99) = 1.83, p = .070. We propose that this pattern could be indicative of a cognitively effortful double inhibition (Leslie et al., 2005).
ToM, aging, and executive function
Correlation analyses suggested that RTs in all conditions were significantly positively correlated with individual differences in two-choice motor-response time (CANTAB CRT; rs = .619–.682), inhibitory control (CANTAB SST; rs = .354–.395), and spatial working memory (CANTAB SWM; rs = .325–.417). No RTs were significantly correlated with attentional capacity (CANTAB AST; rs = .113–.168) or self-reported ToM (IRI Perspective Taking; rs = −.067 to −.148). For error rate, only motor-response time was significantly correlated with all conditions (CANTAB CRT; rs = .243–.434). All correlation coefficients are reported in Section S7 in the Supplemental Material.
To assess which aspects of executive functioning explain the magnitude of conflict introduced within each experimental factor, we derived a cost factor for each of the three factors: KoR, OOC, and SOC. For each factor, we collapsed across the other task conditions and subtracted the conditions with presupposed low levels of conflict (KoR-reality-unknown, OOC-congruent, and SOC-congruent conditions) from those with presupposed high levels of conflict (KoR-reality-known, OOC-incongruent, and SOC-incongruent conditions). This was done separately for RT and for accuracy, giving six cost factors per participant. Each cost factor is described in Section S8 of the Supplemental Material.
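The derivation of a cost factor can be sketched as follows. This is a minimal Python illustration, not the authors' scoring script: it assumes that each participant contributes one mean score per cell of the 2 × 2 × 2 design and that a cost is scored as the high-conflict mean minus the low-conflict mean, which is an assumed sign convention (Section S8 of the Supplemental Material gives the authors' definitions). The level labels, data, and function name are hypothetical.

```python
from itertools import product
from statistics import fmean

FACTORS = ("KoR", "OOC", "SOC")
# Assumed level labels: presupposed high- and low-conflict level per factor.
HIGH = {"KoR": "known", "OOC": "incongruent", "SOC": "incongruent"}
LOW = {"KoR": "unknown", "OOC": "congruent", "SOC": "congruent"}


def cost_factors(cell_means):
    """Return one cost per factor for a single participant.

    cell_means maps (kor, ooc, soc) level tuples to that participant's mean
    score (RT in ms or error rate in %). Each cost collapses across the
    other two factors.
    """
    costs = {}
    for i, factor in enumerate(FACTORS):
        high = fmean([v for k, v in cell_means.items() if k[i] == HIGH[factor]])
        low = fmean([v for k, v in cell_means.items() if k[i] == LOW[factor]])
        costs[factor] = high - low
    return costs


# Hypothetical participant with additive effects of 20, 50, and 80 ms.
example_cells = {}
for kor, ooc, soc in product(("known", "unknown"),
                             ("congruent", "incongruent"),
                             ("congruent", "incongruent")):
    example_cells[(kor, ooc, soc)] = (
        900
        + (20 if kor == "known" else 0)
        + (50 if ooc == "incongruent" else 0)
        + (80 if soc == "incongruent" else 0)
    )

print(cost_factors(example_cells))
```

Running the function once on RT cell means and once on error-rate cell means would yield the six values per participant that the text describes.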
We ran six separate stepwise multiple regression analyses to predict each of the cost factors. The data met the assumptions for multicollinearity, homoscedasticity, linearity, and autocorrelation in residuals (based on a Durbin-Watson statistic of ~2), and the error terms were normally distributed. The older adults exhibited more extreme cost-factor values; however, all data were included: All participants had passed screenings for dementia and autism, all included ToM data met the performance criteria specified in the preregistration, and the cost-factor measures reflect a summary of an individual’s repeated, consistent behavioral performance over numerous trials. Consequently, we considered that all cost-factor values were representative of typical task performance within a continuum of variability. Each regression analysis included six predictors: age group, the self-reported Perspective Taking measure from the IRI, and the four neuropsychological measures from the CANTAB (SST, AST, SWM, and CRT; see Table 7).
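The Durbin-Watson statistic mentioned above is the sum of squared differences between successive residuals divided by the sum of squared residuals. It ranges from 0 to 4, and values near 2 indicate little first-order autocorrelation. A minimal Python sketch follows; the residuals are hypothetical and the function name is illustrative.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for an ordered sequence of regression residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


# Hypothetical residuals from one fitted regression model.
example_residuals = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2]
print(round(durbin_watson(example_residuals), 2))
```

Values well below 2 suggest positive autocorrelation and values well above 2 suggest negative autocorrelation, so the check is usually read alongside a residual plot rather than in isolation.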
Table 7. Multiple Regression Results for Reaction Time (RT) and Error-Rate Cost Factors
Note: The table presents results from the most predictive model, as determined by six separate stepwise multiple regression analyses (RT and error rate for each cost factor). No statistically significant predictive models were identified for knowledge of reality. Age group was coded as a dichotomous categorical variable. Perspective Taking is a self-report measure from the Interpersonal Reactivity Index (IRI). The Attention Switching Task (AST), Spatial Working Memory (SWM) task, and Choice Reaction Time (CRT) measures are from the Cambridge Neuropsychological Test Automated Battery (CANTAB).
There were no significant predictors of the KoR cost factor in either RT or accuracy. Consistent with the earlier analyses, the results in Table 7 show that age group was a significant predictor of the OOC cost-factor RT, explaining 7% of the variation, where higher costs to RT were associated with lower age. Further, reduced self-reported ToM (IRI Perspective Taking) and less efficient use of SWM explained around 10% of variation in the OOC cost-factor error rate, and 4% of the increased cost introduced to the SOC cost-factor RT was associated with greater difficulty managing attentional conflict (AST). Moreover, longer baseline RT (CRT) and poorer ToM proficiency could explain 14% of the increased error rate introduced by varying self-other perspectives.
Discussion
Prior research has produced conflicting findings regarding whether ToM declines in healthy aging (Henry et al., 2013; Love et al., 2015; Phillips et al., 2011). To unpack this in the present study, we assessed the role of three theoretically derived sources of conflict: privileged outcome knowledge (KoR), congruence of self-other perspectives (SOC), and competing cued locations (OOC). By assessing a series of preregistered hypotheses, this study highlights two important findings.
Competing self-other perspectives and competing cued locations tap different cognitive mechanisms, which affect younger adults and older adults differently
We predicted that conflicting perspectives would be effortful (Apperly et al., 2008, 2011), particularly for older adults (Bottiroli et al., 2016; German & Hehman, 2006), and that managing incongruent self-other perspectives would be more effortful than competing cued locations (Hartwright et al., 2015; Samson et al., 2005, 2015). Our data partially support these predictions, but the findings were more nuanced than expected. Unexpectedly, both groups showed similar slowing to resolve a competing self-other perspective, and both showed comparable error patterns when resolving the classic false-belief scenario. Conversely, although faster overall, younger adults slowed more than older adults to manage invalid cuing, which resulted in their committing substantially fewer errors than older adults. This might first appear to reflect a speed/accuracy trade-off; however, because these behaviors differ through the type of conflict, the data are more consistent with meaningful processing differences between groups. Such speed/accuracy trade-off differences have been shown to have a neurological rather than a strategic basis (Forstmann et al., 2011). Indeed, in addition to divergent behavioral profiles, the cognitive systems coopted to resolve each source of conflict were condition specific: Managing competing self-other perspectives was supported by attentional systems, whereas invalid cuing drew from spatial working memory. Considering this pattern of differences, our work suggests that different mechanisms manage these two sources of conflict. This is consistent with neuroimaging data showing that representing a false belief is functionally distinct from the attentional demands that arise from cuing behaviorally relevant spatial locations (Mars et al., 2012; Scholz et al., 2009; Young et al., 2010).
Our work uniquely shows, however, that because of these different mechanisms, the nature of conflict in ToM differentially affects speed-accuracy response behaviors across age groups. This can explain the appearance of poorer ToM in older adults. Conflicting perspectives were resolved similarly, whereas older adults were less reactive to invalid cuing, resulting in proportionately more mistakes. Our data indicate that individual differences in attentional capacity best explained RT performance when participants managed competing self-other perspectives and that errors reflected limitations in executive functioning and motor-response speed rather than an age-related decline in ToM proficiency per se. Our work highlights that older adults were more susceptible than younger adults to irrelevant cues in a false-belief context and that unnecessary demands on working memory, through the cuing of invalid locations, disproportionately affected older-adult performance. Given the association identified between age, cuing, and working memory, limits on processing speed, which declines with age, may explain this pattern of behavior (Brown et al., 2012; German & Hehman, 2006; Salthouse, 1996). Critically, therefore, prior reviews and meta-analyses of ToM performance in aging should be carefully interpreted: Studies in which such cuing occurs may inflate age-related changes in ToM capacity because of incidental task demands that disadvantage older adults.
Interference from KoR is affected not by age but by the perceived higher order intentions of the other
We hypothesized that KoR could interfere with one’s ability to reason because of bias toward one’s own self-perspective—a curse of knowledge (Birch & Bloom, 2004, 2007)—and that older adults would be disproportionately affected by KoR because prior work suggests that the curse of knowledge is more pronounced in later life (Bernstein, Erdfelder, et al., 2011). However, these predictions were not realized: Participants were faster when they knew where the ball was, and both age groups performed comparably, regardless of KoR. To explore these unexpected findings, we considered studies of belief-desire reasoning, where the target agent may wish to avoid the given target object. Increased difficulty is associated with processing false as opposed to true beliefs and avoid as opposed to approach desires, where a false belief combined with an avoidance desire attracts maximal processing costs (see Apperly et al., 2011; German & Hehman, 2006; Hartwright et al., 2012; Leslie et al., 2005). Mentalizing about an agent with a false belief regarding an empty location would be doubly effortful because participants must inhibit their knowledge that the agent must avoid the location they (falsely) believe to be true (Leslie et al., 2005). Our exploratory analysis supported this assertion: Participants took longer to resolve a reality-unknown false belief compared with the classic false belief, suggesting that KoR itself is not the cause of the curse of knowledge. Instead, participants’ initial internal reference toward the agent’s desire created conflict, in our case, resulting in redirection away from the empty cup. This finding is consistent with work showing that we automatically anticipate that other people’s behavior will fulfill, rather than conflict with, their desires (Ferguson & Breheny, 2011), which suggests that the curse of knowledge is mediated by a perception of the agent’s higher order intentions.
Conclusion
The present study suggests that false-belief reasoning is effortful for older adults beyond the nonsocial cognitive demands of classic ToM investigations. Performance in each of our ToM scenarios paralleled individual differences in inhibitory control and spatial working memory. However, the magnitude of conflict experienced and the cognitive systems coopted to resolve this were condition specific: Managing competing cognitive perspectives was supported by attentional systems, whereas invalid cuing appeared to draw on spatial working memory. Further, older adults were particularly disadvantaged by invalid cuing. This indicates that prior studies may have overestimated the effects of aging on ToM and highlights the need for carefully managing conflict in future studies of aging and ToM.
Supplemental Material
sj-docx-1-pss-10.1177_09567976211017870 – Supplemental material for Sources of Cognitive Conflict and Their Relevance to Theory-of-Mind Proficiency in Healthy Aging: A Preregistered Study
Supplemental material, sj-docx-1-pss-10.1177_09567976211017870 for Sources of Cognitive Conflict and Their Relevance to Theory-of-Mind Proficiency in Healthy Aging: A Preregistered Study by Foyzul Rahman, Klaus Kessler, Ian A. Apperly, Peter C. Hansen, Sabrina Javed, Carol A. Holland and Charlotte E. Hartwright in Psychological Science
Footnotes
Acknowledgements
We thank the research participants, and we also thank Evelyn Egyir, Elizabeth Essex, and Maria Nazakat for collecting the data. S. Javed, C. A. Holland, and K. Kessler participated in the present research while at Aston University. S. Javed is now at the University of Birmingham, C. A. Holland is now at Lancaster University, and K. Kessler is now at the University of Dublin.
Transparency
Action Editor: M. Natasha Rajah
Editor: Patricia J. Bauer
Author Contributions
F. Rahman, C. E. Hartwright, K. Kessler, I. A. Apperly, P. C. Hansen, and C. A. Holland developed the research concept. F. Rahman and C. E. Hartwright designed the experiment. F. Rahman and S. Javed collected the data. F. Rahman, C. E. Hartwright, and S. Javed analyzed the data. F. Rahman and C. E. Hartwright wrote the manuscript. All the authors approved the final manuscript for submission.