Abstract
Successful communication requires speakers and listeners to refer to information in their common ground. Shared history is one of the bases for common ground, as information from a communicative episode in the past can be referred to in future communication. However, to draw upon shared history, communicative partners need to have an accurate memory record that they can refer to. The memory mechanism for shared history is poorly understood. The current study investigated the ways in which memory for shared history is prioritised. Two experiments presented a referential communication task followed by a surprise recognition memory task, with the former task serving as an episode of shared history. Experiment 1 revealed superior memory for information that was both seen in the communicators’ common ground and referred to, followed by information that was seen but not referred to, and finally by information privileged to the participants. Experiment 2 provided a replication of Experiment 1 and further demonstrated that these co-presence effects are not dependent on the presence of a speaker with a different perspective to the participant.
Introduction
“Remember the time when the school bus broke down, and we all had to walk to school?” Human communicators frequently reminisce about past episodes of social interaction they shared with others. In order for such a reference to the past to be understood by both parties, a speaker needs to have encoded and retrieved the correct pairing between the event and the persons involved. If the speaker misremembered the persons present and the current communicative partner was not present when the school bus broke down, then the reference to a historic episode cannot serve as common ground for the current communication.
Common ground is the information shared between communicative parties. Successful communication critically depends on speakers and listeners referring to information in their common ground (Clark, 1992; Clark & Marshall, 1981). In contrast to common ground, privileged ground is information only known to one of the communicative parties. To correctly identify whether a piece of information is in the common ground shared with our communicative partner, we have to take their perspective into account (e.g. did they experience the school bus breakdown? Did they even know about the breakdown?). Failure to do so could lead us to commit “egocentric” errors by drawing on information that is privileged to ourselves but not available to our communicative partner.
According to communication theory (Clark & Marshall, 1981), there are three primary bases for common ground: physical co-presence, linguistic co-presence, and cultural co-presence/community membership. Physical co-presence refers to the environment shared by both communicative parties, such as shared visual access to the same set of objects (i.e. shared visual perspective). Linguistic co-presence refers to the conversational content that the communicative parties shared, either from a present or past communicative episode. Cultural co-presence/community membership refers to the shared sociocultural background between communicative parties and the common knowledge within the sociocultural community. The common ground we once shared with a communicative partner becomes shared history, which is a mental record that could be referred to in subsequent communications. Shared history could be formed on the bases of prior physical co-presence or prior linguistic co-presence. Objects that were once in the shared view (prior physical co-presence) and/or spoken about (prior linguistic co-presence) could be referred to in subsequent communication. The ability to keep track of shared history and to use it as common ground in present and/or future communications provides richness and continuity in our interactions with others.
Evidence suggests that communicators are sensitive to shared history. Most of the studies to date have focused on historic linguistic information. Brennan and Clark (1996) showed that once communicative parties agreed on a set of referring expressions for specific referents, they would continue to use the agreed expressions as if they had made a conceptual pact. Horton and Gerrig (2002) showed that speakers are sensitive to the distinction between information that had been previously shared versus not shared. Both studies demonstrated speakers’ sensitivity to the explicitly spoken contents from historic communicative episodes they shared with specific listeners. Interestingly, shared history not only retains records of information that has been spoken about, but it also keeps track of perceptual features such as the voices in which names were frequently uttered. Barr et al. (2014) showed that a proper name spoken by a voice that has been frequently associated with the name is recognised quicker than an unfamiliar voice (e.g. my brother-in-law’s name uttered by my sister’s voice would be recognised quicker than the same name uttered by a stranger’s voice). It is evident that shared history contains a record of historic linguistic information, which was an explicit part of the discourse between co-present communicators. However, at present little is known about the effect of co-presence on memory records for shared history. The current study set out to address the following questions: how does the memory system prioritise the information it encodes and retains for shared history? Do we remember information in the common ground better than that in the privileged ground because it is shared with others? Or is information privileged to us more salient to ourselves and is hence better remembered?
The literature on referential communication suggests that there are at least three possible outcomes: First, previous findings of high rates of egocentric errors in referential communication (e.g. Apperly et al., 2010; Keysar et al., 2003; Wang et al., 2016, 2019, 2020) may reflect the ways in which communicators encode a communicative episode: from their own perspective. This is in line with an anchoring and adjustment hypothesis (e.g. Epley et al., 2004), which suggests that perspective-taking is anchored on one’s own perspective, followed by a gradual and usually insufficient adjustment towards others’ perspectives. If an egocentric perspective is employed at least at both the start and end of referential communication, then it is conceivable for information in the privileged ground to be better remembered than that in the common ground. Second, according to Clark and Marshall’s (1981) communication theory, communicative parties code the specific instances of co-presence of their communicative partners and the respective common ground. Therefore, one should expect common ground information to be prioritised and retained in shared history. In addition, several studies have demonstrated listeners’ preference for attending physically co-present information at early stages of referential communication (e.g. Barr, 2008; Hanna et al., 2003; Nadig & Sedivy, 2002). It is possible—but has not been demonstrated—that time spent attending to physically co-present information at early stages of communication later translates to better memory records for common ground. Finally, Horton and Gerrig (2005, 2016) propose a low-level domain-general memory mechanism that does not specifically code the co-presence of communicators and contents in their common ground. Instead, they suggest that we likely operate with a parsimonious memory mechanism that retains episodic traces in a similar way to non-communicative episodes. 
It is therefore possible for the memory record for shared history to show no specific preference for common ground or privileged ground.
The current study examined memory records for information in the common ground versus the privileged ground from a communicative episode in the past. We employed a commonly used referential communication task, also known as the director task (e.g. Apperly et al., 2010; Keysar et al., 2003; Wang et al., 2016, 2019), to serve as an episode of communication. During the director task, a director takes on the role of a speaker, while a participant plays a listener. The director delivers spoken instructions to the participants, who are required to identify objects referred to by the director. The objects that are being spoken about are positioned on a shelf, and the director and the participants are situated on opposite sides of the shelf, giving them differing views of it. In order for participants to identify the correct referent, they have to take the director’s perspective into account while following her instructions.
The director task was followed by a surprise memory task in which participants were required to make recognition judgements about objects that they had seen in the director task. Participants were not forewarned about the memory task as we aimed to capture a naturally occurring memory record that resembled the ways in which shared history might be encoded in real life. The objects seen in the director task bore one of three types of ground status: 1. Visual + linguistic common ground objects (CGV + L) were seen by both the director and the participants, and referred to. 2. Visual common ground objects (CGvis) were seen by both the director and the participants but not referred to. 3. Privileged ground objects (PG) were only seen by the participants, not the director (see Figure 1). All objects in the open slots were seen by both the director and the participant; therefore, they were in the common ground via physical co-presence. A subset of the objects in the common ground were referred to, and were therefore both physically and linguistically co-present. In contrast, all the objects in the slots blocked with a green background from the director’s view were only visible to the participants. Therefore, they were in the participants’ PG. The inclusion of all three types of object status allowed us to directly compare the strength of their memory records.

An example of an image from the director task. The director task served as a communicative episode. The surprise memory task presented participants with objects of various ground status from the director task. In this example, the director may refer to the apple, the bucket, and the short plant, which means that these objects are in the visual plus linguistic common ground. The tall plant, the doll house, and the stapler were seen by both parties but not referred to, meaning that these objects are in the visual common ground. The drum, the bowling pin, and the duck were not available to the director; therefore, they were in the participants’ PG. Please note that the speech bubble and the condition labels are included here for illustration purposes; they were not presented to the participants.
Two experiments were conducted to examine the strength of the memory records for prior common ground established via physical co-presence and linguistic co-presence. According to Horton and Gerrig’s (2005, 2016) low-level domain-general memory mechanism, no specific memory encoding preference should be allocated to common ground or PG. Therefore, one should predict no difference between the conditions corresponding to the three ground statuses. In contrast, if listeners were paying more attention to information in the common ground during communication (e.g. Barr, 2008; Hanna et al., 2003; Nadig & Sedivy, 2002), then CGvis objects should leave a stronger memory record than PG objects. Furthermore, listeners may differentiate between common ground that was established through physical co-presence (CGvis) versus linguistic co-presence (CGV + L). In this case, memory would likely be stronger for common ground information established through linguistic co-presence. This is because linguistic co-presence is frequently accompanied by physical co-presence, providing opportunities for multimodal encoding to enhance memory (Paivio & Csapo, 1973; Thompson & Paivio, 1994). Finally, information in the PG could appear to be more salient to participants than common ground (e.g. Epley et al., 2004). In this case, PG objects should be better remembered than CGvis objects.
Experiment 1
The current experiment examined the relative memory record for objects in the visual plus linguistic common ground (CGV + L), the visual common ground (CGvis), and PG. A variation of the director task was employed to serve as a historic communicative episode. This was followed by a surprise memory task, presenting objects of the three types of aforementioned ground status.
Methods
Participants
Thirty-two students (24 female, mean age 21.63 years, age range 17 to 32 years) from the University of Birmingham participated in this study in return for study credits. All participants had normal colour vision and normal or corrected-to-normal visual and auditory acuity. A power analysis suggests that the minimum number of participants required to detect a medium-sized main effect from a one-way repeated-measures analysis of variance (ANOVA) (f = 0.25) with 0.8 power is 28, and we recruited 32 participants to ensure sufficient power was achieved after any participant elimination. The study received ethics approval from the Ethics Committee at the University of Birmingham (reference: ERN_09-719).
Design and procedure
Director task
In this variation of the director task, participants were given instructions that included a comprehensive example to illustrate the ways in which they should use the director’s perspective (as reported by Wang et al., 2020). The instructive example has been shown to minimise the number of egocentric errors participants commit. This was critical for ensuring that information presented in the common ground versus PG was reliably interpreted as being in the common ground versus PG, respectively. Otherwise, the high rates of egocentric errors typically observed in the director task (e.g. Apperly et al., 2010; Wang et al., 2019) could be interpreted as participants’ erroneous encoding of PG information as being in the common ground. The wording of the director task instruction and the experimental design employed in the current experiment closely followed that of Wang et al. (2020) with the following exceptions: First, the total number of objects on the shelf was fixed to 12. This was to ensure that there was a sufficient and identical number of objects in the CGvis, CGV + L, and PG. Second, the duration of a trial was fixed to 8,000 ms regardless of participants’ response time, to ensure that participants had identical encoding time. Third, the number of instructions the director delivered to accompany each shelf image was fixed to 4, rather than varying from 3 to 5. For half of the shelf images (eight images), one of the four instructions was a critical instruction, which required participants to use the director’s perspective to resolve reference (e.g. when instructed “move the short plant one slot right,” participants must account for the director’s perspective and select the object marked Y, even though their own perspective could lead them to consider object X). The non-critical instructions could be resolved without explicit consideration of the director’s perspective (e.g. “move the red apple one slot up” straightforwardly refers to an object in the common ground). For the other half of the shelf images (eight images), all four instructions were non-critical. Finally, to reduce the chances of overwhelming participants with a full director task, an abbreviated version was presented, encompassing two blocks of eight shelf images each.
Memory task
After completing the director task, participants were given a surprise memory task in which they were asked to recognise the objects that had been presented on the shelf. Participants were only presented with the objects associated with the non-critical instructions, as it is unclear whether the perspective-taking demand associated with the critical instructions would alter memory record (this will be addressed in Experiment 2). The memory task consisted of 48 true probes and 48 foil probes. The 48 true probes consisted of images of 16 CGV + L objects, 16 CGvis objects, and 16 PG objects. The 48 foil probes were everyday items from the same broad categories as the true probes. The 96 probes were presented in a random order.
Each trial began with a fixation cross, which was displayed centrally for 1,000 ms. This was followed by a probe image, to which participants were instructed to make a yes/no response. Participants were instructed to press the right arrow key labelled “yes” if they recognised the object presented as being from the director task, or press the left arrow key labelled “no” if they did not recognise the object as being from the director task. The probe image remained on the screen until a response was detected. Participants were instructed to respond as accurately as possible. Both the director task and the memory task were presented with E-prime 2 software (Psychology Software Tools, Pittsburgh, PA) on a desktop computer.
Results
Director task
The director task primarily served as an encoding phase for the subsequent memory task. We checked participants’ performance on the eight critical instructions to ensure that instructions were understood and followed. Of the 32 participants, 1 participant committed egocentric errors on seven trials out of a maximum of 8. This high rate of egocentric errors was likely to be due to a misunderstanding of the task or a lack of concentration; therefore, this participant’s data were excluded prior to analyses. The remaining 31 participants performed with high levels of accuracy, with four participants making 1 egocentric error each, and others making no egocentric errors. This very low rate of egocentric errors is consistent with the findings of Wang et al. (2020), who found that a much more prescriptive set of task instructions resulted in far fewer egocentric errors than have often been observed on this task.
Memory task
Response accuracy to the foil probes was 92.94% (SD = 6.08%). Participants’ response accuracy for the total set of true and foil probes was checked against chance level, to ensure that we did not include participants who performed at or below chance level overall. To be significantly above chance level, 57 correct responses out of 96 trials were required. All participants met this criterion. Importantly, this means that even performance near 50% on true probes (comprising our main data) could not reflect a strategy of guessing.
Proportion correct scores from true probes were calculated for each condition and participant (see Figure 2). The normality and sphericity of the data were checked to ensure that an appropriate analytical approach was taken. The absolute values for skewness ranged from 0.03 to 0.99; the absolute values for kurtosis ranged from 0.07 to 0.9, with Mauchly’s test of sphericity indicating no significant violation of the sphericity assumption for ANOVA (p = .354). Given that the assumptions for ANOVAs were met, a one-way repeated-measures ANOVA was conducted with object ground (CGV + L, CGvis, PG) as a within-subject factor. A main effect of object ground was observed, F(2, 60) = 74.37, MS = 1.16, p < .001, ηp2 = .713, BF01 < .001. BF01 reports the Bayes factor in favour of the null hypothesis (H0) over the experimental hypothesis (H1); that is, it expresses how many times more likely the data are under the null hypothesis than under the experimental hypothesis. When BF01 is larger than 3, it indicates that the null hypothesis is favoured. When BF01 is between 0.33 and 3, it indicates that neither the null hypothesis nor the experimental hypothesis is clearly favoured. When BF01 is smaller than 0.33, it indicates that the experimental hypothesis is favoured. BF01 reflects a probabilistic approach, providing an additional assessment of the robustness of any null effects and any significant effects. The BF01 reported here was calculated by comparing the Bayesian Information Criterion (BIC) from a model that contained the effect of interest versus a model that did not contain the effect of interest. While a Bayesian approach allows researchers to input previous knowledge about the probability of a certain effect into the model and influence BF, we took a data-led approach and employed a uniform distribution (i.e. the BF was solely determined by the BIC from the models).
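A BIC-based Bayes factor of this kind is commonly approximated as BF01 = exp((BIC_H1 − BIC_H0) / 2). The sketch below uses that approximation together with the interpretation thresholds given above; whether the reported values were computed with exactly this formula is an assumption, and the BIC values in the usage example are hypothetical, not taken from the study.

```python
from math import exp


def bf01_from_bic(bic_null: float, bic_effect: float) -> float:
    """Approximate BF01 from the BICs of the null model and the effect model.

    Assumption: the standard BIC approximation; a lower BIC indicates the better model.
    """
    return exp((bic_effect - bic_null) / 2.0)


def interpret_bf01(bf01: float) -> str:
    """Apply the thresholds described in the text (3 and 0.33)."""
    if bf01 > 3:
        return "null hypothesis favoured"
    if bf01 < 0.33:
        return "experimental hypothesis favoured"
    return "neither hypothesis clearly favoured"


if __name__ == "__main__":
    # Hypothetical BIC values, for illustration only
    bf = bf01_from_bic(bic_null=110.0, bic_effect=100.0)
    print(bf, interpret_bf01(bf))
```

Because the approximation depends only on the difference between the two BIC values, equal BICs give BF01 = 1, meaning the data do not discriminate between the models.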
Two planned paired t-tests were carried out on CGV + L versus CGvis and CGvis versus PG, respectively, to examine the respective effects of linguistic co-presence and physical co-presence. Planned comparisons revealed that CGV + L objects were remembered significantly better than CGvis objects, t(30) = 9.40, p < .001, Cohen’s d = 1.69, BF01 < .001. CGvis objects were remembered significantly better than PG objects, t(30) = 2.38, p = .012, Cohen’s d = 0.43, BF01 = 0.235. Both comparisons remained significant after applying Bonferroni corrections, which adjusted alpha to .0125.
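The reported effect sizes are consistent with the common conversion d = t / sqrt(n) for paired comparisons, where n is the number of participants. This is inferred from the reported values rather than stated in the text, so the sketch below should be read as an illustration under that assumption.

```python
from math import sqrt


def cohens_d_paired(t: float, n: int) -> float:
    """Cohen's d for a paired t-test, assuming d = t / sqrt(n).

    Assumption: this conversion is not stated in the paper; it is inferred
    from the reported t and d values.
    """
    return t / sqrt(n)


if __name__ == "__main__":
    n = 31  # participants retained in Experiment 1 (df = 30)
    print(round(cohens_d_paired(9.40, n), 2))  # CGV + L versus CGvis
    print(round(cohens_d_paired(2.38, n), 2))  # CGvis versus PG
```

With the reported t values and n = 31, the conversion gives 1.69 and 0.43, matching the effect sizes above.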

Proportion accuracy scores from Experiment 1. The lighter coloured bands represent 95% confidence intervals.
Discussion
This experiment set out to investigate the effects physical co-presence and linguistic co-presence each have on the memory record for shared history. We compared recognition accuracy for CGV + L objects, CGvis objects, and PG objects seen from a communicative episode. The current results showed that CGvis objects were better remembered than PG objects, revealing that objects of prior physical co-presence left a stronger memory record than those that were not physically co-present. In addition, CGV + L objects were better remembered than objects in the CGvis, indicating that prior linguistic co-presence enhances memory records above and beyond prior physical co-presence. These findings suggest that the common ground formed on the basis of physical co-presence alone was sufficient to promote memory records for CGvis objects, providing a pathway for the formation of shared history. The addition of linguistic co-presence boosted memory for CGV + L objects further than that for CGvis objects, ensuring that objects spoken about in a historic communicative episode could serve as common ground in future communications.
The superior memory for CGvis objects over PG objects suggests that their common ground status likely enhanced their salience and subsequent encoding. This is consistent with the findings that listeners show a preference to attend to common ground information at early stages of communication (e.g. Barr, 2008; Hanna et al., 2003; Nadig & Sedivy, 2002). Listeners’ preference to attend to and encode information in the common ground highlights their sensitivity to the distinction between common ground and PG, much like speakers in a conversation (Horton & Gerrig, 2002). However, the degree to which common ground is prioritised may depend on the broader context. The memory task from the current experiment only presented objects from non-critical trials, where perspective-taking was not required. It is possible that the process of perspective-taking could alter the coding of shared history, as resolving a conflict between participants’ own perspective and the director’s perspective may enhance the salience of either or both perspectives. In Experiment 2, we contrasted the context in which an object is encountered: either with a non-critical instruction, where there is no need to consider the director’s perspective, or with a critical instruction, where the director’s perspective had to be used to resolve the reference.
The current experiment also observed that linguistic co-presence leaves listeners with a strong memory record for objects previously spoken about. This is consistent with the findings that speakers keep track of the referring expressions used with specific speakers in historic communications (Brennan & Clark, 1996). This indicates that past episodes of verbal communication provide a clear basis for shared history for both speakers and listeners. However, the driver of these effects could still be a domain-general one (Horton & Gerrig, 2005, 2016). For example, the CGV + L objects could benefit from dual-coding from both visual and auditory memory (Paivio & Csapo, 1973; Thompson & Paivio, 1994). The CGV + L objects were also acted upon, and the participants’ actions may have increased the salience of the CGV + L objects, hence promoting their memory record. Experiment 2 sought to replicate our key novel findings and refine their interpretation.
Experiment 2
In addition to testing the replicability of the results of Experiment 1, the current experiment set out to address two sets of questions: First, are the effects of linguistic co-presence and physical co-presence observed in Experiment 1 driven by the presence of a speaker with a different perspective to the participant? To address this question, Experiment 2 contrasted memory for items in the test array after playing the game in the presence or absence of a director in the scene. The “director condition” was a replication of Experiment 1. In the “no-director” condition, participants heard identical instructions, but there was no director in the scene, and so there was no perspective difference for participants to take into account when following the instructions. Instead, a logically identical set of contingencies was created by instructing participants with the arbitrary rule to discount all objects in the slots with green backgrounds. This procedure closely resembles that of Experiment 3 of Apperly et al. (2010). If the presence of a speaker with a different perspective is necessary for the effects observed in Experiment 1, then the absence of a speaker with a different perspective in the No-Director condition of Experiment 2 should eliminate these effects.
Second, we investigated whether memory effects vary if items were referential competitors for an object that was mentioned in an instruction. We contrasted memory records for objects that were associated with critical versus non-critical instructions to address this question. In the Director condition (and as in Experiment 1), critical instructions present a conflict between the participants’ perspective and the director’s perspective. To correctly select the object of the director’s reference, participants need to resolve the conflict between their own perspective and the director’s perspective. For instance, in Figure 3, the critical instruction “move the small ball one slot up” best describes the small red ball from participants’ own perspective; however, the small red ball is in participants’ PG. Therefore, the director had to be referring to the green ball, which is the smaller of the two balls in the common ground. According to the anchoring and adjustment hypothesis (e.g. Epley et al., 2004), perspective-taking begins with an anchor on one’s own egocentric perspective, followed by a series of adjustments towards the director’s perspective. According to this hypothesis, not only will PG objects be attended to, they will also be attended to prior to common ground objects. Therefore, it is possible that in the context of a critical instruction and the presence of a director (which requires resolution of conflict between perspectives), memory for PG objects will be more robust than memory for PG objects seen on non-critical trials. Such effects should be reflected in an interaction between Criticality, Director, and Ground.

An example of the non-critical instruction and critical instruction from Experiment 2. The red textboxes provide a mapping of the conditions for the memory task, which followed the director task. Each shelf image was accompanied by three instructions (two non-critical and one critical), hence presenting three objects from each object ground status (three CGV + L, three CGvis, and three PG). Six objects from each grid image were presented in the memory task, three of which were non-critical, and the other three of which were critical.
Methods
Participants
Forty-two students from Lancaster University participated in this study in return for study credits or a small honorarium. All participants had normal colour vision and normal or corrected-to-normal visual and auditory acuity. Twenty-two participants were assigned to the director condition (11 female, mean age 19.54 years, age range 18 to 26 years); 20 participants were assigned to the no-director condition (9 female, mean age 19.50 years, age range 18 to 23 years). There was no significant difference between participants’ age in the two groups, t(40) = 0.09, p = .930, Cohen’s d = 0.03, BF01 = 3.29, nor was there a significant difference in the gender distribution, χ2(1, N = 42) = 0.31, p = .580, BF01 = 3.11. Our initial power analysis suggested that the minimum number of participants required to detect a medium-sized effect for a 2 × 2 within-between interaction (f = 0.25) with 0.8 power is 34, and we recruited 42 participants to ensure sufficient power was achieved after any participant elimination. However, this decision was made in ignorance of recent recommendations (e.g. Brysbaert, 2019), which highlighted the problematic assumptions underlying common approaches to power calculation and used simulations to estimate appropriate sample sizes, taking into account not only the need to detect an interaction but also the post hoc tests that are typically required to interpret the interaction. This approach would suggest a minimum sample of 90 participants to detect the same interaction. Given that our most important questions concerned a replication of the effects of Ground in Experiment 1, and whether similar results might be obtained in the No-Director condition, we will present findings from the full factorial analysis of the data but limit our conclusions to these aspects of the study, for which we had adequate power.
Design and procedure
Participants undertook one of the two variations of the director task before a surprise memory task, just as in Experiment 1. Both tasks were adapted to address the research questions at hand.
Director task
The director task was modified to match the design of previous versions of the task (e.g. Apperly et al., 2010; Dumontheil et al., 2010; Wang et al., 2019; Zhao et al., 2018) more closely than Experiment 1, while retaining the revised instructions from Wang et al. (2020). The current experiment differed from Experiment 1 in the following ways: First, the total number of objects on the shelf was fixed to 9 instead of 12, presenting a matched number of the three types of object ground status (CGV + L, CGvis, PG), while lowering the processing load per display. Relatedly, the number of audio instructions presented to accompany each shelf image was now three instead of four. Second, the duration of a trial was no longer fixed at 8,000 ms. Instead, the display was shown until response, and participants’ response brought forward the next display and/or audio instruction. This allowed participants to progress through the experiment at their own pace, in a similar way to real-life conversations, and minimised the chances of participants becoming frustrated with a long intertrial duration. Finally, the current experiment reverted to presenting a full-length director task, which comprised four blocks of eight shelf images each.
Memory task
A 2 × 2 × 3 mixed factorial design was employed for the memory task, with director (director, no director) as a between-participant factor, and criticality (critical, non-critical) and object ground (CGV + L, CGvis, PG) as within-participant factors. Four blocks of 48 trials were presented, which included 96 true probes and 96 foils in total. A fixation cross was presented for 500 ms, followed by a probe image that required a “yes/no” recognition response. The probe image remained on the screen until a response was detected. Participants made the recognition response with their left hand on the computer keyboard, using keys X and Z for “yes” and “no” responses, respectively. Participants responded to a 7-point confidence rating sliding scale with a computer mouse in their right hand. The confidence rating scale remained on the screen until a response was detected.
Results
Director task
The director task primarily served as an encoding phase for the subsequent memory task. We checked participants’ performance on the 16 critical instructions to ensure that instructions were understood and followed. Of the 42 participants, only 4 committed 1 egocentric error each. The remaining 38 participants made no egocentric errors. Since the task was self-paced, it was possible for participants to take longer on critical trials than on non-critical trials, leading to a longer encoding time. Response times (RTs) on critical and non-critical trials were therefore compared, revealing no significant difference between the two conditions (p = .658, BF01 = 5.46). This pattern of results was observed in both the director condition (p = .663, BF01 = 4.11) and the no-director condition (p = .844, BF01 = 4.34).
Memory task
Participants’ response accuracy was checked against chance level, to ensure that we did not include participants who performed at or below chance. To be significantly above chance level, 108 correct responses out of 192 trials were required. No participants fell short of this threshold; therefore, no data were excluded on this basis. Response accuracy to the foil probes was 93.30% (SD = 6.04%).
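As an illustration of how such a threshold can be derived, the following minimal Python sketch finds the smallest number of correct responses that is significantly above chance. It assumes a one-tailed exact binomial test with a chance level of .5 and α = .05; the test actually used is not stated here, and the function name `min_correct_above_chance` is hypothetical.

```python
from math import comb


def min_correct_above_chance(n_trials, chance=0.5, alpha=0.05):
    """Smallest k such that P(X >= k) < alpha for X ~ Binomial(n_trials, chance).

    Assumption: a one-tailed exact binomial test; this is an illustrative
    choice, not necessarily the procedure used in the study.
    """
    tail = 0.0
    # Accumulate the upper-tail probability from k = n_trials downwards.
    for k in range(n_trials, -1, -1):
        tail += comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        if tail >= alpha:
            # P(X >= k) is no longer below alpha, so k + 1 is the threshold.
            return k + 1
    return 0


# 192 recognition trials (96 true probes and 96 foils).
print(min_correct_above_chance(192))
```

Under these assumptions the sketch returns 108 for 192 trials, which agrees with the threshold reported above.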
Proportion correct scores from true probes and mean confidence ratings from correct and seen trials were calculated for each condition and participant (see Figure 4 and Table 3). The results from the director-non-critical conditions were first examined to ensure that findings from Experiment 1 were replicated following the modifications to the current design. The normality and sphericity of the director-non-critical conditions data were checked to ensure that an appropriate analytical approach was taken. The absolute values for skewness ranged from 0.19 to 1.29 and the absolute values for kurtosis ranged from 0.06 to 1.76, with Mauchly’s test of sphericity indicating no significant violation of the sphericity assumption for ANOVA (p = .873). Given that the assumptions for ANOVA were met, a one-way ANOVA was carried out with object ground (CGV + L, CGvis, PG) as a within-participant factor. A main effect of object ground was observed, F(2, 42) = 116.32, MS = 1.66, p < .001, η²p = .847, BF01 < .001. Two planned paired t-tests were carried out on CGV + L versus CGvis and CGvis versus PG, to examine the respective effects of linguistic co-presence and physical co-presence. Planned comparisons revealed that CGV + L objects were remembered significantly better than CGvis objects, t(21) = 9.21, p < .001, Cohen’s d = 1.96, BF01 < .001. CGvis objects were remembered significantly better than PG objects, t(21) = 4.13, p < .001, Cohen’s d = 0.88, BF01 = 0.007. This pattern of results replicated that of Experiment 1; therefore, we proceeded to analyse the full dataset from the current experiment.
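The reported effect sizes can be cross-checked against the test statistics. The sketch below assumes that Cohen’s d for the paired comparisons was computed as t divided by the square root of the number of participants (n = 22, inferred from the degrees of freedom), and that partial eta squared was derived from F and its degrees of freedom. These formulas are standard but are not stated in the text, and the function names are illustrative only.

```python
from math import sqrt


def cohens_d_paired(t_value, n_pairs):
    """Cohen's d for a paired t-test, assuming d = t / sqrt(n)."""
    return t_value / sqrt(n_pairs)


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the one-way ANOVA and planned comparisons reported above.
n = 22  # assumption: df = 21 for the paired t-tests implies 22 participants
print(round(partial_eta_squared(116.32, 2, 42), 3))  # 0.847
print(round(cohens_d_paired(9.21, n), 2))            # 1.96
print(round(cohens_d_paired(4.13, n), 2))            # 0.88
```

Under these assumptions the recomputed values match the reported η²p of .847 and Cohen’s d values of 1.96 and 0.88.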

Figure 4. Proportion accuracy scores from Experiment 2. The lighter-coloured bands represent 95% confidence intervals.
The normality, sphericity, and homogeneity of the full data set were checked to ensure that an appropriate analytical approach was taken. The skewness and kurtosis values are reported in Table 1, which revealed a kurtosis value higher than the normal range in the no-director critical CGV + L condition only. Mauchly’s test of sphericity indicated no significant violation of the sphericity assumption for ANOVA (all ps > .119). However, Levene’s test for equality of variances indicated a violation of the homogeneity assumption in the non-critical CGV + L condition (p = .027) and a near violation in the critical CGvis condition (p = .071); all other conditions met the assumption of homogeneity (all ps > .101). Data transformations (z-scores, square root, logarithmic, exponential) were attempted to reduce the inequality of variance and improve the distribution of the data in the non-critical CGV + L condition. However, the homogeneity assumption remained violated after all of these transformations (p = .034 for the exponentially transformed data). This is likely due to the near-ceiling effect observed in the non-critical CGV + L condition (see Figure 4). We proceeded with a mixed ANOVA with caution, specifically noting that violation of the homogeneity assumption could inflate the chances of Type I error (Caldwell et al., 2019) in the contrasts between the director and no-director non-critical CGV + L conditions. Therefore, any significant effect involving the director factor in the non-critical CGV + L condition should be treated with caution.
Table 1. Skewness and kurtosis values for each condition in Experiment 2.
A 2 × 2 × 3 mixed ANOVA was conducted with director (director, no director) as a between-participant factor, and criticality (critical, non-critical) and object ground (CGV + L, CGvis, PG) as within-participant factors. The analysis output for both proportion accuracy and confidence rating is summarised in Table 2. Results from the confidence ratings were highly consistent with the analysis of proportion accuracy, revealing little additional information about any implicit memory strategy. The descriptive statistics for confidence rating can be found in Table 3. Henceforth, we focus the statistical report and discussion on the results from proportion accuracy.
Table 2. Summary of ANOVA analyses on the proportion accuracy and confidence rating data.
Table 3. Mean confidence ratings on trials with correct recognition responses. Standard deviation within parentheses.
No significant main effect of director (director vs. no director) was observed. A significant main effect of criticality was observed. A significant main effect of object ground was also observed, and the current sample size allows for this effect to be interpreted with confidence. The interaction between director and criticality was not significant. The interaction between director and object ground was significant, as was the interaction between criticality and object ground (see Tables 4a and 4b). The three-way interaction of director × criticality × object ground was not significant.
Table 4. Mean proportion accuracy. Standard deviation within parentheses.
As described earlier, the interaction effect between Director and Ground was likely to be underpowered and must be interpreted with caution. Descriptively, the CGV + L, CGvis, and PG trials showed the same order of relative difficulty in the Director and No-Director conditions, with the most notable difference being relatively high performance on PG trials in the No-Director condition (see Table 4a). Our key question was whether the presence of a director with a different perspective was necessary for the effects of Ground observed in Experiment 1 (and in the Director condition of Experiment 2). We therefore conducted a one-way ANOVA in the No-Director condition, for which we had adequate power to detect effects. The main effect of Ground was significant, F(2, 38) = 137.66, p < .001, η²p = .879, BF01 < .001. We performed the same comparisons that had been planned for Experiment 1 and the Director condition of Experiment 2, which revealed the same pattern of differences: CGV + L objects were remembered significantly better than CGvis objects, t(19) = 13.81, p < .001, Cohen’s d = 3.09, BF01 < .001; and CGvis objects were remembered significantly better than PG objects, t(19) = 4.79, p < .001, Cohen’s d = 1.07, BF01 = .004. Both comparisons remained significant after applying Bonferroni corrections, which adjusted alpha to .0125.
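The Bonferroni adjustment mentioned above can be sketched as follows. The sketch assumes a family of four comparisons (the two planned comparisons in each of the Director and No-Director conditions), which is inferred from the adjusted alpha of .0125 and is not stated explicitly; the function names are hypothetical.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha after a Bonferroni correction."""
    return alpha / n_comparisons


def is_significant(p_value, alpha=0.05, n_comparisons=4):
    """True if p_value falls below the Bonferroni-adjusted alpha."""
    return p_value < bonferroni_alpha(alpha, n_comparisons)


# Assumption: four comparisons in total, giving .05 / 4 = .0125.
adjusted = bonferroni_alpha(0.05, 4)
print(adjusted)

# The reported p values were all below .001, so .001 serves as an upper bound.
print(is_significant(0.001))
```

Because every reported p value was below .001, each comparison falls under the adjusted threshold in this sketch.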
The interaction effect between Criticality and Ground was likely to be underpowered, and did not correspond with our predictions. Descriptively, performance on the CGvis and PG trials tended to be higher in the Critical condition than in the non-critical condition, whereas performance on CGV + L trials was more consistent and numerically lower in the Critical condition (see Table 4b).
Discussion
The current experiment replicated Experiment 1’s findings, and additionally addressed two questions: (1) whether “co-presence effects” are dependent upon the presence of a speaker with a differing perspective, and (2) whether memory effects vary if items were referential competitors for an object that was mentioned in an instruction.
In relation to the first question, the fact that there were significant differences in memory for objects in common ground versus PG in both the Director and No-Director conditions suggests that these differences were not uniquely dependent upon the presence of a speaker with a different perspective from the participant. Instead, similar effects could be generated by having participants select or reject referents for the instructions according to a rule that they should ignore items in the closed slots on the shelf. This suggests that the observed memory effects were not specific to the demands of perspective-taking during communication, and likely reflected generic memory effects resulting from prioritisation of attention to different items in accordance with the task demands of the two conditions.
In relation to whether memory effects varied if items were referential competitors for an object that was mentioned in an instruction, we did not observe the three-way interaction between Director, Criticality, and Ground, corresponding to the most direct prediction from this hypothesis. Given the very limited effects of the presence or absence of the director in any condition, the two-way interaction between Criticality and Ground does have the potential to inform this question. However, both the absence of the three-way interaction and the presence of the two-way interaction must be treated with caution because of the limited power of Experiment 2. Descriptively, both PG and CGvis objects encountered on critical trials were remembered better than their counterparts on non-critical trials. If replication in a larger sample showed this to be robust, the pattern would suggest that a listener’s memory for objects during a discourse is not only influenced by whether the objects were mentioned (linguistic co-presence), or by whether they were in common ground (physical co-presence), but also by the additional processes involved in resolving reference, such as overcoming one’s own privileged perspective (e.g. Epley et al., 2004).
General discussion
Two experiments set out to investigate the ways in which memory prioritises information it encodes and retains in shared history. We have done so by examining the memory records for information in the common ground and PG from a past communicative episode.
Experiment 1 revealed a superior memory record for CGV + L objects compared to CGvis objects. This indicates that linguistic co-presence had a positive effect on memory. In addition, CGvis objects were found to be better remembered than PG objects, consistent with an effect of physical co-presence on memory. Experiment 2 additionally demonstrated that the effects of linguistic co-presence and physical co-presence were not specifically driven by receiving instructions from a speaker whose perspective differs. In the “No-Director” condition, there was still a “communicative context” insofar as there was a speaker issuing instructions that the participant was required to follow, but there was no conflict between the speaker’s and participant’s perspectives. However, there remained potential ambiguity in the speaker’s instructions, and thus potential competition between referents for their instructions that had to be resolved according to the rule that participants should ignore items in the slots with green background on the shelf. Similar memory effects were observed irrespective of whether the speaker had a different perspective from the participants, suggesting that these effects arise from domain-general processes of attention and memory, and not domain-specific processes related to common ground. Nevertheless, it is important to recognise that attention to items varied in the Director condition because of the director’s differing perspective. That is to say, the findings are still informative about common ground effects on memory, even if the effects are not uniquely associated with common ground.
The current results point towards a domain-general mechanism, supporting Horton and Gerrig’s (2005, 2016) memory-based processing approach. Such a memory system likely relies on routine and non-specialised memory traces formed by semantic or episodic associations. The current findings additionally demonstrated that the memory records of shared history are shaped by the salience and multimodal richness of the information. This provides no evidence to support claims that communicators code historic communicative episodes in terms of specific instances of co-presence between communicative partners and their respective common ground (Clark & Marshall, 1981). This also suggests that a blanket preference for encoding either common ground or PG information is unlikely. The current finding is also consistent with the idea that social processes need not be underpinned by uniquely social mechanisms (e.g. Heyes, 2014) and that social and non-social reasoning may even recruit overlapping brain regions (Spunt & Adolphs, 2015). Of course, the current results do not demonstrate that domain-general mechanisms are sufficient to account for all social processes. It is possible that effects observed in the social domain reflect the role social information plays in directing attention to salient information, which is later retained in a domain-general memory store.
In conclusion, the current study revealed that listeners retained information about a past communicative episode in the absence of any explicit requirement to do so. This memory record for shared history serves as an important basis for common ground in future communication. The current study revealed the ways in which the shared history memory system prioritises the information it encodes and retains. Information in the common ground by physical co-presence and linguistic co-presence becomes more memorable than information in the PG. However, the preference to attend to common ground is likely driven by a domain-general memory mechanism shared across social and non-social contexts. The current study provided important insights into key features of the memory system for shared history that will critically inform our understanding of referential communication and its interaction with the extended cognitive faculties.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
