Abstract
Background and aims
Pragmatic language is a key difficulty in autism spectrum disorder. One such pragmatic skill is verbal reference, which allows the current entity of shared interest between speakers to be identified and thus enables fluid conversation. The aim of this review was to determine the extent to which studies have found that verbal reference is impaired in autism spectrum disorder. We organise the review in terms of the methodology used and the modality (production versus comprehension) in which proficiency with verbal reference was assessed. Evidence for the potential cognitive underpinnings of these skills is also reviewed.
Main contribution and methods
To our knowledge, this is the first systematic review of verbal reference in autism spectrum disorder. PsycINFO and Web of Science were systematically screened using the combination of search terms outlined in this paper. Twenty-four studies met our inclusion criteria. Twenty-two of these examined production, with methodologies ranging from elicited conversation through elicited narrative to the ‘director’ task and other referential communication paradigms. Three studies examined reference interpretation (one study investigated both production and appropriacy judgement). Four studies examined the relationship between appropriate usage of verbal reference and formal language (lexico-syntactic ability). Two studies investigated whether reference production related to Theory of Mind or Executive Functioning.
Conclusion and implications
Across a range of elicited production tasks, the predominant finding was that children and adults with autism spectrum disorder demonstrate a deficit in the production of appropriate verbal reference in comparison not only to typically developing groups, but also to groups with Developmental Language Disorder or Down syndrome. In contrast, the studies of reference interpretation which compared performance to typical control groups all found no between-group differences in this regard. To understand this cross-modality discrepancy, we need studies with the same sample of individuals, whereby the task requirements for comprehension and production are as closely matched as possible. The field also requires the development of experimental manipulations which allow us to pinpoint precisely if and how each comprehension and/or production task requires mentalising and/or various components of executive functioning. Only through such detailed and controlled experimental work would it be possible to determine the precise location of impairments in verbal reference in autism spectrum disorder. A better understanding of this would contribute to the development of interventions.
Autism spectrum disorder (ASD) is defined by persistent deficits in social-communication alongside restricted, repetitive patterns of behaviour, interests or activities (American Psychiatric Association, 2013; Diagnostic and Statistical Manual of Mental Disorders, 5th ed [DSM-5]). Social use of language, or ‘pragmatics’ (social verbal communication), is considered a central impairment in ASD (Landa, 2000). Difficulties in this area hinder the ability to establish and maintain reciprocity in conversation and impair the successful exchange of relevant information necessary for collaboration, negotiation and daily interaction (Tager-Flusberg, Paul, & Lord, 2005). A deficit in pragmatics has additionally been linked to mental health difficulties in a number of populations (e.g. Helland, Lundervold, Heimann, & Posserud, 2014). Whether all domains of pragmatics are equally impaired, or indeed impaired in all individuals with ASD, however, remains unclear (Simmons, Paul, & Volkmar, 2014).
A core component of pragmatics is reference, that is, the ability to denote an entity, person or event with sufficient clarity for one's interlocutor. While theorists from a semiotic perspective give equal weight to verbal and non-verbal means (see Perkins, 2005, for discussion), in the current review we follow Norbury (2014), among others, in distinguishing pragmatic language from social communication more broadly, and we therefore focus solely on ‘verbal reference’. Impaired verbal reference is highly likely to have a severe detrimental effect on conversational flow (and thus on social relationships) and on the ability to collaborate with others (e.g. Murphy, Faulkner, & Farley, 2014), which would also have educational consequences. While the appropriate production and interpretation of verbal reference is often included in speech and language assessment and intervention for individuals with ASD (e.g. Adams, Gaile, Freed, & Lockton, 2010), the extent to which this area of pragmatic ability is universally impaired in this population has not been examined to date.
Forms of verbal reference.
An adult-like mastery of verbal reference not only requires acquisition of form, but also the ability to vary the level of complexity in accordance with context. Referring terms are therefore matched to the informational needs of a specific interlocutor. The appropriate use of verbal reference is often described by appealing to Grice's (1975) theory of communication, in particular the co-operative principle specifying the maxims of quantity and manner. These specify that a speaker should provide sufficient information for the listener to determine reference but also be concise (i.e. speakers should not be over-informative). Although many researchers have since pointed out major problems for the Gricean account (e.g. Gergely & Csibra, 2005; Horn, 1984; Moore, 2014; Sperber & Wilson, 1995, see also Levinson, 1989), the framework still provides a useful means of conceptualising the types of skills required to carry out and understand acts of reference.
Regardless of theoretical perspective, the match (or mismatch) between a particular form of verbal reference (e.g. pronominal, bare noun phrase or complex referring expression) and a particular context can be judged as ‘correct’ or ‘incorrect’. Here, ‘context’ can include the information that specific interlocutors know one another to share; for example, if a child knows that his father is well acquainted with his friend Jamie, then it would be over-informative to use a complex referring expression such as ‘the Jamie that came to my birthday party’ every time the referent ‘Jamie’ is introduced into the conversation. ‘Context’ can also include whether there are competing referents in the visual context. To illustrate, if there is only one brush in the vicinity, then asking a listener to pass ‘the brush’ may be sufficient. In contrast, if more than one brush is present, then the speaker may need to specify ‘the brush with the brown handle’. Finally, the relevant ‘context’ would also include how recently a referent has been mentioned. That is, if a referent has just been mentioned in dialogue or narrative, the speaker can usually (depending on whether there are competing referents) reduce the specificity of the referring term further by using pronouns. In this case, speaker and listener can use their knowledge of the shared common ground to determine which referents are likely to be most salient (Sperber & Wilson, 1995) and/or activated in working memory, which is usually considered to be the component of short-term memory used to manipulate and update concurrently incoming material (see Baddeley & Hitch, 1974).
For the purpose of this review, we focus on whether referring terms are appropriately based on the informational needs of the listener, given ‘context’ as defined above. Thus, when we say ‘appropriate’, this is not a qualitative judgement; a participant who says, ‘Give me the duck’ in a context in which he or she can see that the addressee can see two ducks should arguably receive a score equating to ‘incorrect’ for this particular request. However, since even typical adult speakers do not perform at ceiling in these types of tasks (Keysar, 2007) we use the terms ‘appropriate’ versus ‘inappropriate’ throughout, to describe the match or mismatch between the form and context.
The first aim of the current paper was therefore to carry out a systematic review to determine whether an impairment in the appropriate usage/interpretation of verbal reference is a global feature of ASD (or whether verbal reference is only impaired in individuals with ASD with comorbid intellectual or formal language difficulties). To this end, our focus was not on whether individuals with ASD used the same forms of reference (e.g. whether they use the same proportion of pronouns within, for example, a conversation as do typically developing controls). Rather, our focus was on whether individuals with ASD are atypical in their understanding of the ‘fit’ between reference form and context.
If we found that some studies did not report an impairment in verbal reference in ASD, our second research goal was to investigate the extent to which this might be due either to the methodology used or to the modality in which proficiency with verbal reference was measured. Finally, we also wished to investigate whether studies including individuals with ASD provide evidence regarding the cognitive underpinnings of verbal reference ability.
To determine our key search terms, we first attempted to pinpoint the types of tasks typically used to assess verbal reference, that is, naturalistic interaction, narrative or the ‘director task’/referential communication paradigm (see Graf & Davies, 2014, for a review). We also attempted to identify the key concepts most commonly associated with verbal reference in the literature. One such concept is that of ‘listener needs’ or ‘audience design’; as previously described, to be optimally informative, a referring term should provide sufficient information without being over-informative. This type of adaptation to the informational needs of the listener is considered appropriate audience design (Clark & Murphy, 1982). Successful audience design may be achieved through consideration of the information listener and speaker share, or ‘common ground’ (Clark & Marshall, 1981).
Criteria for current review
Systematic searches were conducted in two databases, PsycINFO and Web of Science, for all dates up until March 2016. Our search terms were entered into the ‘keyword’ field as follows: (a) autis* AND narrative, (b) autis* AND referen* AND communicat*, (c) autis* AND common ground, (d) autis* AND audience design, (e) autis* AND listener needs and (f) autis* AND director task. Given that these two search engines are imperfect, it is inevitable that this review will not be exhaustive. Indeed, we found and included one study which met our search criteria (Kuijper, Hartman, & Hendriks, 2015), but which was detected by neither search engine. Nonetheless, this review should constitute an accurate representation of the literature on this topic to date.
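The wildcard search terms can be read as simple pattern matches. The following is a minimal sketch, assuming that `*` stands for any run of word characters and that `AND` requires every term to be present. The helper names `term_to_regex`, `matches_query` and `matches_any_query` are hypothetical, and the database interfaces themselves apply their own matching rules, which this sketch does not attempt to reproduce.

```python
import re

# The six keyword combinations listed in the text; '*' is a truncation wildcard.
SEARCH_QUERIES = [
    ["autis*", "narrative"],
    ["autis*", "referen*", "communicat*"],
    ["autis*", "common ground"],
    ["autis*", "audience design"],
    ["autis*", "listener needs"],
    ["autis*", "director task"],
]


def term_to_regex(term):
    """Turn a search term into a case-insensitive regex (assumed semantics)."""
    # Escape the literal text, then let '*' match any run of word characters.
    pattern = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(r"\b" + pattern + r"\b", re.IGNORECASE)


def matches_query(keywords, query):
    """True if every term in the query (joined by AND) occurs in the keywords."""
    return all(term_to_regex(term).search(keywords) for term in query)


def matches_any_query(keywords):
    """True if the keyword string satisfies at least one combination."""
    return any(matches_query(keywords, query) for query in SEARCH_QUERIES)
```

Under these assumptions, a keyword string such as "Autistic children, referential communication" satisfies the second combination, whereas "dyslexia; narrative" satisfies none because the `autis*` term is absent.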
An initial review of titles and abstracts excluded studies that were clearly not related to the key topics of interest, such as articles on literature or politics. The remaining full articles were then examined and our inclusion criteria were applied as follows. To be included, the study was required to (1) include participants with a diagnosis of either ASD, Asperger syndrome or pervasive developmental disorder-not otherwise specified, (2) include a measure of the appropriacy of the match between verbal (lexico-syntactic) reference and context, (3) contain quantitative data which were analysed statistically and (4) include a control group consisting of either (a) typically developing (TD) individuals, (b) individuals with Developmental Language Disorder (DLD) (Specific Language Impairment) or (c) individuals with an impairment in non-verbal (performance) IQ. Without one of these control groups, it is difficult to conclude whether or not individuals with ASD are impaired in referential communication. A study was additionally excluded if (a) it was a case study with a single participant (due to issues of generalisability), (b) it was a training study which did not contain sufficiently detailed baseline data for conclusions regarding impairment in verbal reference to be drawn or (c) it exclusively examined non-verbal communication such as gesture, facial expression and eye contact or solely considered prosody (rather than lexico-syntactic form). Studies exclusively examining non-verbal communication were excluded because we were primarily interested in verbal reference. One reason for this is that, when investigating cognitive underpinnings, it is likely that formal language, in particular, would play quite a different role in relation to non-verbal reference than in relation to verbal reference.
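The inclusion and exclusion criteria can be summarised as a screening predicate. The sketch below is illustrative only: the `Study` record, its field names and the group labels are hypothetical conveniences, not part of the review protocol, and in practice each criterion was judged by reading the full article.

```python
from dataclasses import dataclass

# Labels are assumptions chosen for this sketch.
ELIGIBLE_DIAGNOSES = {"ASD", "Asperger syndrome", "PDD-NOS"}
ELIGIBLE_CONTROLS = {"TD", "DLD", "low non-verbal IQ"}


@dataclass
class Study:
    """Hypothetical screening record; all field names are illustrative."""
    diagnoses: set
    measures_reference_appropriacy: bool
    has_statistical_analysis: bool
    control_groups: set
    n_participants: int = 2
    is_training_study: bool = False
    has_detailed_baseline: bool = True
    verbal_reference_measured: bool = True


def is_included(study):
    """Apply inclusion criteria (1)-(4), then exclusion criteria (a)-(c)."""
    meets_inclusion = (
        bool(study.diagnoses & ELIGIBLE_DIAGNOSES)          # (1) diagnosis
        and study.measures_reference_appropriacy            # (2) form-context match
        and study.has_statistical_analysis                  # (3) quantitative data
        and bool(study.control_groups & ELIGIBLE_CONTROLS)  # (4) control group
    )
    excluded = (
        study.n_participants == 1                           # (a) single case study
        or (study.is_training_study
            and not study.has_detailed_baseline)            # (b) no usable baseline
        or not study.verbal_reference_measured              # (c) non-verbal or prosody only
    )
    return meets_inclusion and not excluded
```

For example, `is_included(Study({"ASD"}, True, True, {"TD"}))` returns `True`, whereas the same record with an empty set of control groups returns `False`.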
The total number of studies considered for inclusion and those excluded at each stage in the search process are shown in Figure 1. The 24 studies that met our inclusion criteria are listed in Tables 2 to 7. First, studies comparing the production of verbal reference in ASD and TD groups are summarised. Next, studies that compare comprehension of verbal reference in ASD and TD groups are summarised. Finally, evidence for the potential cognitive underpinnings related to successful verbal reference is reviewed.
PRISMA diagram of study identification and selection.
Production – conversation/personal narrative.
ASD: autism spectrum disorder; MLU: mean length of utterance; ✓: well matched; PPVT: Peabody Picture Vocabulary Test; SLI: specific language impairment; TACL: Test for Auditory Comprehension of Language; TD: typically developing; X: not well matched.
Production of referring expressions
We first review studies that have used methodologies which most closely map onto naturalistic usage of verbal reference in daily life. We then review studies that have measured appropriate verbal reference during more structured narrative tasks, and finally the most structured elicitation technique, the referential communication task.
Production of referring terms during conversation
Although conversation most closely mirrors real-life interaction, our search returned only one study that measured the appropriacy of verbal reference use by individuals with ASD during conversation and that contained at least one control group with clearly defined characteristics (see Table 2). In this study, Baltaxe and D’Angiola (1996) examined the use of pronominal, demonstrative (e.g. here/there) and comparative (e.g. bigger/smaller) reference during an hour-long interactive play session. Children with ASD (M = 7;9 yrs, n = 10) were matched on language ability with a chronologically younger TD group (M = 3;5 yrs, n = 8). Use of ambiguous reference (e.g. saying ‘it’ when the referent is unclear) was never found in the TD group (p. 252). This study also compared the ASD group to a group with DLD (n = 8) matched on receptive language and Mean Length of Utterance. DLD is a diagnosis of language impairment in the absence of a known biomedical condition (Bishop, Snowling, Thompson, & Greenhalgh, 2016). In comparison to the DLD group, the ASD group used more ambiguous personal pronouns, though this difference failed to reach significance, presumably in part because of the extremely small sample size. Nonetheless, as we will see, the tendency towards ambiguity and the finding of deficits relative to children with DLD are recurrent themes throughout this review.
Production of referring terms within narrative
Overview
Narrative tasks usually require the participant to generate or retell a story based on a picture book or film. They therefore constitute a monologue rather than a reciprocal interaction, and so might seem quite far removed from naturalistic verbal interaction. Nonetheless, narrative measures have been found to correlate strongly with standardised measures of pragmatic language more broadly, such as the Test of Pragmatic Language (e.g. Manolitsi & Botting, 2011).
Since we are interested in the degree to which individuals understand the function of verbal reference, our focus is on measures which assess whether the lexico-syntactic form is appropriate given the context. In the sentence ‘Laura went to the shop and she bought some bread’, for example, the initial reference ‘Laura’ is appropriate as it introduces a new character. The third-person subject pronoun ‘she’ is also appropriately unambiguous, referring the listener back to a character ‘Laura’ recently established as the focus of the conversation. The use of ‘she’ in this way is an example of ‘anaphoric reference’. Errors may be in the direction of over-informativity (e.g. if the full noun ‘Laura’ were used throughout). When two potential referents have recently been mentioned, conversely, an anaphoric reference may be under-informative (e.g. ‘I saw Laura and Karen and
Our survey of studies that quantitatively measured referential accuracy within narrative is organised in terms of the elicitation method employed. First, we review narrative generation studies, in which narrative is elicited from a stimulus (generally pictures depicting a story) without a prior model. Then, we review narrative retell studies, in which events are witnessed either in picture or video format and then retold either with reference to the original stimuli or from memory.
Narrative generation
Production – Narrative generation (stimulus present).
ADI-R: Autism Diagnostic Interview-Revised; ADOS: Autism Diagnostic Observation Schedule; AS: Asperger syndrome; ASD: autism spectrum disorder; BPVS: British Picture Vocabulary Scale; CELF: Clinical Evaluation of Language Fundamentals; CELFRLC: Receptive Language Composite; CELFRS: recalling sentences; DSM: Diagnostic and Statistical Manual of Mental Disorders; HFA: high-functioning autism; HFA-F: female; HFA-M: male; ICC: intra-class correlation coefficient; ITPA: Illinois Test of Psychological Abilities; MLU: mean length of utterance; MR: mentally retarded; ✓: well matched; PLI: pragmatic language impairment; PPVT: Peabody Picture Vocabulary Test; RCM: Raven's coloured matrices; SCQ: Social Communication Questionnaire; SLI: specific language impairment; TD: typically developing; TD-F: female; TTFC: Token Test for Children; WAIS: Wechsler Adult Intelligence Scale; WISC: Wechsler Intelligence Scale for Children; X: not well matched.
Three of these studies also compared the ASD group to a group of individuals with DLD. Norbury and Bishop (2003) found that their ASD sample used more ambiguous nouns than did a DLD group matched for chronological age and language ability. The same pattern of results was found by Colozzo et al. (2015): the ASD group had better formal language ability than the DLD group but still used a higher number of ambiguous character references. Finally, in Norbury et al. (2014), the difference between the ASD group and the DLD group did not reach significance. However, there was a moderate effect size (d = .47), even though the ASD group had significantly better formal language skills than the DLD group.
Only one of these five studies reported no differences between an ASD and TD group in production of ambiguous character reference (Mäkinen et al., 2014). This study was carried out with Finnish children aged 5–10 years. Groups were well matched for chronological age but the TD group scored higher for formal language and memory. One reason for the lack of a between-groups difference in this study might be that the participants were Finnish speaking and the authors note that in Finnish, TD children tend not to master accurate reference until eight years of age.
In none of the above five narrative generation studies was the ASD group well matched to typical controls. Four additional studies examined narrative generation in comparison to well-matched controls. Three of these studies found impairments in individuals with ASD in comparison to typical controls. One of these studies focussed on adults and used the Mayer (1969) story ‘Frog, where are you?’ (Colle, Baron-Cohen, Wheelwright, & van der Lely, 2008), which was told to an experimenter who did not have visual access to the story pictures. Their sample with ASD used more ambiguous references to the dog and non-protagonist characters than did a TD group, despite being told that the listener had no previous knowledge of the story and that they should therefore be ‘as clear as possible’. The other three studies tested upper-primary school-aged children and adolescents. Two elicited narratives using a 29-page wordless picture book called ‘Tuesday’ by Wiesner (1991). These two studies reported that children and adolescents with ASD were more likely to use ambiguous reference in comparison to well-matched TD groups (Banney, Harper-Hill, & Arnott, 2015; Suh et al., 2014). The third study required participants to tell four stories to an experimenter, who did not have visual access to the pictures. Each story had two characters of the same gender, and the stories were specifically constructed to examine reference selection for character introduction, character maintenance and character reintroduction (Kuijper et al., 2015). In this study, there were no significant differences in appropriacy of reference selection between the ASD and TD groups, despite large sample sizes (and despite the fact that the ASD group scored significantly lower on the WISC ‘Vocabulary’ measure). However, the stories were much simpler than those used by the majority of narrative studies, both in terms of length (as each consisted solely of six pictures) and in terms of the amount of detail in each picture. This might have reduced both the working memory load of the task and the degree to which the individuals with ASD were likely to be distracted by irrelevant information. The potential issue of stimuli-dependent performance is one which will also emerge in the next section.
Narrative retell
Production – narrative retell.
ADI-R: Autism Diagnostic Interview-Revised; ADOS: Autism Diagnostic Observation Schedule; ASD: autism spectrum disorder; ASD-O: older; ASD-Y: younger; MLU: mean length of utterance; ✓: well matched; PPVT: Peabody Picture Vocabulary Test; SCQ: Social Communication Questionnaire; SRS: Social Responsiveness Scale; TD: typically developing; TD-O: older; TD-Y: younger; VIQ: verbal IQ; WAIS: Wechsler Adult Intelligence Scale; WISC: Wechsler Intelligence Scale for Children; X: not well matched.
In the study by Arnold, Bennetto and Diehl (2009), children and adolescents with and without ASD watched a Sylvester and Tweety cartoon and then retold this from memory to a confederate who feigned ignorance of the story. There was no narrator dialogue in the video clip. Instead, participants simply watched events unfold. Each character reference was coded for recency of mention of the antecedent. If a referent was mentioned no more than two clauses back, the children with ASD (9;8–12;9) used a significantly higher proportion of noun phrases (as opposed to pronouns) than did the typical controls, which the authors interpreted as over-informativity in this context. In contrast, the adolescents with ASD (13;1–17;8) did not differ from a well-matched TD group in any of the measures used. However, Arnold et al. (2009) did not assess the appropriacy per se of verbal reference selection; the latter depends not solely on how many clauses back the antecedent was, but also on whether a referential alternative (e.g. Tweety Bird/Sylvester) was also recently mentioned and, of course, on whether the pronouns are gender marked.
In addition, it is possible that the particular elicitation method/stimuli used partly accounts for discrepant findings between studies. To examine the extent to which elicitation method influences performance, Novogrodsky (2013) compared a narrative retell and a narrative generation task, analysing the ambiguity of third-person subject pronouns. The same data were reanalysed by Novogrodsky and Edelson (2016), who extended the analysis to include both subject and object pronouns. The retell task was the ‘Bus Story’ task (Renfrew, 1991), which requires the child to retell a story, which has first been told to the child, about a bus that escapes from its driver. In this task, participants can look at the pictures as they retell the story. In the generation task, children told ‘Frog, where are you?’ (Mayer, 1969) from pictures, without an initial model. Whilst ASD and TD group performance did not differ in terms of ambiguous pronominal reference during the retold Bus Story, in the generation task the ASD group used significantly more ambiguous pronouns.
Unfortunately, due to the design of this study, there are many potential reasons why results may have differed depending on the particular elicitation paradigm. First, since the children had just heard the administrator tell the Bus Story, those with good auditory recall (which is often a relative strength in ASD) might simply have been able to select appropriate forms of verbal reference by recalling this ad verbum. Second, the ‘Bus Story’ only consists of 12 pictures and thus it could be that the relative simplicity of the story allowed more accurate use of reference.
The fourth study of narrative retell was conducted by de Marchena and Eigsti (2016), who used 60-second cartoon clips. This study differs from the other narrative studies in that listener informational needs were specifically manipulated by having two within-subjects conditions: ‘shared’ (the listener watched a short preview of the clip with the participant) and ‘private’ (the listener was not present during any part of the clip). Some aspects of de Marchena and Eigsti's (2016) data indicate that the adolescents with ASD considered listener informational needs to a degree; there was a significant difference in communicative quality ratings (i.e. a rating of how easy the story was to follow) between narratives produced by the ASD group in the shared versus the private condition. However, for the key dependent variable, the degree of referential shortening, there was a between-groups difference. The authors argue that the referential shortening effect is a measure of whether participants take audience needs into account. The argument is that, if speakers take audience needs into account, their narratives should be shorter when retelling in the shared as opposed to the private condition. This effect was seen for the typical control group but not the ASD group, indicating that the latter had difficulty adapting to listener informational needs. However, de Marchena and Eigsti's analysis rests on the assumption that a longer narrative would contain a greater number of full noun phrases or indeed noun phrases with modifying phrases (see Table 1). This is of course not necessarily the case, since a proper noun (e.g. Laura) is usually highly informative and yet does not differ in word length from a pronoun. Conversely, not all modifying phrases provide sufficient differentiating information. The extent to which reference selection was appropriate for a given context was not examined.
Nonetheless, there was a significant relationship between the referential shortening effect and symptom severity as measured by the Social Responsiveness Scale (Constantino & Gruber, 2007) in the ASD group, whereby those more likely to demonstrate the effect showed fewer ASD traits. This supports the authors' conclusion that the referential shortening effect taps some of the social communicative deficits which are diagnostic for ASD. Older children with ASD were also more likely to show the referential shortening effect than those who were younger, tying in with Arnold et al.'s (2009) finding that selection of appropriate reference may improve with age in the ASD population.
Referential communication tasks
Across all narrative elicitation methods, the overwhelming tendency is towards an impairment in the appropriate usage (production) of verbal reference. However, it might be argued that the difficulties individuals with ASD experience with narrative tasks are not related to deficits in the production of appropriate verbal reference per se, but instead are related to the extraneous demands of these tasks. Individuals with ASD may be particularly hindered in narrative tasks by the need for episodic memory (e.g. Lind, Williams, Bowler, & Peel, 2014), executive functioning (e.g. Geurts, Verté, Oosterlaan, Roeyers, & Sergeant, 2004), imagination (e.g. Lind et al., 2014) and central coherence (e.g. Happé & Frith, 2006). Therefore, an elicitation method which does not burden episodic memory and which mirrors the back-and-forth nature of conversation might be better able to reveal a latent ability in individuals with ASD to use verbal reference appropriately. One such method is the referential communication task.
Referential communication tasks allow both the production and comprehension of referring terms to be measured. Here, we first review the results of studies where referential communication tasks have been used to examine the production of referring terms. Studies which examined the interpretation of reference are discussed later in this review. Our search returned five studies involving a type of referential communication paradigm to examine the appropriacy of referring terms selected in production.
Production – referential communication tasks.
ADOS: Autism Diagnostic Observation Schedule; AS: Asperger syndrome; ASD: autism spectrum disorder; ASD-O: older; ASD-Y: younger; CELF: Clinical Evaluation of Language Fundamentals; DSM: Diagnostic and Statistical Manual of Mental Disorders; FSIQ: Full Scale IQ; ✓: well matched; PIQ: performance IQ; SCQ: Social Communication Questionnaire; TD: typically developing; VIQ: verbal IQ; WAIS: Wechsler Adult Intelligence Scale; WISC: Wechsler Intelligence Scale for Children; X: not well matched.
Using a director task, Nadig, Vivanti and Ozonoff (2009: Exp 1) found that children with ASD aged 9–14 years used proportionally fewer appropriate referring terms to identify objects than a well-matched TD group (both groups: n = 17). In the privileged condition, participants with ASD tended towards over-informativity, inappropriately using a specific referring term (e.g. ‘big cup’ when there was only one cup available from the listener's visual perspective) significantly more frequently than the TD group (p < .01). In the shared condition, the ASD group more frequently failed to use a complex referring term when two competing referents were visible, though this group difference was only of marginal significance (p = .08, effect size r = .24). These findings reflect the simultaneous over- and under-informativity in reference use by individuals with ASD, which was also the general finding from narrative and conversational studies.
Fukumura (2016) used a similar director task in which she directly compared the ‘privileged’ and ‘shared’ perspective conditions. The dependent variable was the percentage of complex referring expressions (e.g. ‘the small door’) as opposed to unmodified nouns (e.g. ‘the door’). Thus, if individuals with ASD were taking listener informational needs into account, significantly fewer complex referring expressions should be used in the ‘privileged’ condition, since these would be over-informative from the addressee's perspective. For both 6- to 10-year-olds (Exp 1) and 11- to 16-year-olds (Exp 2), there was a group by condition interaction, indicating that the typical controls were significantly more likely to make this audience design distinction than were the individuals with ASD.
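The logic of judging a referring expression against the listener's view can be sketched as follows. This is a minimal illustration, assuming that an expression is either a bare noun or a modified (complex) noun phrase and that appropriacy depends only on how many objects of the named type the listener can see. The function name `score_expression` and its labels are hypothetical and do not reproduce the coding schemes of the studies reviewed here.

```python
def score_expression(is_modified, n_visible_to_listener):
    """Classify a referring expression from the listener's perspective.

    is_modified: True for a complex expression (e.g. 'the small door'),
        False for an unmodified noun (e.g. 'the door').
    n_visible_to_listener: number of objects of the named type that the
        listener can see, including the target.
    """
    if n_visible_to_listener < 1:
        raise ValueError("the target must be visible to the listener")
    # A modifier is needed only when the listener sees a competing referent.
    needs_modifier = n_visible_to_listener > 1
    if is_modified and not needs_modifier:
        return "over-informative"
    if needs_modifier and not is_modified:
        return "under-informative"
    return "appropriate"
```

In a ‘privileged’ trial, where the speaker sees two doors but the listener sees only one, `score_expression(True, 1)` classifies ‘the small door’ as over-informative; in a ‘shared’ trial with two doors visible to both, `score_expression(False, 2)` classifies ‘the door’ as under-informative.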
Nadig et al.'s (2009) and Fukumura's (2016) studies indicate that when the speaker and listener perspectives differ, individuals with ASD have difficulty selecting referring expressions appropriate to their listener's perspective. However, even when participants know that their listener can see the same visual array as themselves, Nadig et al. (2009) found diminished performance for ASD groups. This latter observation is reflected in the findings of two referential communication studies which did not manipulate listener perspective. Both used adaptations of the original referential communication paradigm developed by Glucksberg and Krauss (1967), in which the participant and a confederate play a version of the ‘Guess Who?’ game. Volden, Mulcahy and Holdgrafer (1997) asked adolescents and adults with ASD to provide information to identify a target from one of two circles which varied on one of four possible attributes (colour, shape, pattern and position of a small black dot). Whilst individuals with ASD never failed to provide the distinguishing feature in their description, they were more likely to include redundant information that did not uniquely identify the target referent.
Using a similar paradigm, Dahlgren and Dahlgren Sandberg (2008) asked children with ASD to provide descriptions to identify a given face from a selection of 16. They measured how many of the features mentioned were ‘relevant’ (appropriately discriminated between pictures), ‘irrelevant’ (common to all pictures, e.g. ‘has a mouth’) and ‘redundant’ (already a given). Children with ASD produced significantly fewer relevant features than did TD controls, and they also included a higher proportion of irrelevant relative to relevant features than did the TD group.
Whilst director tasks have been used to manipulate visual common ground knowledge, social common ground (namely the ability to determine the knowledge that one shares with a specific interlocutor; Moll & Kadipasaoglu, 2013) is arguably the skill used more often when selecting referring terms in everyday conversation. In a ‘referential pact’ paradigm, Nadig, Seth and Sasson (2015) examined whether adults with ASD engaged in lexical entrainment – the process by which interlocutors come to agree on mutual referring terms. Participants provided information to enable their listener to identify one of an array of abstract forms (tangrams). Individuals tended to alter referential descriptions in co-operation with the listener over successive trials (e.g. pairs may agree to call a shape ‘the elephant’ after initially describing it as ‘a four legged or two legged animal facing the right… The head is a parallelogram and its back leg is a rectangle and the front legs look like paws’). To investigate whether this alignment of referring terms was due merely to priming or whether social common ground was utilised, the game continued with either the original or a new listener. If common ground was considered, the agreed referring terms should be used with the original but not with a new listener. In the ‘new listener’ condition, the ASD group were marginally (p = .05, r = .37) less likely than the TD group to change the referring expression (referential pact) they had agreed with the original listener.
Studies comparing ASD with other neurodevelopmental disorders
Some studies returned in our search compared groups of individuals with ASD to groups of children with other neurodevelopmental disorders. For five studies the disorder concerned was DLD. Such comparisons between ASD and DLD can help elucidate the degree to which formal language (lexical and morpho-syntactic skills) might be a contributing factor in proficiency with verbal reference.
Production – studies comparing ASD with another clinical group (where no typically developing group was included in the study).
Comprehension – referential communication task.
The remaining four DLD comparison studies have already been mentioned above, since they also included a typical control group (Baltaxe & D'Angiola, 1996; Colozzo et al., 2015; Norbury & Bishop, 2003; Norbury et al., 2014). All four found significant differences (or an effect size indicative of a difference; Norbury et al., 2014), whereby the ASD group performed worse than the DLD group. This is particularly striking in the case of the two DLD comparison studies in which the group with ASD had higher formal language scores than the group with DLD (Colozzo et al., 2015; Norbury et al., 2014). That the ASD group nonetheless showed significantly greater difficulties in verbal reference provides somewhat stronger evidence that formal language is unlikely to be the main cause of these pragmatic language difficulties.
In sum, the results of all five studies which compared ASD to DLD suggest that, although referential accuracy poses a challenge for children in both groups, deficits in referential communication are more pronounced in individuals with ASD, even when the latter have superior lexico-syntactic abilities. Thus, referential communication deficits in ASD are unlikely to be solely attributable to difficulties with formal language.
In addition to cross-syndrome comparisons with DLD, two studies returned in our search compared the use of reference in ASD and Down syndrome (DS), which is a neurodevelopmental disorder associated with intellectual disability (see Table 7). The first study is that of Loveland, McEvoy, Tunali and Kelley (1990), who tested children with ASD and children with DS (n = 16 in each group) matched on verbal mental age. Children were asked to retell a story depicted via a video or puppet show to a naïve listener. In each group an equal proportion of children made ambiguous references to characters. Whilst Loveland et al. (1990) aimed for participants to have ‘similar’ non-verbal IQ and chronological age, the ASD group had marginally higher mean IQ scores and chronological age than the DS group.
In a less structured task, again comparing reference use in ASD and DS groups, Loveland, Tunali, McEvoy and Kelley (1989) asked participants to provide information to a naïve listener (E2) about how to play a board game. Participants were helped by E1 to provide adequate information using a gradient of prompts from more general, for example, ‘Tell me about these things here’, to more specific, for example, ‘Tell me where to start the game’. The ASD group produced significantly fewer ‘adequate’ descriptions than did the DS group at the most ‘general’ level of prompting, and they also required a higher level of specific prompts than did the DS group to provide an adequate amount of information. This was the case even though the two groups were matched on verbal age and the ASD group tended towards higher non-verbal IQ scores than the DS group. Given that the ASD group had overall higher IQ in both studies, yet exhibited difficulties equal to, or more pronounced than, those of a DS group, these studies suggest that difficulties in developing appropriate usage of verbal reference may not be due solely to latent non-verbal intellectual difficulties.
Production summary
Overall, our search returned 22 studies of verbal reference production in ASD. There were seven studies which compared a group with ASD to a group with another neurodevelopmental disorder (either DLD or DS) and all but one of these studies found indications of poorer performance by the ASD group.
Eleven studies compared a group with ASD to a group of typical controls, whereby groups were either not well matched for formal language and/or non-verbal IQ ability, or this was not reported (Baltaxe & D'Angiola, 1996; Colozzo et al., 2015; Dahlgren & Dahlgren Sandberg, 2008; Mäkinen et al., 2014; Nadig et al., 2015; Norbury & Bishop, 2003; Norbury et al., 2014; Novogrodsky, 2013; Novogrodsky & Edelson, 2016; Tager-Flusberg, 1995; Volden et al., 1997). Nonetheless, it is noteworthy that 10 out of these 11 studies reported that individuals with ASD performed significantly worse than typical controls on at least one reference measure.
Finally, eight studies did compare a group with ASD to well-matched controls and all except one (Kuijper et al., 2015) of these well-matched case–control studies found evidence of a deficit in comparison to the typical group in terms of appropriacy of verbal reference usage (Arnold et al., 2009; Banney et al., 2015; Colle et al., 2008; de Marchena & Eigsti, 2016; Fukumura, 2016; Nadig et al., 2009; Suh et al., 2014). These latter seven studies include a range of age groups. They also include a range of elicitation methods, namely narrative generation (Banney et al., 2015; Colle et al., 2008; Suh et al., 2014), narrative retell (Arnold et al., 2009), the ‘director’ task (Fukumura, 2016; Nadig et al., 2009) and interlocutor-specific perspective taking (de Marchena & Eigsti, 2016). Therefore, it is safe to conclude that there is very good evidence for a clear impairment in appropriate reference selection (production) in ASD.
Comprehension of referring expressions
In contrast to the ample number of studies examining the production of referring terms in ASD, we found only three studies that compared ASD to another group in terms of comprehension of the same phenomena (see Table 7). All three suggest that the pattern of ability differs considerably between the production and comprehension of referring terms. Although not examining interpretation of referring expressions per se, Volden et al. (1997) examined the ability of the adolescents and adults with ASD to judge whether the addressee in a referential communication paradigm had sufficient information to be able to correctly identify the referent. The authors argue that their ASD group performed at ceiling on this meta-pragmatic judgement task. In fact, the ASD group were correct on average 87% of the time, whereas the typical group were correct 100% of the time (with an SD of zero), with the result that statistical analyses were not carried out. Moreover, since the Glucksberg and Krauss (1967) paradigm was used (where the participant and the confederate are aware that they are viewing identical sets of cards), it could be argued that the ability to take another's perspective was not necessary for this task, since the participant merely has to judge whether a confederate's instruction is informative from his or her own perspective.
The final two studies did in fact investigate performance in reference interpretation where the participant's perspective differed from that of the speaker. Both used the director task. In contrast to the other ‘director’ studies already outlined, here the participants were in the role of the addressee. In key (ambiguous) condition trials, each participant is instructed to pick up an object (e.g. spoon) for which the participant (but crucially, not the ‘director’) can see a referential alternative (e.g. another spoon). One dependent variable is thus the number of egocentric errors made, i.e. the number of trials on which a participant selects the object which is occluded from the director's view and thus cannot be the intended referent. A second dependent variable is typically response latency. That is, the longer a participant takes to select the correct object, the greater the degree to which he or she is assumed to have (egocentrically) considered the referential alternative as a possible target.
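As a rough illustration of how these two dependent variables could be scored from trial-level records, consider the following Python sketch. The record format, field names and values are assumptions made up for this example and do not reproduce the coding scheme or data of any study reviewed here.

```python
from statistics import mean

# Hypothetical trial records -- invented for illustration only.
# "selected" is "target" (visible to both parties) or "occluded"
# (visible to the participant but hidden from the director).
trials = [
    {"condition": "ambiguous", "selected": "occluded", "latency_ms": 1500},
    {"condition": "ambiguous", "selected": "target", "latency_ms": 1900},
    {"condition": "ambiguous", "selected": "target", "latency_ms": 2100},
    {"condition": "ambiguous", "selected": "occluded", "latency_ms": 1400},
    {"condition": "neutral", "selected": "target", "latency_ms": 1200},
    {"condition": "neutral", "selected": "target", "latency_ms": 1300},
]


def egocentric_error_rate(trials):
    """Proportion of ambiguous trials on which the occluded object was chosen."""
    ambiguous = [t for t in trials if t["condition"] == "ambiguous"]
    errors = [t for t in ambiguous if t["selected"] == "occluded"]
    return len(errors) / len(ambiguous)


def mean_correct_latency(trials, condition):
    """Mean latency (ms) over correct selections in the given condition."""
    return mean(
        t["latency_ms"]
        for t in trials
        if t["condition"] == condition and t["selected"] == "target"
    )


error_rate = egocentric_error_rate(trials)
# Extra time taken on correct ambiguous trials relative to neutral trials.
latency_cost = mean_correct_latency(trials, "ambiguous") - mean_correct_latency(
    trials, "neutral"
)
print(error_rate, latency_cost)
```

Restricting the latency measure to correct selections is one possible design choice; it keeps the latency cost from being contaminated by fast egocentric errors.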
In the first such study, Begeer, Malle, Nieuwland and Keysar (2010) examined the ability of adolescents with ASD to interpret referring expressions (e.g. ‘the cup’ versus ‘the big cup’) when responding to instructions in a shared and a privileged condition. Across both groups, participants made egocentric errors on 39% of trials in the key (ambiguous) condition and their response latencies were also significantly longer in the ambiguous than in the neutral condition, indicating that they considered the referential alternative prior to making correct selections. However, crucially, there were no between-group differences for either of these dependent variables. This indicates that the ASD group were as able as the TD group to use visual perspective taking to interpret verbal reference, at least when the visual perspective is as simple as determining whether the interlocutor can see a particular object. Santiesteban, Shah, White, Bird and Heyes (2015) carried out a computerised version of the director task with adults. Similarly to Begeer et al. (2010), they found no between-group differences. Moreover, they found that adults with and without ASD were equally successful in completing the task when a human addressee (avatar) was replaced with a camera.
Thus, the three reference interpretation studies with typical control groups align in suggesting that the ability to take another's perspective to accurately interpret reference is relatively spared in ASD.
Potential cognitive underpinnings
The picture emerging from studies on the interpretation of verbal reference is that this is not an area of impairment in individuals with ASD (Begeer et al., 2010; Santiesteban et al., 2015). This stands in stark contrast to the overwhelming finding that individuals with ASD are impaired relative to both typical peers and peers with other neurodevelopmental disorders when reference production is examined. One possible reason for the apparent discrepancy between an impairment in the selection of an appropriate referring expression (production) and an intact ability to take another's perspective to interpret a referring term might be the differing cognitive underpinnings of each skill. We now therefore survey studies which explicitly examined relations between referential communication in ASD and the potential cognitive underpinnings of this skill.
We begin with studies which have examined relationships between proficiency with verbal reference, on the one hand, and formal language (lexical or syntactic) proficiency and/or non-verbal IQ, on the other hand, in samples of individuals with ASD.
Non-verbal IQ
Three studies (all discussed above) examined the relationship between non-verbal IQ and the appropriacy of reference selection. Nadig et al. (2009) found that performance in the shared perspective condition correlated with non-verbal IQ; that is, those with higher non-verbal IQ used more adjectives when they (and their interlocutor) could see two referential alternatives (e.g. two ducks) than when only one potential referent was present. However, in the privileged perspective condition there was no relationship with non-verbal IQ, which makes the first finding difficult to interpret. In line with this latter finding, neither Dahlgren and Dahlgren Sandberg (2008) nor Fukumura (2016) found any evidence for a relationship between non-verbal IQ and any measures of reference production in their ASD groups. Thus, on the whole, these findings – when considered together with the studies outlined above comparing children with ASD to children with DS – indicate that non-verbal IQ is unlikely to play a primary causal role in difficulties with verbal reference in ASD (although analyses in future studies should certainly control for non-verbal IQ).
Formal language
Three studies (all discussed above) examined the relationship between formal language and the appropriacy of reference selection. All three studies used a referential communication paradigm. Dahlgren and Dahlgren Sandberg (2008) found that verbal IQ correlated in the ASD group (but not in the TD group) with the number of relevant features mentioned and their measure of referential efficiency. However, since they did not manipulate the distinction between the participant's and the interlocutor's perspectives, it is unclear whether this indicates that formal language is important for the appropriacy of reference selection or whether it merely suggests that a more advanced mastery of formal language leads to a greater complexity of referring expressions.
The latter interpretation is supported by Fukumura (2016), who found no relationship between formal language and performance in the privileged ground condition in her ASD groups. Rather, the only relationships with formal language (British Picture Vocabulary Scale and WASI vocabulary) were with the number of adjectives produced by the ASD group in the shared ground condition. That is, those children with ASD who had larger vocabularies tended to produce more adjectives in the shared ground condition. Since the shared ground condition does not differentiate the participant's own perspective from that of the interlocutor, this finding merely indicates that those individuals with ASD who have larger vocabularies tend to find it easier to produce complex referring expressions. In contrast, Nadig et al. (2009) found that formal language ability (Clinical Evaluation of Language Fundamentals (CELF)) correlated with appropriately informative verbal reference by participants with ASD in the ‘privileged view’ condition of their director task, i.e. the condition which required participants to take the addressee's perspective, since it differed from their own. That is, those with higher scores on the CELF were more able to curtail the usage of complex referring expressions when this would be over-informative. In sum, it seems likely that formal language contributes to difficulties with the production of appropriate referring expressions in children with ASD. However, given that comparisons with DLD indicated that difficulties in verbal reference production are more marked in ASD, despite better formal language skills in the ASD group, it appears likely that other factors also contribute to the observed impairment.
Theory of Mind (ToM) and executive functioning
Traditionally, difficulties with appropriate verbal reference selection have been linked to difficulties with ToM, which is the ability to represent others' mental states including their beliefs, emotions and desires (e.g. Baron-Cohen, Leslie, & Frith, 1985). However, it is equally plausible that a failure to provide an appropriate level of information (i.e. under- and over-informativity) could be due to a failure to differentiate between old and new information during a verbal interaction (e.g. Baltaxe, 1977). Such difficulties may be caused in part by an impairment, for example, in working memory. Working memory is usually considered one component of executive functioning, which comprises a set of highly correlated, but separable, aspects of memory, inhibitory control and cognitive flexibility needed for considering consequences to actions (e.g. Miyake et al., 2000; see also Pennington & Ozonoff, 1996 for an overview of EF domains and measurement methods).
Given that several reviews and meta-analyses report clear evidence for impairments in all domains of executive functioning bar inhibitory control in ASD (e.g. Hill, 2004; Lai et al., 2016; Russo et al., 2007) and there is evidence of a link between EF and verbal reference in TD populations (e.g. Brown-Schmidt, 2009), it is somewhat surprising that only two studies examined relationships between EF and usage or interpretation of verbal reference in ASD. These are also the only two studies to examine the relationship between the appropriacy of reference selection and ToM in ASD. The first study, carried out by Dahlgren and Dahlgren Sandberg (2008), included only two tasks which might plausibly be considered a measure of executive functioning, and both of these measured short-term memory. The first was ‘verbal free recall’ (11 lists of words with 10 words in each) and the second was ‘object free recall’ (in which the child is shown 10 sets of 10 objects and is required to verbally recall them). In both memory tasks, once all items had been presented, the child was asked to repeat as many words (or objects, respectively) as he or she could remember, in any order. Relationships were found between both memory measures and certain aspects of verbal reference, namely the number of relevant features mentioned and the ‘efficiency’ of reference usage, that is, the extent to which descriptions were optimally informative (for the comparison group this was only significant for verbal free recall). The authors interpret this as indicating that working memory impacted on the number of referential alternatives which a child could hold in mind and possibly also on the ability to verbally encode the relevant distinguishing information.
Dahlgren and Dahlgren Sandberg (2008) also directly examined the relationship between ToM and the usage/interpretation of verbal reference. To measure ToM they used a first-order change-of-location task (Baron-Cohen et al., 1985) as well as Baron-Cohen's (1989) second-order false belief ‘ice-cream’ task. For the second-order ToM measure no significant relationships were found with any verbal reference measures for either group. For the children with ASD there was a relationship between first-order ToM and the same aspects of verbal reference (number of relevant features mentioned and ‘efficiency’) that correlated with free recall. They note, however, that the correlation with first-order ToM is based on only five children in the ASD group who failed the first-order ToM task (whereas 25 children with ASD passed). More problematically, Spearman's rho was used for all correlational measures, whereas a point-biserial correlation would be appropriate for the first-order ToM task, which was essentially a pass/fail measure.
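To make the statistical point concrete, the following Python sketch computes a point-biserial correlation between a pass/fail variable and a continuous score. The data are invented for illustration and are not taken from Dahlgren and Dahlgren Sandberg (2008); the function and variable names are hypothetical. The sketch also shows that the point-biserial coefficient is numerically identical to Pearson's r when the dichotomous variable is coded 0/1.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(binary, scores):
    """Point-biserial r between a 0/1 variable and a continuous score."""
    ones = [s for b, s in zip(binary, scores) if b == 1]
    zeros = [s for b, s in zip(binary, scores) if b == 0]
    n1, n0, n = len(ones), len(zeros), len(scores)
    # (difference in group means / population SD) * sqrt(p * q)
    return (mean(ones) - mean(zeros)) / pstdev(scores) * sqrt(n1 * n0 / n**2)


def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(
        sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
    )


# Hypothetical data -- NOT from any study.
# 1 = passed the first-order ToM task, 0 = failed it.
passed = [0, 0, 1, 1, 1, 1, 1, 1]
relevant_features = [2, 3, 4, 5, 3, 6, 5, 4]

r_pb = point_biserial(passed, relevant_features)
r = pearson(passed, relevant_features)
print(round(r_pb, 4), round(r, 4))
```

With a very uneven split between those who pass and those who fail, the term sqrt(p * q) becomes small and the coefficient rests on very few cases in the smaller group, which is a further reason for caution in interpreting such a correlation.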
The other study which examined relationships in ASD between appropriacy of reference selection, on the one hand, and either EF or ToM, on the other hand, is Kuijper et al. (2015). They used the Stop Signal Reaction Time Task (Van den Wildenberg & Christoffels, 2010) to measure inhibitory control and the n-back task to measure working memory. First- and second-order ToM was assessed in a scale consisting of eight stories (Hollebrandse, Van Hout, & Hendriks, 2014). For the ‘reintroduction of character in a narrative’ condition (where a noun and not a pronoun would be appropriate), the authors found in a multivariate model relationships between reference usage and both second-order ToM and working memory. Unfortunately, language measures were not entered into the analysis and the results were conflated over three groups, which included a group with Attention Deficit Hyperactivity Disorder, making this finding difficult to interpret.
Based on the studies included in this review, it appears there is insufficient evidence to determine whether the development of verbal reference usage in ASD is underpinned by ToM and/or EF. The degree to which ToM and EF underpin the development of verbal reference in ASD is complicated by the fact that these two areas tend to be inter-correlated (e.g. Pellicano, 2013) and are also correlated with formal language (e.g. Milligan, Astington, & Dack, 2007, for a review). However, considering the evidence for relationships between both EF and ToM and other areas of pragmatics (see, e.g. Matthews, Biney, & Abbot-Smith, in press), further exploration of the cognitive underpinnings related to comprehension and production of referring terms is clearly a priority.
Summary and discussion
The current systematic review found 19 studies which met our criteria and in which verbal reference production by a group of individuals with ASD was compared to that of a typically developing control group. Seventeen of these 19 studies found that the group with ASD were impaired on at least one measure in terms of the appropriacy of match between context and the form of verbal reference. While many of these studies had various methodological issues, this pattern of results also held for seven of the eight studies in which the typical group were matched to the ASD group in terms of chronological age, non-verbal IQ and formal language (Arnold et al., 2009; Banney et al., 2015; Colle et al., 2008; de Marchena & Eigsti, 2016; Fukumura, 2016; Nadig et al., 2009; Suh et al., 2014). This stands in stark contrast to the findings from the three studies of verbal reference comprehension, in which individuals with ASD were observed to show typical understanding/interpretation of verbal reference. This was even true for the two studies in which the perspective of participants differed from that of the speaker and, thus, required a shift in mental perspective (Begeer et al., 2010; Santiesteban et al., 2015).
However, this apparent discrepancy between production and comprehension measures may be an artefact of certain characteristics of the existing studies rather than an actuality. The first characteristic of the data that prevents us from drawing firm conclusions is that the participants of these three comprehension studies were all adults or older adolescents; there are some indications that proficiency with the production of verbal reference may ameliorate to some degree during adolescence (e.g. Arnold et al., 2009; de Marchena & Eigsti, 2016). However, improvement over development seems unlikely to be the full story for the difference between production and comprehension studies since two production studies with adults with ASD did find evidence of impairment in comparison to well-matched controls (Colle et al., 2008; Nadig et al., 2015).
Another possibility is that the apparent discrepancy between comprehension and production is due to task-related differences across studies. For example, the majority of production studies used narrative elicitation (for which there is no obvious comprehension-task counterpart), whereas all three comprehension studies used a referential communication paradigm (Begeer et al., 2010; Santiesteban et al., 2015; Volden et al., 1997). Indeed, one commonality amongst the comprehension tasks used in all three studies (Begeer et al., 2010; Santiesteban et al., 2015; Volden et al., 1997) is that the dependent variable is binary forced choice, which is certainly far from the case for most dependent variables in production studies. That said, five production studies (two of which were methodologically well controlled) also used a referential communication paradigm, where the dependent variable could possibly be considered binary forced choice, and all found impairments in the ASD group relative to the typical control group (see Table 5). Therefore, it seems unlikely that the dichotomy found between comprehension and production studies can be attributed to the fact that the comprehension paradigms are binary forced choice.
To unpick the cognitive underpinnings of this discrepancy, this field needs much more fine-grained task analysis of the processes involved in the appropriate selection of referring expressions and of the processes involved in using the interlocutor's perspective to interpret referring expressions. It is tempting to suggest that production of verbal reference is inherently more burdensome to executive functioning than is interpretation of verbal reference. In production of verbal reference, the speaker requires, for example, working memory to hold information relevant to the listener whilst a sentence is formulated and executed. If the specific syntactic form of the target referring expression (e.g. simple noun phrase versus complex noun phrase) differs between trials, this may also place additional demands on mental set shifting, that is the ability to switch back and forth between multiple trials (see, e.g. Sikora, Roelofs, Hermans, & Knoors, 2016). That said, even comprehension variants of the visual perspective referential communication task have been found to tap various aspects of EF in the typical population (e.g. Brown-Schmidt, 2009; Cane, Ferguson, & Apperly, 2016; Lin et al., 2010; Nilsen & Graham, 2009). Thus, we clearly need a more precise mechanistic model of the fine-grained steps required for comprehension and production and how this might differ depending on the specific tasks used for each.
Whatever the explanation for the discrepancy between performance on laboratory-based measures of verbal reference interpretation and laboratory-based measures of verbal reference production, there is a further overarching issue that needs to be considered when drawing conclusions about these abilities in ASD. Even if individuals with ASD are unimpaired in interpretation using the referential communication tasks, this does not mean that they are necessarily unimpaired in interpretation of verbal reference in everyday life. This is because all referential communication tasks to date in this field have essentially manipulated only level one visual perspective, which is an individual's understanding that the content of what they see may differ from the content of what another sees in the same physical position (e.g. Salatas & Flavell, 1976). This requires the ability to follow another person's line of sight and draw conclusions about whether a person's perception of an object is occluded, yet one need not have a very deep understanding of mental states to determine this (e.g. Moll & Kadipasaoglu, 2013; see also Santiesteban et al., 2015, for a sub-mentalising account).
In everyday life, in contrast, the interpretation of reference is often dependent on ‘social’ perspective taking/common ground, that is, an understanding of what a specific interlocutor knows or is likely to find interesting or salient. This often depends on a consideration of which particular information or experiences we have shared with which specific interlocutors. The only study meeting our criteria which investigated this is Nadig et al. (2015), who found that adults with ASD were less likely than typical adults to take discussion shared with a particular interlocutor (via a referential pact) into account when selecting a referring term. Of course, there are numerous divergent ways in which social common ground can be established with a specific interlocutor. One way is through sharing a particular collaborative experience (e.g. painting an action figure) with a certain interlocutor. To date this has only been explored to a degree in a couple of very small-scale production study pilots without control groups (Geller, 1988; Rosenthal Rollins, 2014). No studies have investigated whether individuals with autism can use social common ground to interpret verbal reference.
Conclusions
To move this field forward, we need studies which manipulate the role of social perspective taking and compare this using comprehension and production variants of the task in the same sample of individuals with ASD. We also need the field to shift away from an over-reliance on narrative paradigms. In addition to some issues with narrative paradigms outlined above, narrative is problematic here because verbal reference can be used appropriately in narrative without a real consideration of the listener's perspective, by simply tracking whether the form used for introduction or maintenance of reference is appropriate from one's own perspective (see, e.g. Arnold, 2008, for a discussion of ‘narrator-oriented’ use of verbal reference). Finally, we need a more detailed account of how deficits revealed in experimentally elicited production of verbal reference link to pragmatic language impairments in naturalistic dialogue. To that end, it is striking that to date there exists only one case-controlled study of reference production in conversation (Baltaxe & D'Angiola, 1996) and this study had highly problematic methodological issues. We need to empirically document in more detail the degree to which an impairment in reference usage hinders real-life verbal interaction, and to demonstrate more precisely the potential links that such an impairment has with difficulties in peer interaction and/or mental health difficulties.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by a Research Scholarship to the first author, who completed this study as part of her PhD thesis.
