Abstract
Background and aims
Language development in autism varies widely, from fluently verbal to minimally verbal individuals, with socio-communicative difficulties often cited as key explanatory factors. Statistical learning (SL)—the ability to detect regularities in language—has also emerged as a potential contributor to language acquisition in autism. However, SL research in autism has predominantly focused on verbally fluent individuals, leaving non- and minimally verbal populations underexplored. This study aimed to examine the predictive roles of joint attention and statistical learning, specifically nonadjacent dependency learning, on expressive vocabulary and morphosyntactic outcomes in autistic children.
Methods
Participants included 40 autistic children aged 5–8 years with diverse linguistic profiles, ranging from verbally fluent to minimally verbal, and 40 non-autistic children. Joint attention was assessed during a semi-structured play protocol, which also provided naturalistic language samples for analysis. Measures of expressive vocabulary and morphosyntax were derived from the number of different words and verb flexions produced, respectively. Sensitivity to nonadjacent dependencies was evaluated through an artificial language learning task.
Results
Neither joint attention nor sensitivity to nonadjacent dependencies predicted expressive vocabulary or morphosyntactic skills in autistic children. Response to joint attention scores were significantly lower in autistic children than in non-autistic children but higher than in previous research. This may be due to the less structured and, therefore, more ecologically valid context in which joint attention was assessed (free play), in conjunction with age and maturation factors. Regarding the SL task, both autistic and non-autistic children demonstrated sensitivity to nonadjacent dependencies. Most interestingly perhaps, only 15 autistic children completed the SL task, with non-verbal cognitive abilities significantly predicting task completion.
Conclusions and implications
This study highlights the complexity of investigating the role of statistical learning in language development in autism. It underscores the limitations of behavioral SL paradigms for minimally verbal children. Future research should prioritize developing more ecologically valid and accessible paradigms to accurately assess statistical learning in minimally verbal children, thereby clarifying the role SL may play in language acquisition in autism.
Introduction
Autism spectrum disorder diagnosis rests on two sets of characteristics: significant atypicalities in verbal and non-verbal communication and restrictive and repetitive behavior (American Psychiatric Association, 2013). Since the 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), early language delays are no longer a criterion for autism; instead, language impairment is now considered a specifier. This change highlights that linguistic abilities are independent of the core characteristics of autism so language disabilities are conceived of as a co-occurring condition. That said, delays in the onset of speech are very frequent in autism and remain one of the first concerns of caregivers (Wetherby et al., 2004). Moreover, language development trajectories of autistic children are highly heterogeneous. While some individuals exhibit no structural language deficits, approximately 30% of individuals remain minimally verbal well into their school years and even adulthood (Tager-Flusberg & Kasari, 2013; Thurm et al., 2015; Wodka et al., 2013).
Different thresholds are used to define individuals who are minimally verbal but, by and large, these profiles are characterized by a limited repertoire of spoken words and fixed phrases (see Koegel et al., 2020 for a comprehensive review; Tager-Flusberg & Kasari, 2013). Most definitions center on the number of spoken words as the main criterion. For instance, some studies categorize an individual as minimally verbal if they use fewer than 20 distinct intelligible words (Chenausky et al., 2016). Others rely on the module assigned during the Autism Diagnostic Observation Schedule (ADOS): Module 1 is designed for those with little to no speech or only a few simple phrases while Module 2 applies to individuals who use phrases but whose speech remains non-fluent. Despite variations in classification criteria, most studies presuppose that minimally and low verbal (autistic) individuals face significant challenges in becoming syntactically productive in their first language. Although minimally verbal children may combine words to form simple phrases, they often struggle to move beyond this milestone to more advanced linguistic stages, which involve the productive use of morphological inflections, along with combinatorial morphosyntactic principles. Consequently, the linguistic productions of these autistic individuals remain limited to basic word concatenation and rarely feature more complex, multi-clause sentences or intricate grammatical structures.
Various characteristics of the child and the environment have been examined as potential predictors of language outcomes in autism, including morpho-syntactic skills. Most such retrospective and prospective studies focus on socio-communicative abilities as the chief predictor of language (Kissine et al., 2023). Some studies have found that difficulties in processing and establishing joint attention raise the risk of remaining minimally verbal (Anderson et al., 2007; Luyster et al., 2008; Paul et al., 2008; Yoder et al., 2015). However, the robustness of joint attention as an explanatory factor is limited once nonverbal intelligence, which emerges as one of the most robust predictors of language and social functioning, is controlled for (Anderson et al., 2007; Ellis Weismer & Kover, 2015; Pickles et al., 2014; Thurm et al., 2007, 2015; Wodka et al., 2013). In a recent review, Kissine et al. (2023) propose that joint attention has a pivotal role in the onset of speech in autism, in the sense that better joint attention skills increase the probability that the child reaches first language milestones, such as early vocabulary. At the same time, this review also shows that some autistic children manage to acquire language despite low socio-communicative abilities, which suggests that other mechanisms may be at play. This kind of evidence is a strong incentive to go beyond joint attention skills to reach a broader and more accurate grasp of what underpins language development and heterogeneity in autism. Concurrently, even in typical language development, joint attention plays a far less determining role after the emergence of first words, and is probably not strongly related to the acquisition of complex morphosyntax (Akhtar & Gernsbacher, 2007; Tomasello, 2008; Tsimpli, 2013).
There is a cognitive ability that is less related to social functioning but critical to moving towards productive morphosyntax: the ability to detect and learn nonadjacent dependencies. Such dependencies may be schematized with an AXB structure, where the element in position B can only be predicted based on the element in position A and independently of the intervening material in position X. Nonadjacent dependencies are ubiquitous in natural languages, where they underpin, for instance, long-distance agreements. From a theoretical point of view, nonadjacent dependencies are thus foundational to the acquisition of syntax (Erickson & Thiessen, 2015). In the sentence “she moves,” a third-person marker -s is expected after intervening material (the verb) once a third-person singular subject pronoun (she) has been detected. In typical development, sensitivity to nonadjacent dependencies emerges early, usually soon after the first year of life (Gómez & Maye, 2005). To the best of our knowledge, however, and despite the clear link between morphosyntax and the ability to process nonadjacent dependencies, no study has investigated this potential predictor of language variability in autism.
Sensitivity to and learning of nonadjacent dependencies belongs to the broader domain of statistical learning, an umbrella term for cognitive mechanisms enabling individuals to detect both adjacent and non-adjacent regularities in their environment. There is growing interest in the question of whether statistical learning can contribute to atypical language development, including in autism (Arciuli & Conway, 2018). One relevant framework is the procedural deficit hypothesis, which posits that difficulties with rule-based aspects of language observed in various developmental conditions (such as developmental language delays and autism) may arise from neurological abnormalities in the frontal/basal ganglia and cerebellar circuits underlying procedural memory (Ullman & Pierpont, 2005). Although autism and developmental language disorders differ in several respects, some researchers speculate that autistic individuals with low language abilities may share structural impairments with individuals who have developmental language disorders (Boucher, 2012; Leyfer et al., 2008). However, while there is consistent evidence of deficits in statistical learning (of adjacent, transitional probabilities) in individuals with developmental language delays, there is no evidence for such a pattern of difficulties in autism. In fact, studies on statistical learning in autism do not generally report significant differences compared to typically developing groups (Foti et al., 2015; Haebig et al., 2017; Hu et al., 2023; Jones et al., 2018; Obeid et al., 2016).
That being said, the current approach to statistical learning in autism may be conceptually misguided. Most studies investigating statistical learning in autism include either children above 10 years of age or adults, with structural language abilities within typical ranges. While this sampling strategy makes practical sense, because older, verbally advanced autistic participants can more easily engage in tasks requiring sustained attention, it limits our understanding of the role that statistical learning can play in language acquisition in autism. The problem is that this line of research hypothesizes that statistical learning is necessary to acquire structural language in the first place and then tests autistic children who are verbal, with morpho-syntactic abilities within or close to the typical range, on sensitivity to statistical regularities. In a sense, then, these studies investigate an ability that is presupposed to be fully operational by participant inclusion criteria in the profiles being tested. By contrast, minimally verbal autistic children—a population that remains under-characterized but is critically important for understanding early language acquisition in autism—have been largely neglected in the literature on statistical learning.
To the best of our knowledge, only two studies have analyzed statistical learning in younger autistic children, between 3 and 8 years old. Jones et al. (2018) investigated visual statistical learning in preschoolers using a behavioral paradigm and only reported verbal IQ with no measure of structural language. Once again, this study only included children with verbal IQ and nonverbal IQ within the normal range, that is, children who most likely had no language delay. In parallel, Jeste et al. (2015) examined visual statistical learning in young children using Event-Related Potentials. Even though this study did include children with lower verbal abilities, as measured by the Clinical Evaluation of Language Fundamentals 4 (receptive language standard scores range: 50–180; expressive language standard scores range: 50–138), it primarily compared autistic with non-autistic individuals. The authors found significant correlations between visual statistical learning and both nonverbal cognitive abilities and social functioning, but no direct link with language proficiency or language profiles.
To truly understand the extent to which statistical learning—and, more specifically, non-adjacent dependencies—may contribute to the acquisition of advanced structural language, it is therefore essential to include autistic children with diverse linguistic profiles and to relate statistical learning to structural language abilities. The broader objectives of our exploratory study are twofold. Our first objective is theoretical. We aim to examine whether nonadjacent dependency learning is a key factor for the acquisition of structural language, specifically morphosyntax, in autistic children with different language profiles. In parallel, we aim to compare the predictive power of nonadjacent dependency learning to that of joint attention. For reasons explained above, we hypothesize that sensitivity to non-adjacent dependencies, but not socio-communicative abilities, will correlate with morphosyntactic skills in autistic children. Our second aim is methodological. We seek to examine the feasibility of a behavioral statistical learning task in measuring non-adjacent learning in a heterogeneous sample of autistic children, including minimally verbal profiles. Given the challenges of assessing minimally verbal children, we do not expect all of our participants to be able to carry out the task. Nonetheless, we are also interested in characterizing the profiles of those who will and those who will not be able to complete it, with non-verbal cognitive abilities, joint attention abilities, and language levels as exploratory predictors. As this methodological objective is exploratory, we do not have any clear-cut hypothesis on which factor will contribute to task completion. However, this aspect of our study should inform future research, shedding light on the extent to which behavioral research on statistical learning is a promising avenue for explaining language variability in autism.
Methods
Participants
Forty autistic (female = 7) and 40 non-autistic (female = 25) participants aged between 5 and 8 years were included in the study. The descriptives of the full sample are displayed in Table 1. Participants were recruited from social media, through traditional or special education schools, specialized daycare, and from our lab database. The main inclusion criterion for the autistic group was a formal clinical diagnosis of autism spectrum disorder from a multidisciplinary team specialized and officially licensed to diagnose autism by (MASKED). All participants were exposed to French as the primary language at home or had regular exposure to French in school or daycare since early childhood. Concurrently, the exclusion criteria for non-autistic participants were scores above the cutoff for the Social Communication Questionnaire (n = 0), a brief caregiver report screening of symptoms associated with autism, and the indication of co-occurring developmental language delays clearly independent of the autism diagnosis (n = 1).
Participant Information.
Note. Data are shown as M (SD), range. SES = socio-economic status; SCQ = Social Communication Questionnaire. The column “Autistic – task” shows data for autistic participants who completed the task and achieved more than 60% accuracy in their responses (n = 13). Number of tokens, Number of different words, and Number of different verb forms are only available for autistic participants.
Ethical approval was received for the study from (MASKED) in accordance with the Declaration of Helsinki. Participants’ parents signed a written consent for their children to be enrolled in this study after being informed of their rights and all aspects of the experimental design. When possible, participants gave oral or written consent to participate in the study. When direct consent was not possible, behavioral assent was obtained by observing participants’ comfort and willingness to engage with the tasks, ensuring respect for their nonverbal cues throughout the study.
Measures
Questionnaires
Parents completed the French version of the Social Communication Questionnaire, the lifetime version, which was used to rule out a strong suspicion of autism in the non-autistic group. For six non-autistic participants and six autistic participants, the SCQ scores were missing as the parents did not turn in the questionnaire at school. In addition, parents completed our lab questionnaire concerning language development (age of first words and first phrases), language input (languages used by caregivers and at school), media exposure (amount of screen exposure), and any known conditions (language delays, ADHD, dyslexia, dyspraxia, hearing impairment, vision impairment). Finally, parents also reported on sociodemographic characteristics on a questionnaire adapted and translated to French from the Family Affluence Scale (Torsheim et al., 2016). This measure serves as a proxy of socioeconomic status (SES) by capturing educational attainment on a 0-to-6 point scale (0 = no primary school achievement and 6 = doctoral degrees) and economic status on a 0-to-13 point scale based on indicators such as ownership of assets (car, dishwasher) (0 indicating very low economic status and 13 indicating very high economic status). Both scores were added to form a composite measure of SES. SES Scores are missing for seven autistic participants and three non-autistic participants as the parents did not turn in the questionnaire at school.
Psychometric measures
Core language abilities in French were assessed using the French version of the Clinical Evaluation of Language Fundamentals, 5th edition (CELF-5) (Wiig et al., 2019). Nonverbal intellectual quotient was assessed using the Leiter-3 (Roid et al., 2013). For a substantial number of autistic participants, core language abilities (n = 31) and nonverbal IQ (n = 7) could not be collected due to behavioral or language level challenges.
Naturalistic language sample
For autistic participants, spontaneous language samples were collected during a semi-structured play session with the experimenter. Data for three autistic children are missing because they showed reluctance to participate in the task. Participants completed the Eliciting Language Sample for Analysis (ELSA; Barokova et al., 2020), adapted to French. ELSA consists of eight different activities designed to elicit children's speech. Language samples were manually coded following the coding procedure reported in Maes et al. (2023). The number of different words produced during the ELSA was taken as a measure of lexical diversity (Butler et al., 2023). Other measures, such as the Type Token Ratio (Templin, 1957) or the Moving Average Type Token Ratio (Covington & McFall, 2010), are not appropriate for minimally verbal children, because they are sensitive to the length of the sample or require at least 100 recorded utterances. As for morphosyntax, the mean length of utterance in morphemes is traditionally used to assess syntactic complexity (Brown, 1973). As French has a rich cumulative affixal verbal morphology (Thordardottir, 2005), we decided to use the variation in verb flexions as a marker of syntactic complexity. French verbs are marked for tense, number, person, and mood, while English verbs typically mark only the third person, regular past tense, and the progressive aspect. Additionally, French adjectives and determiners agree in both gender and number with the noun they modify (Thordardottir, 2005).
Response to joint attention (RJA)
The index of RJA was obtained for all participants by relying on the ELSA administration data. RJA is traditionally measured using the Early Social Communication Scales (ESCS; Mundy et al., 2003), which involves the examiner pointing to four different targets in four distinct locations twice during administration to assess the child's response. However, in this study, we opted for an approach with more ecological validity to capture joint attention in a more naturalistic setting. This decision was informed by the findings of Roos et al. (2008), who demonstrated that RJA scores obtained through the ESCS and those measured during free play between an experimenter and the child were consistent. The second author designed a coding protocol based on guidelines for RJA scoring of naturalistic examiner-child play samples proposed by Roos et al. (2008). The procedure was then conducted in the ELAN software (ELAN, 2024) (see Supplementary Materials for procedure and interrater agreement).
Non-adjacent dependency measure
The task was presented on a Microsoft Surface 4 Tablet using EPrime 3.0 software. Reaction times were recorded with an RB-540 response pad. We used an artificial grammar task that taps participants’ sensitivity to nonadjacent dependencies and whose validity has been well attested in previous studies (Lammertink et al., 2020; Marimon et al., 2021; van Witteloostuijn et al., 2019, 2021). Participants engaged in a game-based task in which they had to help a monkey character gather bananas. To do so, they had to listen carefully to utterances composed of three words. They were told to press the green button as quickly as possible when one of the words was a specific target word (e.g., nuf) and to press a red button otherwise. There was a 1-s interval between each element of the utterance. Children had to press one of the buttons within 750 ms after the end of each utterance; if they did not do so, a null response was recorded and the next trial (utterance) began. In each utterance, the first and third elements were monosyllabic CVC pseudowords while the second element was a disyllabic CVCV pseudoword. There were three types of trials. Two types of trials comprised a nonadjacent dependency between the first and the third element: zir X bep or saf X nuf. The X element indicates the disyllabic pseudoword, which was drawn from a list of 24 different elements. As shown in Table 2, the target word of the experiment was nuf or bep depending on whether it was Version 1 or Version 2 of the task, respectively. Participants with an even or an odd ID number were respectively assigned to the bep or nuf version of the experiment. Nonadjacent dependency trials were further divided into two types according to which element ended the utterance: target trials, requiring a green press, and non-target trials, requiring a red press. The third type of trials were filler trials, which did not contain any nonadjacent dependency (no bep or nuf) and thus always required a red press.
Examples of Artificial Language for Versions 1 and 2.
The original task (Lammertink et al., 2019) was adapted so that the artificial language respected the phonotactic constraints of French. All the syllables shared the same frequency (0.01–0.03) according to the Lexique3 database (New et al., 2001). The FrenchPOND (the French version of the CLEARPOND online database) was used to control for the pseudowords’ phonological neighbors (Marian et al., 2012). The words from the artificial language were synthesized in the online IPA reader (http://ipa-reader.xyz/) with the “Celine” voice in French. A speech synthesizer was used to control stress, duration, and intonation. The experimental instructions were recorded by a female speaker who was unaware of the study goals and methods.
The task was composed of six training blocks, one disruption block, and one recovery block. Prior to the training blocks, participants completed six practice trials and had to respond correctly on at least four of the six in order to proceed to the training blocks. If they did not reach this criterion, the practice trials were repeated up to four times; this limit was set to give participants sufficient opportunity to understand the task. If participants failed to pass after four attempts, they were excluded from the analysis. During the practice trials, the experimenter provided cues to help improve task understanding. Each block contained 32 trials separated by short breaks in which participants were told how many bananas they had gathered (correct answers) and were asked whether they were ready to continue. An overview of the paradigm and the expected reaction time is provided in Figure 1. In the training phase, the third element of each trial, target or non-target, was determined by the first element. The trials of the disruption block contained the same number of target and non-target elements (in the third position), but they were not preceded, as in the training phase, by the predictive element in the first position, so their onset could not be anticipated. If participants were sensitive to the nonadjacent dependencies, we expected a learning effect, i.e., a gradual decrease in reaction time throughout the training phase, as well as a disruption effect, i.e., a significant increase in reaction time between the last training block and the disruption block. Finally, in the recovery block, the AXB structure of trials was restored to be identical to that of the training blocks. The training and recovery blocks contained 12 target trials (e.g., saf CVCV nuf), 12 non-target trials (e.g., zir CVCV bep), and 6 fillers (trials not ending with nuf or bep). The disruption block contained 12 target trials (e.g., CVC CVCV nuf in Version 1), 12 non-target trials (e.g., CVC CVCV bep in Version 1), and 6 filler trials.

Visual Representation of the Nonadjacent Dependency Task with the Expected Reaction Time Across Blocks.
Data Preparation and Preprocessing
Natural language sample
For the autistic participants, speech samples were extracted from video recordings obtained during the ELSA. These recordings were then manually coded in Praat (Boersma & Weenink, 2021) by a trained student in linguistics following the coding guide by Maes et al. (2023) for naturalistic language samples. In this coding procedure, the vocal productions of the child are first coded as “in” in the first tier if they correspond to articulated sounds. At this stage, all sounds produced by the child are included, except for breathing. Second, each of the child's productions identified as “in” in Tier 1 is further classified as belonging to one of three categories in a second tier: “linguistic,” “prelinguistic,” or “other.” Third, whenever the child uses a language other than French, an interval specifying the language is added in Tier 3. As we were only interested in linguistic productions, we used Tier 4 to transcribe orthographically only this category of productions (as specified in Tier 2 and Tier 3), for each child. Unintelligible words were coded as “xxx.” Complete utterances ended with “.” and questions with a question mark. Incomplete utterances were coded with a “+/…” at the end and interruptions with “+/.” Interjections, hesitation marks (euh/euhm), and yes and no instances were coded between parentheses. Summations (counting, colors, alphabet, singing) were coded between brackets. Partial word repetitions were transcribed as “re-repetition.” Transcriptions in Tier 4 were then extracted for each autistic participant and exported to spaCy. Incomplete utterances, interruptions, summations, and partial word repetitions were removed from the analysis. By contrast, echolalic repetitions and self-repetitions were included, first, because delayed echolalia is not always easy to detect and, second, because echolalic and self-repetitions in autism do not differ from more generative language productions in syntactic complexity (Maes et al., 2024).
Tokenization, lemmatization, and morpheme extraction were then performed in Python with the spaCy package (Honnibal & Montani, 2017), using both the French and English models because a fair proportion of children spontaneously produced English words and utterances. Lemmas were then imported into R and the number of unique words was calculated using the quanteda package (Benoit et al., 2018). Concurrently, morphological information was transferred to R and the number of unique verb morphological features (person, tense, and mood) was then computed for each affixal morpheme.
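The two counts described above can be illustrated with a minimal sketch. It assumes that tokens have already been annotated with a lemma, a part of speech, and a feature string in the "Name=Value|Name=Value" style; the `Token` record, the function names, the exclusion of "xxx" items, and the decision to count unique person/tense/mood combinations on verbs and auxiliaries are illustrative assumptions, not the exact procedure used in the study.

```python
from collections import namedtuple

# Hypothetical token record; the field names are illustrative only.
Token = namedtuple("Token", ["text", "lemma", "pos", "morph"])

def number_of_different_words(tokens):
    """Count unique lemmas, ignoring case and unintelligible 'xxx' items."""
    return len({t.lemma.lower() for t in tokens if t.text != "xxx"})

def number_of_different_verb_forms(tokens):
    """Count unique person/tense/mood combinations carried by verbs.

    Assumption: verbs and auxiliaries are both counted, and a form is
    identified by its (Person, Tense, Mood) triple.
    """
    wanted = ("Person", "Tense", "Mood")
    forms = set()
    for t in tokens:
        if t.pos not in ("VERB", "AUX"):
            continue
        feats = dict(f.split("=", 1) for f in t.morph.split("|") if "=" in f)
        key = tuple(feats.get(name) for name in wanted)
        if any(value is not None for value in key):
            forms.add(key)
    return len(forms)

# Hypothetical annotated sample (not data from the study).
sample = [
    Token("il", "il", "PRON", "Number=Sing|Person=3"),
    Token("mange", "manger", "VERB", "Mood=Ind|Number=Sing|Person=3|Tense=Pres"),
    Token("xxx", "xxx", "X", ""),
    Token("mangeait", "manger", "VERB", "Mood=Ind|Number=Sing|Person=3|Tense=Imp"),
    Token("mange", "manger", "VERB", "Mood=Ind|Number=Sing|Person=3|Tense=Pres"),
]

ndw = number_of_different_words(sample)            # lemmas: il, manger
n_verb_forms = number_of_different_verb_forms(sample)  # present and imperfect
```

In this toy sample the repeated "mange" adds neither a new word nor a new verb form, which is why both counts reflect diversity rather than sample length.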
Response to joint attention index
The index of response to joint attention was derived from an examiner-participant free play during the ELSA (Barokova et al., 2020). The index was calculated as the proportion of prompts to which the child successfully responded following the examiner's cue over the total number of prompts provided. During the coding process, both the examiner's initiations—whether gestural (e.g., pointing, showing) or verbal—and the child's corresponding responses—such as head turns or gaze following—were systematically annotated. A minimum of eight successful prompts (at proximal distance of the child) was required to calculate the RJA index, following the ESCS (Mundy et al., 2003). All participants met this criterion, so no one was excluded from the analysis. A detailed description of the coding procedure is available in the Supplementary Materials. The protocol was implemented in the video analysis software ELAN, using a template that contained seven different tiers. The first six tiers were:
The promptDir tier indicated whether the child's starting point of focus was different from the direction of the attention-capturing prompt (yes/no or unclear, e.g., if the child was showing their back).
The Gestural Prompt tier indicated whether the experimenter used a gestural prompt such as pointing or touching to attract the child's attention towards an entity.
The Verbal Prompt tier indicated whether the experimenter used a verbal prompt such as “look” or the child's name to prompt their attention.
The Gaze Shifting tier indicated whether the child exhibited a gaze shift towards the relevant referent following the experimenter prompt.
The Head Turning tier indicated whether the child exhibited a head turn towards the relevant referent following the experimenter prompt.
Only attempts that included either a verbal or a gestural prompt in a direction different from the one the child was already attending to were considered valid. For each of these attempts, the seventh tier, that is, the RJA tier, was coded based on the child's response to the prompt. If the child responded to the joint attention prompt by shifting their gaze or turning their head towards it, the response was coded as “yes.” If there was no response to the attentional prompt, it was coded as “no.” The final RJA index was calculated as a proportion score (0–1) based on the number of successful responses (coded as “yes” in the RJA tier) relative to the total number of valid prompts.
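The scoring rule can be summarized in a short sketch. The dictionary keys, the helper `make_prompt`, and the example prompts are hypothetical, and the minimum of eight prompts is applied here to valid prompts, which is one reading of the criterion described above rather than a confirmed detail of the protocol.

```python
def make_prompt(differs, gestural, verbal, gaze_shift, head_turn):
    """Build one coded prompt; keys loosely mirror the ELAN tiers (names assumed)."""
    return {
        "prompt_dir_differs": differs,
        "gestural": gestural,
        "verbal": verbal,
        "gaze_shift": gaze_shift,
        "head_turn": head_turn,
    }

def rja_index(prompts, min_prompts=8):
    """Proportion of valid prompts followed by a gaze shift or a head turn.

    A prompt is valid when it is verbal or gestural and points away from
    the child's current focus. Returns None if too few valid prompts exist.
    """
    valid = [
        p for p in prompts
        if p["prompt_dir_differs"] and (p["gestural"] or p["verbal"])
    ]
    if len(valid) < min_prompts:
        return None
    responded = sum(1 for p in valid if p["gaze_shift"] or p["head_turn"])
    return responded / len(valid)

# Hypothetical coding of one session (not data from the study).
prompts = (
    [make_prompt(True, True, False, True, False)] * 4     # gestural, gaze shift
    + [make_prompt(True, False, True, False, True)] * 2   # verbal, head turn
    + [make_prompt(True, True, True, False, False)] * 2   # valid, no response
    + [make_prompt(False, True, False, True, False)] * 2  # invalid: same direction
)

index = rja_index(prompts)  # 6 responses out of 8 valid prompts
```

The two prompts given in the direction the child was already attending to are dropped before the proportion is computed, so they neither raise nor lower the index.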
Nonadjacent dependency task
For the nonadjacent dependency task, only those participants who completed the task (Training blocks and Disruption block, autistic = 15, nonautistic = 37) were included in the analysis. We recorded accuracy and reaction time in milliseconds for each trial throughout the blocks. Reaction times were measured from the onset of the third word. Reaction time could thus be negative if the child anticipated the third element before its onset. As in Lammertink et al. (2019), we excluded participants with an accuracy below 60% (autistic = 2). Furthermore, incorrect responses were removed from the analysis. This resulted in a sample of 13 autistic (female = 3) children and 37 (female = 24) non-autistic children (see Table 1 for the description of participants who completed the task). We also excluded trials with reaction times more than two standard deviations below or above each participant's mean reaction time (551 trials, 5.26% of trials).
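A minimal sketch of these exclusion steps is given below. The function name, the dictionary keys, and the example data are hypothetical. The sketch also assumes that the participant mean and standard deviation are computed on correct trials only, after incorrect responses have been removed; the exact order of operations is not specified above.

```python
from statistics import mean, stdev

def preprocess(trials, min_accuracy=0.60, n_sd=2.0):
    """Apply participant- and trial-level exclusions to reaction time data.

    trials: list of dicts with keys 'participant', 'correct' (bool), 'rt' (ms).
    """
    by_participant = {}
    for t in trials:
        by_participant.setdefault(t["participant"], []).append(t)

    kept = []
    for rows in by_participant.values():
        accuracy = sum(1 for r in rows if r["correct"]) / len(rows)
        if accuracy < min_accuracy:
            continue  # participant excluded for low accuracy
        correct = [r for r in rows if r["correct"]]
        if len(correct) < 2:
            continue  # a standard deviation needs at least two trials
        rts = [r["rt"] for r in correct]
        m, sd = mean(rts), stdev(rts)
        # Keep trials within n_sd standard deviations of the participant mean.
        kept.extend(r for r in correct if abs(r["rt"] - m) <= n_sd * sd)
    return kept

# Hypothetical data: participant A has one extreme trial, B is at 50% accuracy.
rts_a = [300, 310, 290, 305, 295, 300, 310, 290, 300, 2000]
trials = [{"participant": "A", "correct": True, "rt": rt} for rt in rts_a]
trials += [{"participant": "B", "correct": i < 5, "rt": 300 + i} for i in range(10)]

clean = preprocess(trials)
```

Because the cutoff is computed per participant, a child who is slow overall is not penalized relative to faster children; only trials that are atypical for that child are dropped.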
Data Analytical Plan
All analyses were implemented in R (R Core Team and contributors worldwide), using the psycho, lme4 and emmeans packages. Responses to joint attention abilities were compared between the autistic and non-autistic samples using Welch two-sample t-tests.
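For reference, the Welch statistic and its approximate degrees of freedom can be computed as in the sketch below. The function name and the example scores are hypothetical and are not data from this study; the sketch does not compute a p-value, which would require a t-distribution routine.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, the two groups are not assumed to share
    a common variance.
    """
    nx, ny = len(x), len(y)
    vx = variance(x) / nx  # squared standard error of the first mean
    vy = variance(y) / ny  # squared standard error of the second mean
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Hypothetical RJA proportion scores for two small groups.
group_a = [0.5, 0.6, 0.7, 0.8]
group_b = [0.8, 0.9, 1.0]

t_value, df_value = welch_t(group_a, group_b)
```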
To examine whether joint attention abilities were correlated with lexical and morphosyntactic development, we used simple linear regressions. The dependent variables were the number of different words or different verb forms as the lexical and morphosyntactic development indexes, respectively. In both cases, the independent variable was RJA, corresponding to the RJA proportion scores. As the model predicted a significant relationship with RJA, we incrementally controlled for some potential confounding variables. First, we controlled for the total number of words produced (tokens), since children who produce more words might naturally produce a larger number of unique words. Second, we added the chronological age variable to the model. Finally, we added the nonverbal IQ variable, as nonverbal IQ is known to be the most robust predictor of language outcome in autism and also influences interactional skills (Anderson et al., 2007; Ellis Weismer & Kover, 2015; Pickles et al., 2014; Thurm et al., 2007, 2015; Wodka et al., 2013). By adding each control variable incrementally, we systematically tested the robustness of the effect of joint attention (RJA) on lexical and morphosyntactic development, ensuring that this effect was not confounded by overall volubility, age, or nonverbal IQ.
We then compared the performance of autistic and non-autistic children on the nonadjacent dependency task. We used linear mixed-effects models with by-participant random intercepts. We performed two sets of analyses, using a forward stepwise model comparison. The baseline model always included the control variables (nonverbal IQ and age) and a random intercept per participant. Subsequent models were expanded by adding one predictor variable at a time. The selection of each model was guided by model fit indices, including AIC (Akaike information criterion) and BIC (Bayesian information criterion), as well as the significance of predictors. First, for the disruption peak analysis, we assessed the learning of nonadjacent dependency rules by identifying the presence of a disruption peak, marked by a significant increase in reaction time between the last Training Block and the Disruption Block. For this analysis, we included all trial types (Target, Non-Target, and Fillers), as the absence of the learned regularity in the Disruption block may affect reaction times in general, not just in Target trials. Second, we performed the learning rate analysis to see whether autistic and non-autistic children learned the dependency at a different pace. To do this, we examined the reaction time slope after collapsing all trials across the training blocks into a single continuous Trial variable.
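The model comparison logic described above (fit indices plus a test of the added predictor) can be illustrated with the short Python sketch below. It is not the authors' R code. The log-likelihood values are hypothetical, and the likelihood-ratio test is restricted to the case where the two nested models differ by exactly one parameter, for which the chi-square p value has a closed form.

```python
from math import erfc, log, sqrt


def aic(log_likelihood, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion (lower is better)."""
    return n_params * log(n_obs) - 2 * log_likelihood


def lrt_one_df(loglik_reduced, loglik_full):
    """Likelihood-ratio test for nested models differing by one parameter.

    Returns (chi2, p). For one degree of freedom, the chi-square survival
    function equals erfc(sqrt(chi2 / 2)).
    """
    chi2 = max(2 * (loglik_full - loglik_reduced), 0.0)
    return chi2, erfc(sqrt(chi2 / 2))


# Hypothetical log-likelihoods: baseline model vs. baseline + one predictor.
chi2, p_value = lrt_one_df(loglik_reduced=-1003.575, loglik_full=-1000.0)
```

With these illustrative values the statistic is 7.15 and the p value is about .0075, so the added predictor would be retained. In a forward stepwise procedure the same comparison is repeated each time a predictor is added.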
To examine whether sensitivity to nonadjacent dependencies was correlated with the morphosyntactic development of autistic children, we extracted the slope coefficient from each autistic participant's performance in the Training Blocks. We chose to use slopes from the training blocks rather than the individual disruption peaks because the slopes better reflect participants’ efficiency in picking up dependencies. In previous research using this task, the disruption peak has been correlated with language measures; however, we avoided this approach as autistic individuals often show resistance to change and reduced mental flexibility (Lage et al., 2024), which could potentially bias the disruption peak's magnitude. Nonverbal IQ and age were controlled for.
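Extracting one slope per participant amounts to fitting a separate least-squares line of reaction time on trial number for each child. The Python sketch below illustrates the idea. The data layout, participant labels, and values are hypothetical, and the authors' R implementation may differ (for instance, slopes could also be derived from a mixed model).

```python
from statistics import mean


def ols_slope(trials, rts):
    """Least-squares slope of reaction time regressed on trial number."""
    mx, my = mean(trials), mean(rts)
    sxy = sum((x - mx) * (y - my) for x, y in zip(trials, rts))
    sxx = sum((x - mx) ** 2 for x in trials)
    return sxy / sxx


# Hypothetical training data: participant id -> (trial numbers, RTs in ms).
training = {
    "P01": ([1, 2, 3, 4], [10, 8, 6, 4]),      # speeds up: negative slope
    "P02": ([1, 2, 3, 4], [300, 300, 300, 300]),  # flat
}

slopes = {pid: ols_slope(t, rt) for pid, (t, rt) in training.items()}
```

A negative slope means the child responded faster as training progressed. These per-child values are then used as the predictor in the regression on the number of different verb forms.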
The objective of the last batch of analyses was to look closer at what kind of profiles of autistic children can be assessed by means of a statistical learning behavioral paradigm. To examine which factors predicted task completion among autistic participants, we conducted a series of separate binomial logistic regressions. Task completion was coded as 1 for participants who completed the task and 0 for those who did not complete or abandoned it. Each regression included one predictor: age in months, non-verbal IQ (Leiter scale), response to joint attention (RJA) index, SCQ score, lexical level (number of different words), and morphosyntactic level (number of different verb forms). For each predictor found to be significant, we added non-verbal IQ as a covariate to test the robustness of the effect.
Results
Response to Joint Attention
As seen in Figure 2, response to joint attention was significantly lower in the autistic group (M = 0.79, SD = 0.12) compared to the non-autistic group (M = 0.91, SD = 0.06), as indicated by a Welch two-sample t-test [t(57.82) = −5.35, p < .001].

Figure 2. Mean Response to Joint Attention in Autistic and Non-Autistic Children.
The number of different words produced during the ELSA was taken as a proxy for expressive vocabulary. Initial analyses indicated that RJA was a significant predictor of vocabulary (β = 324.64, SE = 57.67, p < .001). This effect remained significant after controlling for the total number of words produced (β = 103.54, SE = 25.11, p < .001) and age (β = 82.69, SE = 25.74, p = .003). However, once non-verbal IQ was included as a control variable, the effect of RJA on vocabulary was no longer significant (p > .05).
The number of different verb forms produced during the ELSA was taken as a proxy for morphosyntactic development. Initial analyses indicated that RJA was a significant predictor of morphosyntactic development (β = 23.24, SE = 6.20, p < .001). This effect remained significant after controlling for the total number of words produced (β = 9.69, SE = 3.24, p = .005) and age (β = 7.61, SE = 3.10, p = .02). However, once non-verbal IQ was included as a control variable, the effect of RJA on morphosyntactic development was no longer significant (p > .05).
Taken together, these results suggest that non-verbal IQ may account for a substantial portion of the variance in language abilities (lexical and morphosyntactic development) that initially appeared to be related to joint attention skills, corroborating previous findings (Anderson et al., 2007; Kissine et al., 2023; Pickles et al., 2014; Wodka et al., 2013).
Nonadjacent Dependency Task
Do Autistic Participants Differ from non-Autistic Participants in Nonadjacent Dependency Learning?
For the disruption peak, the inclusion of the Group variable or its interaction with the Block variable did not improve the model fit for target and non-target trials. The best fitting model included only the Block variable [Target: χ²(1) = 7.15, p = .007; Non-Target: χ²(1) = 6.59, p = .01]. For filler trials, no predictor (Block, Group, or their interaction) improved the model fit; see Supplementary Materials (S1) for model fit comparisons.
As can be seen in Figure 3, reaction times were faster in the last training block than in the disruption block [Target: β = −50.96, SE = 18.61, p = .009; Non-Target: β = −55.47, SE = 21.07, p = .012]. This suggests that a disruption of the underlying rules resulted in an increase in reaction time for both types of trials.

Figure 3. Fitted Reaction Times with Confidence Intervals During the Last Training Block and the Disruption Block.
For the learning rate, the inclusion of the interaction between the Trial and Group significantly improved the model fit [χ²(1) = 17.08, p < .001] compared to the model with only the fixed effect (see Supplementary Materials for model fit comparison).
As can be seen in Figure 4, the fitted reaction time slope was flat for the autistic group (−0.734, 95% CI [−1.6; 0.13]) and positive for the non-autistic group (1.263, 95% CI [0.87; 1.65]), indicating that neither autistic nor non-autistic participants showed increasing anticipation of the target stimuli across trials.

Figure 4. Fitted Reaction Times to Target Trials (Shaded Bands Represent 95% Confidence Intervals) Across all Training Trials, in Autistic and Non-Autistic Participants.
Is There a Link Between Syntactical Performance and Sensitivity to Nonadjacent Dependencies?
The simple linear regression model revealed no significant relation between individual slopes during training and the number of verb forms produced during the ELSA (p > .05).
What Predicts the Completion of the Nonadjacent Dependency Task in Autistic Children?
Among the predictors tested, non-verbal IQ, RJA, vocabulary, and morphosyntactic development were found to be significant predictors of task completion, as can be seen in Figure 5. The logistic regression model for nonverbal IQ revealed a significant positive relationship (β = 0.108, SE = 0.040, p = .007), indicating that higher nonverbal IQ scores were associated with a greater likelihood of completing the task.

Figure 5. Significant Predictors of Task Completion with Representation of Nonverbal IQ.
The logistic regression model for RJA revealed a significant positive relationship (β = 14.156, SE = 5.09, p = .005), but this effect was no longer significant once nonverbal IQ was controlled for.
The logistic regression model for vocabulary also revealed a significant positive relationship (β = 0.02, SE = 0.009, p = .002) but, again, this effect was no longer significant when nonverbal IQ was added as a control variable.
The logistic regression model for morphosyntactic development revealed a significant positive relationship (β = 0.39, SE = 0.13, p = .003), but this effect was no longer significant when nonverbal IQ was added as a control variable.
In summary, while response to joint attention (RJA), vocabulary, and morphosyntactic development each showed a significant positive association with task completion initially, these relationships were no longer significant when non-verbal IQ was included as a control. Non-verbal IQ emerged as the most robust predictor, suggesting that higher cognitive abilities play a fundamental role in supporting task engagement and completion among participants.
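To help interpret the logistic regression coefficients reported in this section, the sketch below converts a coefficient and its standard error into a Wald z statistic, a two-sided p value, and an odds ratio. It uses the reported nonverbal IQ estimate (β = 0.108, SE = 0.040). It is written in Python, and it assumes that the reported p values are Wald z tests, which the source does not state explicitly.

```python
from math import erfc, exp, sqrt


def wald_summary(beta, se):
    """Return (z, two-sided p, odds ratio) for a logistic regression term."""
    z = beta / se
    # Two-sided p value from the standard normal distribution.
    p = erfc(abs(z) / sqrt(2))
    # exp(beta) is the multiplicative change in odds per one-unit increase.
    return z, p, exp(beta)


# Reported estimate for nonverbal IQ as a predictor of task completion.
z, p, odds_ratio = wald_summary(beta=0.108, se=0.040)
```

Under that assumption, the resulting p value rounds to the reported .007, and the odds ratio of roughly 1.11 means that each additional nonverbal IQ point multiplies the odds of completing the task by about 1.11.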
Discussion
This study had two main objectives. The first was to examine the role of nonadjacent dependency learning in the development of language skills in autistic children with linguistic profiles representative of the heterogeneity that characterizes the autism spectrum at that age, viz. ranging from fully productive to non-speaking children. The second objective was to assess the methodological feasibility of behavioral tasks to investigate statistical learning in minimally verbal children. Statistical learning, which encompasses both nonadjacent and adjacent dependency learning, has been proposed to elucidate atypical trajectories of language acquisition, including language delays and difficulties in the autistic population (Arciuli & Conway, 2018). To our knowledge, nonadjacent dependency learning has never been studied in autism, despite its critical importance for the development of syntax. Nonverbal and minimally verbal individuals typically produce isolated words or simple word combinations, but they struggle to acquire productive morphosyntax, which requires sensitivity to nonadjacent dependencies. By comparing nonadjacent dependency learning with joint attention skills, a predictor very often studied to delineate language profile in autism, we aimed to better understand whether statistical learning mechanisms might serve as a foundation for language acquisition in autism, and potentially compensate for lower socio-communicative abilities.
Our results showed that response to joint attention was significantly lower in the autistic than in the non-autistic group. It is however worth noting that the scores of the autistic group were quite high (0.79) as compared to previous studies comparing non-autistic and autistic participants (i.e., 0.46 in Ellis Weismer & Kover, 2015; 0.31 in Luyster et al., 2008). This discrepancy is likely attributable to differences in the age of the participants and the context in which joint attention was evaluated, a point to which we return below. Furthermore, while there was a correlation between joint attention and vocabulary (as measured by the number of different words used by children during the ELSA; Barokova et al., 2020), it disappeared once nonverbal IQ was controlled for. This finding replicates previous studies showing that the relationship between joint attention abilities and expressive language weakens when IQ is taken into account (Anderson et al., 2007; Ellis Weismer & Kover, 2015; Kissine et al., 2023; Pickles et al., 2014; Thurm et al., 2007, 2015; Wodka et al., 2013). A similar pattern emerged for syntactic complexity (measured by the number of different verb forms produced by children during the ELSA) where no correlation remained significant after controlling for nonverbal IQ.
Another possible explanation for the lack of correlation between joint attention abilities and language observed in our sample is the way we assessed response to joint attention. Traditionally, joint attention skills are measured using the Early Social Communication Scales (ESCS; Mundy et al., 2003) which includes a specific subtest for measuring response to joint attention. A limitation of the ESCS is its reliance on a single subtest, during which the experimenter points to four specific locations within a room (e.g., posters on the wall) twice, which creates a strictly controlled and non-naturalistic setting that lacks ecological validity. Albeit easy to implement, this method operationalizes response to joint attention as a score obtained based on eight possible instances of head turn behavior in response to the experimenter's pointing behavior, which may not do full justice to the nuanced nature of joint attention. In real life, the latter includes the ability to share focus with an interlocutor in relation to a third entity that is shown, to track and engage with such an entity, and to sustain this attentional dynamic over time within socially shared contexts. The ESCS does not account for these additional dimensions of joint attention, potentially underestimating the child's true capacity to engage in joint attention behaviors in more naturalistic settings. To address this, we adopted a more ecologically valid approach based on guidelines from Roos et al. (2008): we assessed response to joint attention during dynamic free play activities during which the experimenter used a range of prompts to capture the child's attention as part of 15-to-25-min-long naturalistic interactions. This approach provided children with a greater number of opportunities to respond to joint attention prompts, offering a more comprehensive and naturalistic view of their joint attention abilities. 
It also allowed children to engage in socially meaningful interactions rather than responding to isolated cues, potentially capturing a more dimensional picture of their joint attention skills compared to previous studies. However, such a procedure does not come without cost. Coding RJA “in the wild” is more challenging than within the ESCS, where verbal and gestural prompts are strictly defined and standardized. In our setting, it was sometimes difficult to determine whether a verbal prompt was sufficiently explicit or referential, depending on the context. Furthermore, not all gestural prompts (e.g., pointing vs. showing) and verbal prompts (e.g., “look” vs. “I put it here”) are equally salient or directive, which introduces variability in children's opportunities to respond. Nevertheless, since coding was applied consistently across all participants, this variability should not have systematically biased RJA scores, although it may have introduced additional noise into the data.
To investigate whether sensitivity to non-adjacent dependencies could help explain language variability in autism, participants were exposed to AXB structures, with learning assessed by their ability to anticipate element B following element A. Our analysis first examined the disruption peak, that is, an increase in reaction time when the A element was removed, making B unpredictable. Results showed no group differences in the disruption peak, indicating that both autistic and non-autistic children learned the long-distance dependencies and were disrupted by the removal of these dependencies in the disruption block. This finding extends previous research showing intact statistical learning abilities in autism (Foti et al., 2015; Obeid et al., 2016). Surprisingly, our analysis of learning rates revealed a flat or even increasing reaction time curve rather than the expected decrease across training blocks. If implicit learning had occurred, we would expect a reduction in reaction time, as children would become increasingly able to anticipate element B following element A as a result of prolonged exposure to this pattern. Instead, autistic children showed a flat slope, while non-autistic children exhibited a significant increase in reaction time over trials. This pattern may suggest that the task was insufficiently engaging, leading to a decreasing motivation to respond quickly. However, as both groups were disrupted by the absence of the AXB rule, we can conclude with confidence that both groups internalized the rule and were sensitive to nonadjacent dependencies.
Turning to the link between sensitivity to nonadjacent dependencies and morphosyntactic development in autistic children, our findings showed no significant correlation between individual slopes in the training blocks and the number of different verb forms used. This result contrasts with findings from previous literature, which highlights a correlation between statistical learning and language ability in typically developing children (e.g., Kidd, 2012). This null result may stem from the small sample size of autistic participants who managed to complete the task (n = 13), highlighting a key limitation of the present study.
The second aim of this exploratory study was methodological. Most previous research on statistical learning and autism has focused on children with language and cognitive abilities within the normal range (except for Jeste et al., 2015). This focus limits our understanding of the role of statistical learning in language acquisition in autism across the board, considering the extremely broad range of language abilities found across the autism spectrum. Given the hypothesis that statistical learning could help explain language variability in autism, we sought to determine whether, and under which circumstances, it is realistically feasible to assess statistical learning abilities across linguistically heterogeneous profiles, including those of minimally verbal children, who are typically challenging to assess with behavioral paradigms. To address these experimental challenges, we examined which factors could predict task completion in our sample of autistic children. Our analysis revealed that among joint attention, expressive language, socio-communicative scores, and nonverbal IQ, the latter superseded the others in its potential to predict task completion.
This latter finding suggests that it may not be feasible to include all language profiles in experiments relying on behavioral tasks such as the one used in this study. It is however important to relay these findings to the scientific community, as it is critical to report both what works and what does not in experimental paradigms, particularly when working with populations that present additional challenges. The completion of the task required the comprehension of explicit instructions (pressing the green and red buttons according to the target word) and sustained attention, as the task duration was long (12 min). Since nonverbal IQ was the strongest predictor of statistical learning task completion and since many minimally verbal children have IQs below the normative range or cannot sustain IQ assessments in the first place (Kasari et al., 2013), the involvement of these profiles in tasks such as the one used here may fail to be productive. An alternative and possibly useful approach to studying statistical learning in autistic minimally verbal children, as explored by Jeste et al. (2015), would involve using EEG to assess statistical learning without requiring overt responses or explicit instructions. However, EEG paradigms face their own challenges, as they rely on children's cooperation and demand extended exposure to statistical regularities, which may be difficult to sustain for these participants (see Tager-Flusberg et al., 2017, for a discussion and guidelines on assessing minimally verbal autistic children). Research on statistical learning across autism should, therefore, shift towards more ecologically valid paradigms.
Recent studies on reading (e.g., Brice et al., 2022; Siegelman et al., 2020) have demonstrated the value of naturalistic approaches in evaluating statistical learning. Rather than employing independent artificial tasks, these studies examined how children and second-language learners spontaneously rely on statistical patterns in their reading activities. This type of approach has the advantage of not forcing participants to interact with artificially constructed and presented stimuli. Instead, it leverages the rich array of patterns that are inherent in the observable behavior of language users. Following the example of such studies on statistical learning in reading, research on structural language acquisition should also adopt naturalistic approaches when designing experimental methods. Within the field of autism, this effort would increase the suitability of research on SL, particularly for minimally verbal or nonverbal children, as it would not require them to sit still, follow explicit instructions, or engage in traditional learning tasks.
Conclusion
This study explored the roles of nonadjacent dependency learning and joint attention in language development in autistic children, alongside the feasibility of using behavioral paradigms to study statistical learning in minimally verbal populations. While both autistic and non-autistic children exhibited sensitivity to nonadjacent dependencies, neither sensitivity to nonadjacent dependencies nor response to joint attention was associated with expressive vocabulary or morphosyntactic skills in autistic children. Response to joint attention scores were lower in autistic children than in non-autistic children but higher than in previous research using standardized assessment of joint attention, likely due to the use of a more ecological approach in this study. Methodologically, our findings underscore the limitations of behavioral tasks for assessing statistical learning in minimally verbal children. Future research should adopt inclusive and ecologically valid paradigms to better understand how statistical learning contributes to language acquisition in autism.
Supplemental Material
sj-docx-1-dli-10.1177_23969415251347878 - Supplemental material for To What Extent Can Statistical Learning Explain Language Profiles in Autism? Methodological and Theoretical Challenges
Supplemental material, sj-docx-1-dli-10.1177_23969415251347878 for To What Extent Can Statistical Learning Explain Language Profiles in Autism? Methodological and Theoretical Challenges by Charlotte Dumont, Emma Peri, Arnaud Destrebecqz and Mikhail Kissine in Autism & Developmental Language Impairments
Footnotes
Acknowledgements
We sincerely thank all the children, parents, and schools who participated and collaborated with us. We extend special gratitude to Marie Belenger and Lena Petrocelli for their invaluable assistance in testing the participants.
Ethical Considerations
Ethical approval was received for the study from the Erasme-ULB ethics committee in accordance with the Declaration of Helsinki. Participants’ parents signed a written consent for their children to be enrolled in this study after being informed of their rights and all aspects of the experimental design. When possible, participants gave oral or written consent to participate in the study. When direct consent was not possible, behavioral assent was obtained by observing participants’ comfort and willingness to engage with the tasks, ensuring respect for their nonverbal cues throughout the study.
Author Contributions
Charlotte Dumont conceived the study, adapted the task, and led data collection. Emma Peri led the coding procedure for the response to joint attention and coded all videos. Charlotte Dumont conceived and performed the analyses under the supervision of Mikhail Kissine and Arnaud Destrebecqz. Charlotte Dumont led the writing of the manuscript, with critical revisions by Mikhail Kissine, Arnaud Destrebecqz, and Emma Peri. Mikhail Kissine and Arnaud Destrebecqz secured funding.
Funding
The project was supported by a Research Project grant 40003675 from the F.R.S.-FNRS.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
De-identified data and R scripts are available from the corresponding author on request.
References