Abstract
Childhood language interventions appear promising for improving children’s lives and yielding economic returns. However, few studies have evaluated long-term effects of these interventions. Our study did this using a large, cluster-randomized trial of a preschool intervention for Norwegian children aged 4 to 5 years whose vocabulary was more limited than that of their peers. Results showed that effects on expressive language were maintained at the 7-month follow-up when the children were in first grade and that those with the weakest language skills initially had the largest and most persistent effects. However, 4 years after the intervention, the differences between the intervention and control groups were negligible. Thus, although effects from the preschool language intervention lasted into the first year of elementary school, effects eventually faded and were completely absent in fourth grade. Our findings suggest the need for a sustained approach to language and literacy support, focusing on persistent interventions and high-quality adapted instruction.
Introduction
Childhood language interventions hold great promise to improve the lives of children and contribute economic and social benefits to society (Heckman et al., 2013). Oral language is an important target for early intervention because it is a critical skill not only for play and social interaction but also, later in life, for reading skills and academic achievement.
Early-language interventions typically target a broad set of skills, such as vocabulary, grammar, and narrative (Rogde et al., 2019). Many studies have shown that language can be improved in the short term (e.g., Donolato et al., 2023; Rogde et al., 2019). However, it remains unclear whether and why these effects fade in the long run (Bailey et al., 2017). Previous studies have generally not followed participants for more than 6 to 8 months (e.g., Rogde et al., 2019). In this study, we assessed children 4 years after they participated in a large-scale randomized trial of a 30-week early-language intervention that had large effects on expressive language immediately after the training (Hagen et al., 2017). Thus, we aimed to examine this intervention’s longer-term effects on language and reading comprehension.
Postintervention scenarios
Once an intervention’s effects are established, these effects may be maintained over time (sustained effects) or fade over time (fade-out effects). The primary argument for the persistence of early-language intervention effects is that at an early age the brain is malleable and particularly open to environmental influences (Fox et al., 2010; Shonkoff, 2010). Thus, early stimulation through activities might leverage brain plasticity, creating a strong foundation for language learning and enhancing future skill development. The educational environment also has an impact on effect maintenance; in preschool language interventions, effects are more likely to last if the subsequent educational environment is of high quality. A 3- to 5-year follow-up study of a 20-week combined literacy and language randomized intervention in preschool supports this assumption (Gensowski et al., 2024), showing lasting effects on early reading skills (
There are also reasons why effects of language interventions may fade. One is the increasing influence of genetics on the pattern and pace of language development as children grow up (Andreola et al., 2021). This suggests that genetic factors can eventually override intervention effects, contributing to fade-out (Bornstein et al., 2014). Children’s language skills are influenced both by genetics (Andreola et al., 2021) and by home and school environments (Anderson et al., 2021; Mol & Bus, 2011). Children from disadvantaged backgrounds, who may be vulnerable to language difficulties, often experience less supportive home and educational environments (Brito, 2017; Tucker-Drob, 2012). Thus, because both the home environment and genetics are powerful factors in children’s language development, it is not a given that changing the children’s environment for typically 3 to 4 hours per week over a limited period will lead to lasting effects after the intervention. This was demonstrated in a recent meta-analysis (Hart et al., 2024), showing that 4 years after the intervention, the mean effect size for cognitive interventions was considerably reduced and nonsignificant (
Another argument for fade-out is that language interventions typically affect only specific aspects of language skills, such as expressive ability or narrative use, rather than the broader underlying (latent) language construct, which encompasses a range of skills related to comprehension (Hjetland et al., 2019; Melby-Lervåg et al., 2019). Studies using latent variables (e.g., Melby-Lervåg et al., 2019) have shown that improvements often appear only in expressive-language skills, not in receptive skills (i.e., the ability to understand language).
A meta-analysis of oral-language interventions (Rogde et al., 2019) found small effects on standardized measures of language immediately after the interventions (
The study
In this study, we screened 860 preschool children; those with lower vocabulary skills were randomly assigned at the classroom level to either a language-comprehension program (

Immediate effects of the intervention on expressive-language skills. The model shows the effect of the intervention on expressive-language skills in the grade 1 posttest. Standardized coefficients with 95% confidence intervals are shown except for the intervention dummy variable, where
In this study, we aimed to investigate how the large effect noticed immediately after training developed over time. The following research questions were asked:
What are the long-term effects of the preschool language intervention on expressive language measured in first and fourth grades? Are these effects moderated by sex and/or differences in initial language skills?
What are the effects of the preschool language intervention on fourth grade reading comprehension?
Method
Participants
All the children in one cohort enrolled in preschools across two municipalities in Norway were invited to participate in this study. The parents of the children in these municipalities had different socioeconomic backgrounds. As Norway has no dedicated kindergarten year, children attend preschool until they turn six, after which they start first grade. The details concerning recruitment, allocation, and participant flow during the study are shown in Figure 2 and are based on the Consolidated Standards of Reporting Trials (CONSORT) guidelines (Schulz et al., 2010). Ethical approval was obtained from the Norwegian Social Science Data Services; parental consent for the 860 participating children was also obtained.

Flowchart of the randomized trial.
The screening process involved a measure comprising 29 items from the British Picture Vocabulary Scale–II (BPVS-II; Dunn et al., 1997) and 12 items from the picture-naming subtest of the Wechsler Preschool and Primary Scale of Intelligence–III (WPPSI-III; Wechsler, 2003), with a reliability of .67 (Cronbach’s alpha). The children whose scores on the vocabulary screening measure were in the lowest 35% (
The intervention
The preschool intervention was a 30-week program divided into five blocks of 6 to 7 weeks each, with 2-week breaks between blocks. Teachers were trained both before the intervention and halfway through it and received a detailed, scripted manual that described the activities and procedures and included materials. For the theory of change for the intervention, see the Supplemental Material available online (Fig. S1).
The intervention consisted of two main components: dialogic reading based on age-appropriate short stories and more explicit language-comprehension training. In the first component, dialogic reading, the teacher read a story aloud and then discussed it with the children. Each story included three to four focus words per week. These were tier-2 words that children might not typically encounter in oral language (Biemiller, 2010).
The second component, explicit language instruction, involved direct instruction targeting vocabulary, grammar, and narrative skills. This instruction included various exercises centered on specific topics (e.g., travel, food, emotions, and animals). Based on these topics, there were listening activities, grammar exercises, word and concept classification, and story structuring and sequencing. Some tasks were created specifically for the intervention, whereas others were adapted from educational materials produced by commercial companies.
Data collection
The data for the preschool (pretest) measures and first-grade measures are described in Hagen et al. (2017). Below, we briefly describe the measures used in this study (see Hagen et al., 2017, for more details about the data-collection procedures). The fourth-grade data were collected for this study through individual testing of students by trained research assistants (master’s-level students in special-needs education) during the spring semester. Because the measures used in preschool and first grade were no longer age appropriate in fourth grade, we selected equivalent measures suitable for fourth graders. All students were evaluated with three tests in the following order: Neale Analysis of Reading Ability (NARA) for reading comprehension, NARA for listening comprehension, and the word-definition subtest of the Wechsler Intelligence Scale for Children–V (WISC-V).
Measures
Preschool and first-grade measures
Vocabulary
Vocabulary was measured using items from the vocabulary-definition subtest of WPPSI-IV (Wechsler, 2014) and WPPSI-III (Wechsler, 2003). In this test, the child is asked to define verbally presented words of increasing difficulty. Cronbach’s alpha values were .70 in preschool and .86 in first grade.
Listening comprehension
Listening comprehension was assessed using the Listen, Understand, Remember, and Infer (LURI) test (Hagen et al., 2022), in which the child is read short stories and asked to answer questions about each one afterward. The pretest had 10 stories, each followed by three to five questions, totaling 36 items. For the posttests, an additional story and six more challenging questions were added to prevent a ceiling effect, which increased the number of items for both posttests to 42. Cronbach’s alpha values were .83 in preschool and .79 in first grade.
Narrative skills
Narrative skills were measured using the Renfrew Bus Story Test (Renfrew, 1997). In this test, the child is told a story while looking at pictures, followed by instructions to retell the story. The child’s retelling is transcribed verbatim and assigned a score on the basis of keywords and story structure. The scoring system of the Norwegian version was translated from English to Norwegian for research purposes by scholars at the Department of Special Needs Education, University of Oslo. Cronbach’s alpha values were .77 in preschool and .83 in first grade.
Fourth-grade measures
Reading comprehension in fourth grade
Reading comprehension was assessed using the Norwegian translation of NARA-II (Neale, 1997). In this test, children are asked to read stories of increasing length and complexity, followed by questions about the stories’ content. Testing is carried out until a given number of incorrect readings or wrong answers is reached. The reading-comprehension score used in the analysis is based on the number of correct answers. Cronbach’s alpha was .80.
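Because this test later serves as the single indicator of a latent reading-comprehension variable, its error variance is fixed from the alpha estimate. A minimal sketch of the conventional calculation, assuming the error variance is set to (1 − reliability) × observed variance; the function name and the observed variance of 25.0 are hypothetical and are not values from the study:

```python
def fixed_error_variance(observed_variance: float, reliability: float) -> float:
    """Error variance to fix for a single-indicator latent variable.

    Assumes the conventional rule: (1 - reliability) * observed variance.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if observed_variance < 0.0:
        raise ValueError("variance cannot be negative")
    return (1.0 - reliability) * observed_variance


# Alpha of .80 is the value reported for the test; the variance is hypothetical.
error_variance = fixed_error_variance(observed_variance=25.0, reliability=0.80)
print(f"fixed error variance = {error_variance:.2f}")
```

Under this rule, a less reliable indicator is assigned a larger fixed error variance, so more of its observed variance is treated as measurement error rather than as variance in the latent variable.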
Listening comprehension in fourth grade
Listening comprehension was assessed using the Norwegian translation of the Neale Analysis of Listening Comprehension (Neale, 1997). In this test, the children are read stories of increasing length and complexity, followed by questions about the stories. The test is administered until the child incorrectly answers four or more questions about one story. Cronbach’s alpha was .87.
Vocabulary in fourth grade
Vocabulary was measured using the word-definition subtest of WISC-V, translated and adapted to the Norwegian context (Wechsler, 2017). In this test, children are asked to explain word meanings. Cronbach’s alpha was .68.
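The reliabilities reported for the measures above are Cronbach’s alpha values. As a minimal sketch of how such a coefficient can be computed from item-level scores, the example below applies the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the toy item scores are hypothetical and do not come from the study data:

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per child)."""
    n_children = len(scores)
    n_items = len(scores[0])
    if n_children < 2 or n_items < 2:
        raise ValueError("need at least two children and two items")
    # Sample variance of each item across children.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Sample variance of each child's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores must vary across children")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: six children, four items scored 0 (wrong) or 1 (right).
toy_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
]
print(f"alpha = {cronbach_alpha(toy_scores):.2f}")
```

Alpha increases when items covary strongly relative to their individual variances, which is why short or heterogeneous item sets tend to produce lower values.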
Results
The mean raw scores and the
Observed Mean Raw Scores and Standard Deviations for Pretest (T1), Follow-Up 1 (T3), and Follow-Up 2 (T4).
Note: The vocabulary test and the listening-comprehension test were changed between Follow-up 1 and Follow-up 2. CI = confidence interval.
First, we reanalyzed the posttest data immediately after the intervention (Time 2) to examine whether the effect size was moderated by sex and language skills in the pretest. These analyses showed no difference between girls and boys (Wald test, χ²(1) = 0.066,
To test the potential follow-up effects of the intervention on expressive language, we estimated two analysis-of-covariance (ANCOVA) models with latent variables. In one model, we examined the short-term effects in January during the first grade, 7 months after the intervention. In the other model, we investigated the long-term effects in fourth grade, 4 years after the intervention.
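The logic of an ANCOVA can be illustrated with observed scores. The sketch below is a deliberate simplification and an assumption on our part: it uses a single observed pretest and posttest score per child rather than latent variables, ignores the clustering of children in classrooms, and relies on hypothetical data. It estimates the intervention effect as the difference in posttest means adjusted for the pretest through the pooled within-group slope, which equals the group coefficient in the regression posttest = a + slope × pretest + effect × group:

```python
from statistics import mean


def ancova_adjusted_effect(pre, post, group):
    """Return (slope, effect) for post = a + slope*pre + effect*group.

    group holds 0 (control) or 1 (intervention) for each child.
    """
    cross_products = 0.0
    pre_squares = 0.0
    means = {}
    for g in (0, 1):
        xs = [x for x, gi in zip(pre, group) if gi == g]
        ys = [y for y, gi in zip(post, group) if gi == g]
        if len(xs) < 2:
            raise ValueError("each group needs at least two children")
        x_bar, y_bar = mean(xs), mean(ys)
        means[g] = (x_bar, y_bar)
        # Accumulate within-group sums so group differences do not affect the slope.
        cross_products += sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        pre_squares += sum((x - x_bar) ** 2 for x in xs)
    if pre_squares == 0:
        raise ValueError("pretest scores must vary within groups")
    slope = cross_products / pre_squares
    # Posttest mean difference, corrected for the pretest mean difference.
    effect = (means[1][1] - means[0][1]) - slope * (means[1][0] - means[0][0])
    return slope, effect


# Hypothetical scores, constructed so that post = 2 + 0.5*pre + 3*group.
pre = [1, 2, 3, 4, 2, 3, 4, 6]
group = [0, 0, 0, 0, 1, 1, 1, 1]
post = [2 + 0.5 * x + 3 * g for x, g in zip(pre, group)]
slope, effect = ancova_adjusted_effect(pre, post, group)
print(f"pretest slope = {slope:.2f}, adjusted intervention effect = {effect:.2f}")
```

In this constructed example the raw posttest difference between groups overstates the effect because the intervention group also started higher at pretest; the adjustment removes that part of the difference.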
As shown in Figure 3a, a moderate-to-strong effect could still be observed at the 7-month follow-up (Time 3). This model had metric invariance and partial scalar invariance across time, χ²(3) = 6.383,

Effects of the intervention. In (a) are shown effects 7 months after the intervention; in (b) are shown interaction effects 7 months after the intervention. Models show the effects of the intervention on expressive-language skills in the grade 1 posttest, with an interaction between expressive-language skills in the pretest and after the intervention. Standardized coefficients (with 95% confidence intervals) are shown, except for the intervention dummy variable and the interaction where
A Bayes factor, estimated from the Bayesian information criterion (BIC) values of the current model and a model in which the effect size was constrained to zero, suggested that the data favored the current model, 0.00244 to 1. That is, the model with an effect of the intervention is 409.836 (1/0.00244) times more likely than a model with zero effect of the intervention. No difference was found between boys and girls (Wald test = 2.435 [1],
The effect size was moderated by expressive-language skills at pretest (Cohen’s
However, no effects were found for expressive language at the 4-year follow-up (see Fig. 4). In this model, the expressive-language construct consisted of the listening-comprehension and vocabulary tasks that were used to measure expressive language in the fourth grade. The model had an excellent goodness of fit, χ²(7) = 4.612,

Long-term effects of the intervention on expressive-language skills. Model showing the effect of the intervention on expressive-language skills in the grade 4 posttest. Standardized coefficients (with 95% CIs) are shown, except for the intervention dummy variable where
Furthermore, the effect size was not moderated by expressive-language skills at pretest (Cohen’s
Finally, there were no transfer effects from the preschool language intervention to reading comprehension 4 years later (in fourth grade) when controlling for the pretest (Fig. 5). In this model, we used the same latent expressive-language variable in the pretest and a latent reading-comprehension variable, reflected by a single observed reading-comprehension test (NARA) as the outcome. To control for any measurement error, we fixed the error variance of this observed variable so that it reflected the alpha estimate. This model fitted the data well, χ²(4) = 6.410,

Long-term effects of the intervention on reading-comprehension skills. Model showing the effect of the intervention on reading-comprehension skills in the grade 4 posttest. Standardized coefficients (with 95% CIs) are shown, except for the intervention dummy variable where
Furthermore, the effect size was not moderated by expressive-language skills at pretest (Cohen’s
Discussion
This study produced important results about the durability of the effects of a rather intensive language intervention. It had considerable impact immediately after the training. We found sustained effects on expressive-language measures 7 months after the intervention; these effects were particularly strong for the children with the poorest language skills before the study started. However, after 7 months, the effect was no longer significant for those who had expressive-language skills 1
Alignment with previous research and theory
The sustained effects observed 7 months after the intervention are consistent with the pattern found in previous studies (Fricke et al., 2013; Grøver et al., 2024; Rogde et al., 2016). The results suggest that without continued support or reinforcement, the benefits of the intervention gradually diminish. Thus, our findings also somewhat align with those of Hart et al. (2024), who, in a recent meta-analysis, showed a small effect size (
The complete fade-out of the intervention’s effects by fourth grade is in line with research suggesting that both biological and environmental factors shape language development over time (Anderson et al., 2021). Although genetic predispositions may play a role, it is also likely that differences in home linguistic environments and access to educational resources after the intervention contributed to the observed fade-out effects. This aligns with previous findings indicating that without ongoing reinforcement, early gains in language development may diminish over time.
Interestingly, the effects of the intervention were stronger and lasted longer for the children with the poorest language skills in the pretest. This result implies that the preschool intervention, at least to some extent, compensated for a poorer home-language environment. Nonetheless, the effects faded for every child after 4 years. Because the intervention took place in preschool, the children transitioned to formal schooling with much higher instructional intensity (e.g., reading instruction), which may explain why the intervention group caught up with their peers.
Interventions targeting language comprehension may have indirect effects on reading comprehension through improved language abilities, as shown in school-aged children (e.g., Clarke et al., 2010). However, if these gains are not sustained, they may not fully translate into long-term improvements in reading comprehension once children start reading to learn. Our findings are somewhat at odds with those of Gensowski et al. (2024), who reported a preschool literacy-and-language intervention’s significant effect on reading 3 to 5 years later for children whose parents had low levels of education. However, as noted earlier, it may also be easier to sustain the effects of early literacy training on early reading skills because decoding letters and sounds is a more restricted skill and more malleable and sensitive to instruction than language comprehension (Snow & Kim, 2007).
In summary, our results indicate that improving the quality of the language-learning environment for a limited time is insufficient for a permanent impact. Although the intervention may have supported language development during the preschool and early school years, disparities in educational resources and opportunities after the intervention likely hindered the maintenance of its effects.
What do our findings mean?
Our study underscores the need to create language interventions with enduring effects. It emphasizes the importance of investing in programs that demonstrate sustained results, rather than focusing solely on short-term intensive interventions. To be effective, interventions must be designed from a longitudinal perspective, incorporating mechanisms to secure and build on initial gains as children progress through school. For instance, integrating booster sessions in later school years could help sustain early effects. Investigating the mechanisms behind long-term fade-out could enhance our understanding of intervention effectiveness. Factors such as intervention dosage, fidelity, and participant characteristics could explain why initial gains diminish, and strategies to mitigate fade-out could then be provided.
Limitations and conclusion
This cluster-randomized clinical trial enrolled Norwegian preschoolers aged 4 to 5 years in two municipalities who screened below the 35th percentile on vocabulary; a scripted, teacher-delivered Norwegian program was used. The findings, both the short-term gains and the later null effects, may not generalize to other populations, languages, countries, or instructional contexts.
The Bayes factor estimation indicated that the support for zero effect on reading comprehension (4 years after the intervention) was only moderate. It is possible that with more statistical power, a model with an effect would have been favored. Nonetheless, as there was strong support for zero effect on expressive language, an effect on reading comprehension without an effect on language would have been unexpected because the effect on reading comprehension should be mediated through an effect on language.
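The Bayes factors in this study were estimated from BIC values. A minimal sketch of one common way to do this, assuming the standard approximation BF01 ≈ exp((BIC_effect − BIC_null) / 2), is shown below; the source does not state the exact formula used, and the function name and BIC values here are hypothetical, chosen only so that the result has roughly the magnitude reported for the 7-month follow-up:

```python
import math


def bayes_factor_null_vs_effect(bic_null: float, bic_effect: float) -> float:
    """Approximate BF01: evidence for the zero-effect model over the effect model.

    Uses the BIC approximation exp((BIC_effect - BIC_null) / 2).
    """
    return math.exp((bic_effect - bic_null) / 2.0)


# Hypothetical BIC values; a lower BIC indicates the preferred model.
bf01 = bayes_factor_null_vs_effect(bic_null=1012.03, bic_effect=1000.0)
bf10 = 1.0 / bf01  # evidence for the effect model over the zero-effect model
print(f"BF01 = {bf01:.5f}, BF10 = {bf10:.1f}")
```

Read this way, a value well below 1 favors the model with an intervention effect, a value well above 1 favors the zero-effect model, and values near 1 indicate that the data do not discriminate strongly between the two.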
Although this intervention produced positive effects in the short term, our findings suggest that such gains may fade without continued instructional support. The absence of effects at the 4-year follow-up highlights the need for sustained, high-quality language support beyond the preschool years. Embedding language interventions into the broader educational curriculum may offer a more promising route for long-term impact (West et al., 2021, 2024). Future research should explore how such integration can be achieved and maintained across the early school years. Consistent with prior work emphasizing the long-term economic and developmental value of continued investment in education (Heckman et al., 2013), our study underscores the importance of viewing early intervention not as a one-time solution, but as the starting point for sustained support across children’s educational trajectories.
Supplemental Material
Supplemental material, sj-jpg-1-pss-10.1177_09567976251392219 for Do the Effects of a Preschool Language Intervention Last in the Long Run? A 4-Year Follow-Up Study by Åste Mjelve Hagen, Kristin Rogde, Monica Melby-Lervåg and Arne Lervåg in Psychological Science
