Abstract
This study investigates the causal effect of relative age at school entry on the likelihood of being identified with special educational needs (SEN) or diagnosed with attention deficit/hyperactivity disorder (ADHD) in Germany. Drawing on representative data from the National Assessment Studies (IQB) for Grades 4 and 9, we use a two-stage least squares (2SLS) instrumental variable approach, leveraging exogenous variation in school starting age created by state-specific enrollment cutoff dates. Results indicate that relatively younger students within a grade cohort are significantly more likely to be classified with SEN, particularly in the domains of learning and developmental disorders, with effects being more pronounced in Grade 9 than in Grade 4. Similarly, we find robust evidence of a relative age effect on ADHD diagnoses in Grade 4, although this effect diminishes by Grade 9. The findings suggest that age-related maturity differences systematically influence assessment and diagnostic decisions. Our results highlight the need for developmentally sensitive and bias-mitigating assessment practices that do not rely solely on teacher judgments, in order to avoid misclassification and ensure educational equity.
Introduction
The relative age effect (RAE) is a well-studied phenomenon describing systematic advantages or disadvantages individuals experience based on their age relative to peers within the same cohort. The concept gained prominence in the 1980s through studies in Canadian youth sports, which showed that athletes born shortly after cutoff dates were overrepresented in elite teams (Barnsley & Thompson, 1988; Musch & Grondin, 2001). These relatively older athletes often benefit from early physical, cognitive, and emotional advantages, increasing their chances of being identified as talented and receiving better training, and these effects can compound into long-term success (Lemoyne et al., 2023).
Building on these findings, economists began in the early 1990s to explore the RAE in educational settings, where similar patterns have been observed (Angrist & Krueger, 1991). In school systems with fixed enrollment cutoff dates, students in the same grade differ in age by up to 12 months. These age differences translate into variations in cognitive development, attention span, and emotional maturity, which can significantly influence academic performance, psychosocial development, teacher evaluations, and student self-perception (Bedard & Dhuey, 2006; Hukkelberg et al., 2025; Rose & Barlow, 2024). Studies have consistently shown that younger students within a cohort are at higher risk of being identified as struggling learners (Attar & Cohen-Zada, 2018; Elder & Lubotsky, 2009; Hukkelberg et al., 2025; McEwan & Shapiro, 2008; Robertson, 2011), placed in lower academic tracks (Jürges & Schneider, 2011), or even diagnosed with neurodevelopmental disorders like Attention-Deficit/Hyperactivity Disorder (ADHD; Elder, 2010; Holland & Sayal, 2018; Mühlenweg et al., 2012; Schwandt & Wuppermann, 2016). Conversely, older students often have an academic and social advantage, which can persist throughout their educational trajectory and later lives (Black et al., 2011; Görlitz et al., 2022; Pehkonen et al., 2015).
In the context of assigning students to special educational needs (SEN) categories, research suggests that various external factors systematically influence the identification process (Galeano et al., 2025; Goldan et al., 2022; Goldan & Grosche, 2021), raising concerns about its fairness and accuracy. These concerns are echoed in critical theoretical perspectives, such as Disability Studies in Education, that emphasize the socially constructed nature of labels (e.g., Connor et al., 2008). Similar concerns have been raised regarding the diagnosis of ADHD. Like SEN categories, ADHD diagnoses are often initiated by teacher observations and are influenced by behavioral expectations in the classroom (Elder, 2010; Holland & Sayal, 2018). Several international studies have found a pronounced RAE in ADHD diagnosis (Mühlenweg et al., 2012; Schwandt & Wuppermann, 2016), which is particularly consequential because such diagnoses often involve medical treatment decisions (e.g., stimulant medication). While diagnostic labeling can facilitate access to specialized support services, it may also increase developmental risks by influencing educational placement decisions and triggering further labeling effects (Kashikar et al., 2025; Lauchlan & Boyle, 2020). These effects can negatively impact students’ self-perception and well-being, teachers’ academic expectations, and future opportunities (Demetriou, 2020; Goldan et al., 2022; Hukkelberg et al., 2025).
Initial international findings suggest that the RAE may also contribute to these patterns, with younger students within a cohort often being disproportionately classified as having SEN (Cobley et al., 2009; Dhuey & Lipscomb, 2010; Wilson, 2000). Research on RAE is therefore crucial for informing policies and educational practices that promote equity, ensuring that students are assessed and supported based on their individual needs and potential rather than arbitrary age differences. While relatively younger students may indeed exhibit higher support needs due to developmental differences, assigning formal status classifications on this basis may reinforce structural disadvantages and expose students to long-term educational consequences, ultimately limiting their opportunities for academic participation and contributing to cumulative developmental disadvantages (Algraigray & Boyle, 2017; Goldan et al., 2022).
Building on approaches from econometrics, the present study investigates whether RAE can be observed in the identification of SEN categories in Germany, comparing two samples of the National Assessment Study (IQB Ländervergleich) and, hence, two different age groups of students. Specifically, we examine whether a student’s relative age at school entry influences the likelihood of being classified with SEN in the areas of learning, language, and social and emotional development.
Beyond the identification of SEN in these categories, our analyses also extend to ADHD. By systematically analyzing these associations within a German sample, this study aims to contribute to the international discourse on RAE in special education while providing evidence relevant to the German educational context. The findings have important implications for understanding potential biases in SEN identification and for informing more developmentally appropriate assessment practices.
Theoretical Background and State of Research
SEN and ADHD: Process of Identification and the Role of Teachers in Germany
Attention-Deficit/Hyperactivity Disorder (ADHD) is a neurodevelopmental disorder defined by three core symptoms—inattention, hyperactivity, and impulsivity—which can significantly impair a child’s functioning in academic, social, and home environments (Barkley, 2015). ADHD is associated with lower average school performance, more negative teacher perceptions, increased risk behavior, and greater peer difficulties, such as bullying and social exclusion (Greenway, 2023). The diagnostic criteria for ADHD are outlined in both the ICD and DSM, but notable differences exist between classification systems. A fundamental requirement for diagnosis is that symptoms must manifest in childhood and persist across multiple settings (American Psychiatric Association, 2022; World Health Organization, 2019).
In Germany, the ADHD diagnostic guidelines require a multidimensional process with behavioral symptoms aligned with ICD and DSM criteria (Gawrilow, 2023). To assess this, a range of standardized questionnaires is used, including self-reports (depending on the child’s age), as well as external evaluations by parents and teachers, making educators’ perceptions a crucial factor in the evaluation process (Ledet & Hansen, 2023). According to Schwandt and Wuppermann (2016), ADHD diagnoses are typically made by pediatricians (51%) and child and adolescent psychiatrists (28%), with a smaller proportion diagnosed by primary care physicians (36%). In Germany, the prevalence of ADHD among school-aged children is approximately 5%, although there are substantial regional variations in both diagnosis and treatment rates (Schwandt & Wuppermann, 2016).
Unlike ADHD, SEN is not scientifically defined but represents an educational-administrative construct referring to students whose development cannot be adequately supported within general education alone. In Germany, eight official SEN categories exist: learning difficulties, emotional and social development, speech and language impairments, intellectual disability, physical and motor development, visual impairment, hearing impairment, and autism spectrum disorders. Learning difficulties, emotional and social development, and speech and language impairments are often grouped under the term “Learning and Developmental Disorders” (LDD). This grouping reflects the high overlap in support needs within these categories and is used administratively, for example for resource allocation, although the categories remain diagnostically distinct in practice (Goldan & Zurbriggen, 2025).
While formal SEN diagnoses are legally possible at any point during schooling, they are relatively rare in early primary grades, as many German states discourage early classification in favor of inclusive practices (anchored in state-specific inclusive education acts and federal school laws, e.g., AO–SF; BASS 13–41 Nr. 2.1). The identification process typically involves teachers, who often initiate assessments, guide parents, and conduct diagnostic evaluations. Formal diagnoses become more relevant at the end of primary school (Grade 4), shaping school placement decisions in Germany’s stratified system. The assignment of SEN can result in additional support within inclusive education settings; however, in federal states that maintain a separate special school system, it may also lead to placement in special schools (Goldan & Grosche, 2021).
Unlike more clearly defined SEN categories, such as physical disabilities or sensory impairments, LDD lack standardized diagnostic criteria, making identification susceptible to subjective judgment and potential arbitrariness (Goldan & Grosche, 2021). While administrative procedures vary between Germany’s federal states due to decentralized governance, research indicates that significant variation in SEN classification occurs within, not between, federal states (Goldan & Kemper, 2019; Goldan et al., 2022). This suggests that local diagnostic practices, teacher perceptions, and institutional cultures exert a more substantial influence on SEN identification processes than state-level regulations. Various contextual factors have been examined for their systematic influence on SEN classification, including proximity to special schools, teacher-related characteristics, and—centrally in our study—relative age at school entry (e.g., Balestra et al., 2020; Dhuey & Lipscomb, 2010; Goldan & Grosche, 2021).
State of Research: The Effect of Relative Age on SEN and ADHD Identification
The Relative Age Effect (RAE) has been widely studied, particularly in the economics of education, as it significantly influences academic achievement, student self-concept, and long-term educational trajectories (Elder & Lubotsky, 2009; Jürges & Schneider, 2011; Kretschmann et al., 2021; Mühlenweg & Puhani, 2010; Puhani & Weber, 2007). In school systems with fixed enrollment cutoffs, children in the same grade can differ in age by nearly a year, leading to developmental disparities between the children at school entry (Balestra et al., 2020; Dhuey et al., 2019). These differences influence teachers’ perceptions of their students, shaping their evaluations and diagnostic decisions, and potentially introducing systematic biases in SEN and ADHD identification (Holland & Sayal, 2018; Hukkelberg et al., 2025). Although RAE has been widely studied in education, its impact on SEN classification remains underexplored, particularly in terms of causal evidence. Most existing studies rely on correlational designs and lack causal inference (Fletcher et al., 2024; Sullivan & Bal, 2013; Wilson, 2000).
The RAE has received particular attention in educational and medical research on ADHD diagnosis. Since ADHD is characterized by inattention, hyperactivity, and impulsivity, symptoms that naturally vary across developmental stages, younger students within a grade cohort may be more likely to exhibit behaviors that are misinterpreted as signs of ADHD. A systematic review by Holland and Sayal (2018) synthesizes findings from 20 international studies examining the relationship between relative age and ADHD diagnosis or medication use. The review confirms that younger children within a cohort are consistently more likely to be diagnosed with ADHD and prescribed medication, particularly in countries with higher ADHD prescription rates. In a meta-analysis of a subset of these studies, the authors find a significant relative age effect, with children born within 1 month before the school entry cutoff being up to 27% more likely to receive ADHD medication than their older peers born within 1 month after the cutoff. However, the effect size varies across countries and healthcare systems, reflecting differences in diagnostic practices and school entry policies. The review underlines that the RAE may be more pronounced when teachers serve as informants for ADHD symptoms than when parents assess their children’s behavior. Studies that examined the RAE for different informants report smaller or no effects for parents’ evaluations compared to teachers’ evaluations (Elder, 2010; Halldner et al., 2014; Schmiedeler et al., 2015), indicating that teacher perceptions and relative comparisons within the classroom contribute to ADHD identification biases.
Schwandt and Wuppermann (2016) analyzed health insurance data from over 7 million children in Germany (2008–2011) and found a strong RAE on ADHD diagnoses and medication use. Their results show that children born within 1 month before the school entry cutoff, who are therefore the youngest in their grade, are 22% more likely to be diagnosed with ADHD than their peers born within 1 month after the cutoff. Crucially, no similar pattern is observed for other medical conditions, reinforcing the argument that the RAE leads to systematic ADHD misdiagnoses. The study suggests that teacher and parental demand for ADHD treatment, rather than physician-driven factors, plays a key role, with stronger effects in districts with larger class sizes and higher shares of foreign students.
Dhuey and Lipscomb (2010) investigated the impact of relative age at school start on SEN diagnoses using three nationally representative U.S. surveys spanning 1988 to 2004, covering students from kindergarten through 10th grade. Their study applies an instrumental variable approach, leveraging state-specific school entry cutoff dates to estimate the effect of relative age on the probability of being classified with SEN. The results show that an additional month of relative age at school entry decreases the likelihood of being identified by 2% to 5%, with the strongest effects observed for learning disabilities. When extrapolated over a full year, this corresponds to a cumulative effect ranging from 24% to 60%, highlighting a substantial impact of relative age on SEN identification. However, no significant effects were found for disabilities such as hearing or orthopedic impairments, suggesting that SEN classification is more subjective in cases of learning difficulties. The authors argue that these findings are consistent with the hypothesis that younger students within a cohort are over-referred for SEN evaluations, rather than being objectively more impaired.
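The annual range follows from scaling the monthly estimates linearly over twelve months. This assumes a constant per-month effect across the age range within a grade, which is our reading of the extrapolation rather than a formula quoted from the study:

```latex
\[
12 \times 2\% = 24\%, \qquad 12 \times 5\% = 60\%
\]
```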
In the United States, studies using regression discontinuity designs show that relatively younger students are significantly more likely to receive SEN status. Shapiro (2022) found that the youngest children in a cohort are about 40% more likely to be identified for special education, particularly for speech/language impairments and developmental delay, with effects persisting through eighth grade. Similarly, Dhuey et al. (2019) report that younger students perform worse academically and are more often diagnosed with learning and speech disabilities—especially in early grades—suggesting that early classifications may reflect developmental immaturity rather than stable impairments.
Balestra et al. (2020) examine the causal effect of school starting age on the probability of being diagnosed with SEN in Switzerland. Using administrative health data from the Swiss canton of St. Gallen, the study employs a regression discontinuity design (Angrist & Pischke, 2009), exploiting the school entry cutoff to compare children born just before and just after the threshold. Their results show that younger students within a cohort are more likely to be diagnosed with behavioral problems and speech impairments, whereas no significant effects were found for learning disabilities, ADHD, or dyslexia/dyscalculia. The authors attribute their divergent findings to differences in the identification process and assessment methods in the canton of St. Gallen. Unlike studies in countries relying on teacher-reported diagnoses, Balestra et al. (2020) base their analysis on diagnoses made by external psychologists from the School Psychological Service (SPS). Their findings suggest that the RAE is more pronounced when the assessment process relies on subjective teacher evaluations, whereas expert assessments appear to mitigate potential biases in SEN classification.
Research Questions
Building on the state of research, this study examines whether RAE can be observed in Grade 4 and Grade 9 students for (1) ADHD and (2) SEN status in Germany. For SEN, we focus on the domains of Learning and Developmental Disorders (LDD) as they are more subject to identification biases as described above. The central research question is whether relative age at school entry systematically influences the likelihood of being identified with ADHD and SEN in the respective categories.
Methodology
Data and Sample
To examine the research questions, we utilize data from the German National Assessment Study (IQB Trends in Student Achievement) from 2015 and 2016 (Schipolowski et al., 2019; Stanat et al., 2017, 2019), which provide state-level representative samples. The data were made available by the Research Data Centre (FDZ) at the Institute for Educational Quality Improvement (IQB). These cross-sectional datasets enable a comparative analysis of the RAE on ADHD and SEN identification across educational stages, providing insights into how the RAE differs between primary and secondary education. For both samples, students with SEN were oversampled in the federal states of North Rhine-Westphalia and Bremen.
The IQB Trend 2015 included students from different school types (regular schools and special schools) at the end of ninth grade, aged 14 to 17 years (N = 36,542), with 7.4% identified with SEN. The IQB Trend 2016 covers students at the end of fourth grade from regular primary schools and special schools, aged 9 to 12 years (N = 31,335). In the original sample about 11% were classified with SEN. Unfortunately, no questionnaire was administered to more than 30% of the sample. This missing data reduces the number of cases available for analysis (Schipolowski et al., 2018; Stanat et al., 2017).
Methods
Our methodological approach follows econometric studies that employ the instrumental variable (IV) method to estimate the causal effect of relative age at school entry on SEN identification (Puhani & Weber, 2007). Specifically, we use a two-stage least squares (2SLS) estimation strategy (Angrist & Pischke, 2009) to derive the Local Average Treatment Effect (LATE).
Instrumental Variable Approach
A key challenge in estimating the effect of school entry age on SEN identification lies in the endogeneity of actual school starting age. Although school entry is formally determined by administrative cutoff dates, real-world compliance is imperfect: some children are enrolled early or deferred (so-called “redshirting”) based on parental decisions or developmental concerns. As a result, empirical school entry age may be correlated with unobserved factors such as cognitive ability or health, which themselves influence SEN classification, leading to biased estimates in conventional regressions. Crucially, cutoff date regulations differ between the 16 German federal states, creating substantial variation in theoretical school starting age across cohorts and regions. As a result, a child born in late June may be among the youngest in one state but among the oldest in another. Because cutoff dates are set administratively and are unrelated to individual student characteristics, they generate a quasi-natural experiment that enables credible causal inference (Angrist & Pischke, 2009).
To address these issues, we employ a 2SLS instrumental variable approach, using theoretical school entry age based on administrative date-of-birth cutoffs as an instrument for the empirical age at school entry. This strategy isolates exogenous variation in relative age and thus allows us to obtain a consistent estimate of its causal effect on SEN identification. The validity of the instrument relies on two key conditions: (1) relevance, that is, the instrument must be strongly correlated with actual school entry age, and (2) exogeneity, that is, the instrument must not be correlated with unobserved determinants of SEN status (Angrist & Pischke, 2009). In line with Puhani and Weber (2007), we empirically assess these assumptions by reporting descriptive characteristics of the complier population, balance tests examining correlations between the instrument and key background variables (gender, HISEI, and migration background), and detailed first-stage regression results. For the instrument to be valid, we further assume that there is no systematic relationship between a child’s birth month and their likelihood of receiving an SEN diagnosis other than through school entry age. We assess this assumption by conducting a narrow-window analysis that restricts the sample to children born 1 month before and 1 month after the cutoff date (fuzzy RDD; Angrist & Pischke, 2009). All robustness checks, balance tests, OLS estimates, and first-stage results are reported in the online supplementary materials repository.
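The two-stage logic described above can be written out schematically. The notation is introduced here for illustration only and is not taken from the study: $A_i$ denotes the empirical school entry age of student $i$, $Z_i$ the theoretical school entry age implied by the state-specific cutoff, $Y_i$ a binary SEN or ADHD indicator, $X_i$ a vector of control variables, and $\mu_{m(i)}$ and $\lambda_{s(i)}$ birth-month and federal-state fixed effects (tildes mark the corresponding second-stage terms).

```latex
\begin{align}
A_i &= \pi_0 + \pi_1 Z_i + X_i'\gamma + \mu_{m(i)} + \lambda_{s(i)} + u_i
  && \text{(first stage)} \\
Y_i &= \beta_0 + \beta_1 \hat{A}_i + X_i'\delta + \tilde{\mu}_{m(i)} + \tilde{\lambda}_{s(i)} + \varepsilon_i
  && \text{(second stage)}
\end{align}
```

In this sketch, relevance corresponds to $\pi_1 \neq 0$, and exogeneity to $Z_i$ being uncorrelated with $\varepsilon_i$ conditional on the controls and fixed effects. Under these conditions, together with monotonicity, $\beta_1$ can be read as the LATE for compliers.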
It is important to note, however, that real-world adjustments such as parental “redshirting” reduce the effective variation in age at school entry around the cutoff. This phenomenon, often described as effect deflation, reflects the fact that flexible enrollment policies partially compensate for relative age disadvantages and thereby diminish the observable policy-relevant effect of cutoff-based age differences. We therefore additionally report the reduced-form effect of birth month on SEN classification (Jürges & Schneider, 2011), which is typically smaller than the structural IV estimate.
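The relationship between the reduced form, the first stage, and the IV estimate can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the estimator used in the study: all parameter values are hypothetical, the outcome is a continuous propensity rather than a binary indicator, and controls, fixed effects, and the non-complier adjustment are omitted. With a single instrument and no covariates, the 2SLS estimate equals the ratio of the reduced-form slope to the first-stage slope (the Wald estimator).

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate(n=20000, true_effect=-0.05, seed=1):
    """Generate synthetic entry ages and outcomes (all values hypothetical)."""
    rng = random.Random(seed)
    z, a, y = [], [], []
    for _ in range(n):
        z_i = 6.0 + rng.random()       # theoretical entry age in years
        u_i = rng.gauss(0.0, 1.0)      # unobserved developmental difficulty
        # Redshirting: young children with difficulties enter one year late.
        delay = 1.0 if (z_i < 6.3 and u_i > 0.5) else 0.0
        a_i = z_i + delay              # empirical entry age in years
        # Outcome propensity depends on actual entry age and on u_i.
        y_i = 0.5 + true_effect * a_i + 0.05 * u_i + rng.gauss(0.0, 0.05)
        z.append(z_i)
        a.append(a_i)
        y.append(y_i)
    return z, a, y


z, a, y = simulate()
first_stage = ols_slope(z, a)              # instrument -> empirical entry age
reduced_form = ols_slope(z, y)             # instrument -> outcome
iv_estimate = reduced_form / first_stage   # Wald estimator (just-identified 2SLS)
naive_ols = ols_slope(a, y)                # confounded by u_i

print(f"first stage:  {first_stage:.3f}")
print(f"reduced form: {reduced_form:.3f}")
print(f"IV estimate:  {iv_estimate:.3f}")
print(f"naive OLS:    {naive_ols:.3f}")
```

Because redshirting pushes the first-stage slope below one in this setup, the reduced-form coefficient is smaller in absolute value than the IV estimate, while the naive OLS slope is distorted by the unobserved difficulty that drives delayed entry.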
Instruments and Analysis
Our dependent variables include six distinct outcomes that are defined identically across all federal states: (1) ADHD; (2) SEN total; (2.1) SEN in Learning and Developmental Disorders (LDD), which comprises SEN in learning, social-emotional development, and language; (2.2) SEN in Learning; (2.3) SEN in Social-Emotional Development; and (2.4) SEN in Language.
The independent variable in our 2SLS instrumental variable approach is the empirical school entry age (in years), that is, the age at which the students actually start school. To address potential endogeneity in school entry age (e.g., due to so-called redshirting, i.e., students starting school later than the cutoff date; Graue & DiPerna, 2000), we employ theoretical school entry age (in years)—which is determined solely by state-specific cutoff regulations—as an instrumental variable. The variable was calculated on our behalf by the Research Data Centre (FDZ) based on a student’s birth month, birth year, and the respective state-specific cutoff date for school enrollment that we provided for each specific year. Students were operationalized as “compliers” when, at the time they were in fourth/ninth grade, their actual school entry age corresponded to their theoretical school entry age based on the state-specific cutoff regulations. Under the monotonicity assumption—that the statutory cutoff does not induce countervailing enrollment decisions—the use of this instrument allows us to estimate the LATE, capturing the causal effect of relative school entry age for the subgroup of students whose enrollment timing was determined by institutional cutoff rules. In the German context, this assumption is plausible given that school entry cutoffs are administratively fixed at the state level and deviations from the rule reflect discretionary parental or administrative decisions rather than systematic defiant responses (Angrist & Pischke, 2009). The LATE thus isolates exogenous variation in school entry age and reflects the effect of relative age among compliers—those whose entry behavior aligns with administrative regulations.
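How a theoretical school entry age and a compliance group might be derived from birth month, birth year, and a cutoff can be sketched as follows. This is an illustration under simplifying assumptions, not the procedure used by the FDZ: the function names are hypothetical, the cutoff is taken to be the last day of `cutoff_month`, the school year is assumed to start in August in every state, and compliance is judged by comparing entry years rather than entry ages.

```python
def theoretical_entry_year(birth_year, birth_month, cutoff_month):
    """Calendar year of regular school entry under the cutoff rule.

    Assumption: a child is due to start school in the year in which the
    sixth birthday falls on or before the end of `cutoff_month`.
    """
    sixth_birthday_year = birth_year + 6
    if birth_month <= cutoff_month:
        return sixth_birthday_year
    return sixth_birthday_year + 1


def theoretical_entry_age(birth_year, birth_month, cutoff_month,
                          school_start_month=8):
    """Age in years at school start implied by the cutoff rule."""
    entry_year = theoretical_entry_year(birth_year, birth_month, cutoff_month)
    age_in_months = ((entry_year - birth_year) * 12
                     + (school_start_month - birth_month))
    return age_in_months / 12


def compliance_group(empirical_entry_year, theoretical_year):
    """Classify a student by comparing actual and rule-based entry year."""
    if empirical_entry_year < theoretical_year:
        return "early entrant"
    if empirical_entry_year > theoretical_year:
        return "redshirted"
    return "complier"


# A child born in August 2009 under two hypothetical cutoffs:
age_june_cutoff = theoretical_entry_age(2009, 8, cutoff_month=6)
age_sept_cutoff = theoretical_entry_age(2009, 8, cutoff_month=9)
print(age_june_cutoff, age_sept_cutoff)
```

In this example the same birth date yields a theoretical entry age that differs by a full year depending on the cutoff, which is the kind of between-state variation the instrument relies on.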
Although a valid instrument does not require control variables to yield consistent estimates, we included several controls—gender, HISEI (highest parental occupational status), and migration background (generation status of immigration)—to improve the precision of our estimates and account for potential confounding factors. Including these variables helps reduce unexplained variance and strengthens the credibility of our identification strategy by addressing observable differences that might otherwise bias the results. Migration background was assessed using a categorical variable that captures generational status based on the birthplace of the student and their parents. We recoded the variable into a binary indicator of migration background, with students categorized as having a migration background if they were first- or second-generation immigrants.
We used the missing indicator method (Groenwold et al., 2012) to address missing values in HISEI and migration background, which occurred more frequently among ADHD and SEN students. Since missingness was not at random (MNAR), this approach preserves sample size and reduces bias. As a robustness check, we also applied multiple imputation, which yielded comparable results.
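In general terms, the missing indicator method keeps cases with missing covariate values by replacing each missing value with a constant and adding a dummy variable that flags the replacement; both variables then enter the regression. A minimal sketch follows, in which the function name, the fill value of zero, and the example scores are hypothetical, since the study does not report these details:

```python
def add_missing_indicator(values, fill_value=0.0):
    """Return (filled_values, missing_flags) for one covariate.

    Missing entries (None) are replaced by `fill_value`, and a 0/1 flag
    records which entries were replaced.
    """
    filled = [fill_value if v is None else v for v in values]
    flags = [1 if v is None else 0 for v in values]
    return filled, flags


hisei = [51.0, None, 39.5, None]   # hypothetical HISEI scores; None = missing
hisei_filled, hisei_missing = add_missing_indicator(hisei)
print(hisei_filled, hisei_missing)
```

Both `hisei_filled` and `hisei_missing` would be included as regressors, so that no student is dropped because of a missing background variable.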
Additionally, we include fixed effects for birth month and federal state. The inclusion of birth month fixed effects controls for potential seasonal effects in births, which have been shown to systematically influence health-related aspects (Lee et al., 2006). Federal state fixed effects are necessary because the cutoff dates that define theoretical school entry age are set at the state level; including them ensures that our estimates are not biased by regional differences in enrollment policies or SEN classification criteria.
To further refine our estimation, we adjust the estimator for the share of non-compliers, specifically accounting for students with delayed school entry or who enrolled early. This correction, following the approach of Jürges and Schneider (2011), ensures that our results represent a conservative estimate of the effect of school entry age on SEN classification. Standard errors are clustered at the school level, reflecting the primary sampling unit in the stratified multi-stage design of the studies (Stanat et al., 2017). While the instrumental variation stems from state-specific cutoff regulations, these are fixed and fully observed across all units. Following the design-based framework by Abadie et al. (2023), statistical inference should reflect the level at which sampling variability occurs—that is, across schools.
Results
Descriptive Statistics
Special Educational Needs
Table 1 presents the descriptive statistics for the variables included in the regression models for SEN, separately for fourth- and ninth-grade students. The sample consists of N = 18,877 fourth-grade students and N = 36,215 ninth-grade students, excluding cases with completely missing data.
Descriptive Statistics for Students With and Without SEN (IQB 2016 – Grade 4; IQB 2015 – Grade 9).
Note. Bolded values indicate statistically significant differences between groups (p < .05).
HISEI = Highest International Socio-Economic Index of Occupational Status; SEN = Special Educational Needs; SEN-LDD = Special Educational Needs in the Areas of Learning and Developmental Disorders.
In both cohorts, 7.2% of students were identified with SEN. The largest subgroup in Grade 4 was students with SEN in the area of learning (5.3%), followed by those with SEN in the areas of social-emotional development (1.1%) and language (0.4%). In Grade 9, the proportion of students with SEN-Learning was markedly lower (3.5%), while the shares of students with SEN related to social-emotional development (1.5%) and language (2.4%) were higher than in Grade 4. The broader category SEN-LDD, which encompasses these three types of needs (learning, social-emotional development, and language), accounted for 6.6% of the 2016 sample and 6.7% in the 2015 sample and captures most SEN cases. The discrepancy between the total SEN rate and the SEN-LDD category (6.6% in Grade 4 vs. 6.7% in Grade 9) reflects the presence of additional, less prevalent SEN types such as intellectual disabilities, hearing, and visual impairments. Given their low prevalence and the comparatively objective nature of their diagnostic criteria, these categories were excluded from separate analysis.
Statistically significant differences in background characteristics between students with and without SEN were observed in Grade 4 (IQB 2016). Students with SEN had a slightly lower theoretical school starting age (6.45 vs. 6.51 years) and a significantly higher empirical school starting age (6.50 vs. 6.41 years), suggesting delayed school entry. Their socioeconomic status, as measured by the HISEI index, was also markedly lower (39.2 vs. 53.2), and a higher share of students with SEN had a migration background (24.1% vs. 18.6%) and were male (65.1% vs. 49.8%).
In Grade 9 (IQB 2015), statistically significant differences were found for empirical school starting age (6.93 vs. 6.66 years), HISEI (36.9 vs. 51.9), and the share of male students (64.0% vs. 51.1%). Theoretical school starting age and the proportion of students with a migration background did not differ significantly between groups in this cohort.
These patterns illustrate consistent associations between SEN identification and sociodemographic disadvantages, which are taken into account in the subsequent regression analyses.
ADHD
Table 2 reports descriptive statistics for students diagnosed with ADHD and their peers without such a diagnosis. ADHD was reported in 2.1% of students in Grade 4 and in 0.7% of students in Grade 9.
Descriptive Statistics for Students With and Without ADHD (IQB 2016 – Grade 4; IQB 2015 – Grade 9).
Note. Bolded values indicate statistically significant differences between groups (p < .05). Values represent means or shares; standard deviations in parentheses.
HISEI = Highest International Socio-Economic Index of Occupational Status; ADHD = Attention-Deficit/Hyperactivity Disorder.
In both cohorts, statistically significant group differences were observed in several background characteristics. In Grade 4 (IQB 2016), students with ADHD had a slightly lower theoretical school starting age than their peers (6.46 vs. 6.51 years). More substantial differences were observed in socioeconomic status, with students with ADHD having markedly lower HISEI scores (45.7 vs. 52.3), and in gender, with a significantly higher share of male students among those with ADHD (74.2% vs. 51.3%). In contrast to SEN students, students with ADHD were significantly less likely to have a migration background (11.9% vs. 19.2%).
In Grade 9 (IQB 2015), similar patterns emerged: students with ADHD had significantly lower HISEI scores (43.0 vs. 51.3), were more often male (79.1% vs. 51.1%), and less frequently had a migration background (6.7% vs. 15.4%). Additionally, their empirical school starting age was significantly higher (6.75 vs. 6.63 years), indicating a greater likelihood of delayed school entry or grade retention.
Table 3 reports descriptive characteristics of the different compliance groups, distinguishing early entrants (always-takers), on-time entrants (compliers), and redshirted entrants (never-takers), separately for the Grade 9 (IQB 2015) and Grade 4 (IQB 2016) samples. Across both cohorts, compliers—that is, students whose school entry timing aligns with the administrative cutoff—constitute the largest group.
Descriptive Characteristics of Compliance Groups by School Entry Timing (IQB 2016 – Grade 4; IQB 2015 – Grade 9).
Note. All calculations based on the regression sample.
In both samples, compliers and always-takers exhibit a more balanced gender distribution than redshirted entrants, among whom boys are overrepresented (60%). Socioeconomic patterns differ across cohorts: in the IQB 2015 sample, early entrants have the highest average HISEI values, whereas in the IQB 2016 sample, compliers exhibit the highest mean HISEI.
Across all SEN categories and ADHD, the highest proportions are observed among redshirted entrants. This pattern likely reflects underlying developmental difficulties that motivate delayed school entry. Nevertheless, in absolute numbers, the majority of students fall into the complier group, to which the estimated LATE applies.
Taken together, the descriptive patterns underscore the relevance of an IV approach that accounts for endogeneity in school entry timing.
Regression Results
Special Educational Needs
Tables 4a and 4b present the results of instrumental variable regressions (2SLS) estimating the causal effect of relative age at school entry on the probability of being identified with various categories of SEN. The instrumental variable estimates (LATE), the corresponding reduced form (RF) estimates—adjusted for the share of non-compliers—and the narrow window analysis estimates in reduced form (RDD RF) are reported. The LATE captures the causal effect of relative age for the subgroup of students whose school entry timing was affected by the month-of-birth cutoff and not by parental delay or early enrollment. As such, the LATE does not estimate an average treatment effect for the entire population but rather the effect for those whose enrollment decisions were determined by institutional rules (cutoff dates). All models include month of birth and federal state fixed effects; Models 3 and 4 in the respective category additionally control for relevant sociodemographic covariates (gender, HISEI and migration background).
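The link between the LATE and RF estimates can be illustrated with a small simulation. The sketch below is not the authors' model: the data are synthetic, the variable names (`z` for the rule-based starting age, `d` for the actual starting age, `u` for an unobserved developmental difficulty) and all parameter values are assumptions chosen for illustration, and fixed effects and covariates are omitted. It shows that, with a single instrument, the 2SLS coefficient equals the reduced-form slope divided by the first-stage slope, and that a naive OLS regression on actual starting age is biased when delayed entry is related to unobserved difficulties.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000  # synthetic sample, far larger than the study's

# Assumed instrument: rule-based (theoretical) school starting age in years.
z = rng.uniform(6.0, 7.0, n)
# Assumed unobserved developmental difficulty, independent of birth date.
u = rng.uniform(-1.0, 1.0, n)
# Assumed non-compliance: young-by-rule children with high u start one year later.
redshirt = (u > 0.5) & (z < 6.3)
d = z + redshirt.astype(float)  # actual (empirical) school starting age

# Linear probability of the outcome; assumed true effect of age is -0.02 per year.
p = 0.20 - 0.02 * d + 0.03 * u
y = (rng.uniform(size=n) < p).astype(float)


def slope(x, outcome):
    """OLS slope of `outcome` on `x` in a regression with an intercept."""
    xc = x - x.mean()
    return float(xc @ (outcome - outcome.mean()) / (xc @ xc))


first_stage = slope(z, d)             # effect of the instrument on actual age
reduced_form = slope(z, y)            # effect of the instrument on the outcome
wald_iv = reduced_form / first_stage  # just-identified IV estimate

# Explicit two-stage version: regress y on the first-stage fitted values.
d_hat = d.mean() + first_stage * (z - z.mean())
two_stage = slope(d_hat, y)

naive_ols = slope(d, y)               # ignores endogenous delayed entry

print(f"first stage  : {first_stage:.3f}")
print(f"reduced form : {reduced_form:.4f}")
print(f"IV (RF / FS) : {wald_iv:.4f}")
print(f"2SLS         : {two_stage:.4f}")
print(f"naive OLS    : {naive_ols:.4f}")
```

Because the simulated effect is constant across students, the IV estimate recovers the average effect here; in the study, the LATE applies to compliers only.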
Instrumental Variable Regressions (Two-Stage Least Squares) and Regression Discontinuity Design Estimating the Effect of Relative Age at School Entry on SEN Status – Regression Coefficients and Standard Errors (IQB 2016 – Grade 4; IQB 2015 – Grade 9).
Note. LATE = Local Average Treatment Effect; RF = Reduced Form estimate. RDD RF = Regression Discontinuity Design Reduced Form estimate (± 1 month around cut-off). Non-standardized regression coefficients; clustered standard errors on the school level in parentheses. All models include fixed effects for month of birth and federal state. Models in Columns 3 and 4 include student-level control variables (gender, HISEI, migration background). SEN – total = all students with special educational needs. SEN – LDD = students identified with learning and developmental disorders (comprises SEN in the categories learning, social-emotional development, and language). Bold values indicate statistically significant effects.
*p < .1; **p < .05; ***p < .01.
Instrumental Variable Regressions (Two-Stage Least Squares) and Regression Discontinuity Design Estimating the Effect of Relative Age at School Entry on SEN Status – Regression Coefficients and Standard Errors (IQB 2016 – Grade 4; IQB 2015 – Grade 9).
Note. LATE = Local Average Treatment Effect; RF = Reduced Form estimate. RDD RF = Regression Discontinuity Design Reduced Form estimate (± 1 month around cut-off). Non-standardized regression coefficients; clustered standard errors on the school level in parentheses. All models include fixed effects for month of birth and federal state. Models in Columns 3 and 4 include student-level control variables (gender, HISEI, migration background). Bold values indicate statistically significant effects.
*p < .1; **p < .05; ***p < .01.
Instrumental Variable Regressions (Two-Stage Least Squares) and Regression Discontinuity Design Estimating the Effect of Relative Age at School Entry on SEN Status – Regression Coefficients and Standard Errors (IQB 2016 – Grade 4; IQB 2015 – Grade 9).
Note. LATE = Local Average Treatment Effect; RF = Reduced Form estimate. RDD RF = Regression Discontinuity Design Reduced Form estimate (± 1 month around cut-off). Non-standardized regression coefficients; clustered standard errors on the school level in parentheses. All models include fixed effects for month of birth and federal state. Models in Columns 3 and 4 include student-level control variables (gender, HISEI, migration background).
*p < .1; **p < .05; ***p < .01.
For fourth-grade students (IQB 2016), the results indicate a statistically significant negative effect of relative age within the school cohort on the likelihood of receiving an SEN classification. Specifically, relatively younger children are about 1.5 to 2.0 percentage points more likely to be identified with any SEN (LATE: −0.020; RF: −0.015; p < .05). This effect appears to be primarily driven by SEN-LDD, for which both the LATE and RF estimates are significant, but only at the 10% level (LATE: −0.018; RF: −0.014; p < .10). Because this weaker significance threshold carries an increased risk of Type I error, these estimates should be interpreted with caution; they are nonetheless reported given the causal estimation framework.
The effects for SEN-Learning are negative (LATE: −0.011; RF: −0.008) but fall short of statistical significance, and similarly, the coefficients for social-emotional and language-related SEN are small and non-significant, likely reflecting small case numbers in those categories (e.g., n = 654 for learning; n = 274 for social-emotional and n = 460 for language-related SEN in Grade 4).
The RDD RF estimates based on the narrow-window specification provide additional evidence that is consistent with the instrumental variable results. Across outcomes for fourth-grade students (IQB 2016), the RDD RF coefficients are statistically insignificant, reflecting limited statistical power due to the small sample size within the narrow bandwidth. Importantly, the point estimates are comparable to the corresponding IV and RF estimates.
For ninth-grade students (IQB 2015), the pattern is consistent and slightly more pronounced. The estimated effect on SEN total remains negative (LATE: −0.023; RF: −0.018; p < .05), and the models show significant associations for SEN-LDD (LATE: −0.022; RF: −0.018; p < .05). Notably, the effect for SEN-Learning is statistically significant at the 10% level (LATE: −0.016; RF: −0.012; p < .10). The RDD RF estimates are statistically significant, in line with the IV and RF results, and the point estimates are directionally consistent and notably larger across the outcomes (SEN total: RDD RF: −0.049; p < .01; SEN-LDD: RDD RF: −0.042; p < .05; SEN-Learning: RDD RF: −0.041; p < .01).
Across both cohorts, no significant associations were found for SEN classifications in the domains of social-emotional development or language, likely due to the small number of cases in these subcategories. Although overall sample sizes were large, statistical power in some domain-specific models remains limited.
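As a consistency check on the coefficients quoted above: in a just-identified IV model, the LATE equals the reduced-form coefficient divided by the first-stage coefficient, so the ratio RF/LATE gives the implied first stage. The sketch below applies this to the SEN values reported in the text. The first-stage coefficients themselves are not reported in this section, and the inputs are rounded to three decimals, so the ratios are rough approximations rather than the authors' estimates.

```python
# (LATE, RF) pairs as quoted in the text for the SEN models
reported = {
    "Grade 4, SEN total": (-0.020, -0.015),
    "Grade 4, SEN-LDD": (-0.018, -0.014),
    "Grade 4, SEN-Learning": (-0.011, -0.008),
    "Grade 9, SEN total": (-0.023, -0.018),
    "Grade 9, SEN-LDD": (-0.022, -0.018),
    "Grade 9, SEN-Learning": (-0.016, -0.012),
}

# Implied first stage = RF / LATE (approximate because inputs are rounded)
implied_first_stage = {
    outcome: rf / late for outcome, (late, rf) in reported.items()
}

for outcome, ratio in implied_first_stage.items():
    print(f"{outcome}: {ratio:.2f}")
```

The implied ratios fall between roughly 0.73 and 0.82 across the six outcomes.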
When interpreted relative to the observed base rates of SEN classification, the reduced form estimates reveal substantively meaningful effects. In Grade 4, the RF estimate for SEN total is −0.015, which corresponds to a 21% increase in the likelihood of SEN identification for relatively younger students, given a baseline prevalence of 7.2%. For SEN-LDD, the RF effect of −0.014 implies a 21% relative increase based on a baseline prevalence of 6.6%. In Grade 9, the RF estimate for SEN total is −0.018, representing a 25% increase over the same base prevalence. For SEN-LDD in Grade 9, the RF effect of −0.018 corresponds to a 27% higher probability of identification for younger students, relative to a base rate of 6.7%. Even the effect on SEN-Learning in Grade 9 (RF: −0.012), while only marginally significant (p < .10), implies a 34% relative increase based on a base prevalence of 3.5%.
The larger RDD estimates indicate that RAE intensify around the cutoff, suggesting that contrasts between the youngest and oldest students within narrowly defined cohorts yield especially strong differences in SEN identification. For SEN-Learning in Grade 9, the RDD RF estimate implies an increase of about 4.1 percentage points in the probability of identification around the cutoff; relative to a mean prevalence of 3.5%, this corresponds to a relative increase of approximately 117%.
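The relative increases quoted above follow from dividing the absolute estimate by the baseline prevalence. A minimal sketch of that arithmetic, using only the coefficients and base rates stated in the text (the function name and labels are illustrative, not the authors'):

```python
# (label, estimate, baseline prevalence) as quoted in the text
reported = [
    ("Grade 4, SEN total (RF)", -0.015, 0.072),
    ("Grade 4, SEN-LDD (RF)", -0.014, 0.066),
    ("Grade 9, SEN total (RF)", -0.018, 0.072),
    ("Grade 9, SEN-LDD (RF)", -0.018, 0.067),
    ("Grade 9, SEN-Learning (RF)", -0.012, 0.035),
    ("Grade 9, SEN-Learning (RDD RF)", -0.041, 0.035),
]


def relative_increase(estimate, base_rate):
    """Percent increase in identification probability for younger students."""
    return round(abs(estimate) / base_rate * 100)


for label, estimate, base_rate in reported:
    print(f"{label}: {relative_increase(estimate, base_rate)}%")
```

Small base rates magnify the relative figures: the same absolute change of about one to two percentage points translates into a much larger percentage for SEN-Learning (3.5%) than for SEN total (7.2%).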
ADHD
Table 5 presents the results of IV regressions estimating the causal effect of relative age at school entry on the probability of being diagnosed with ADHD. As before, the LATE, RF, and RDD estimates are reported, with and without the inclusion of covariates.
Instrumental Variable Regressions (Two-Stage Least Squares) and Regression Discontinuity Design Estimating the Effect of Relative Age at School Entry on ADHD Diagnosis – Regression Coefficients and Standard Errors (IQB 2016 – Grade 4; IQB 2015 – Grade 9).
Note. LATE = Local Average Treatment Effect; RF = Reduced Form estimate. RDD RF = Regression Discontinuity Design Reduced Form estimate (± 1 month around cut-off). Non-standardized regression coefficients; clustered standard errors on the school level in parentheses. All models include fixed effects for month of birth and federal state. Models in Columns 3 and 4 include student-level control variables (gender, HISEI, migration background). Bold values indicate statistically significant effects.
*p < .1; **p < .05; ***p < .01.
For fourth-grade students (IQB 2016), the results indicate a statistically significant negative association between relative age and the likelihood of receiving an ADHD diagnosis. Specifically, relatively younger students are about 1.0 to 1.4 percentage points more likely to be diagnosed with ADHD, depending on the model specification (LATE: −0.013 to −0.014; RF: −0.010 to −0.011; p < .05). These estimates are consistently significant across specifications, suggesting a robust pattern that aligns with previous evidence on relative age effects in behavioral diagnoses.
When interpreted in relation to the observed base rate, the effect of relative age on ADHD diagnosis in Grade 4 is substantial. The prevalence of ADHD in the sample is only 2.1% (397 cases among 18,877 students), yet the RF estimate indicates that relatively younger children are 1.1 percentage points more likely to be diagnosed with ADHD (RF: −0.011). This corresponds to a relative increase in the probability of diagnosis by approximately 52%, meaning that younger children are 52% more likely to receive an ADHD diagnosis compared to their older peers in the same cohort.
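The arithmetic behind the 52% figure can be written out as follows, using only the numbers quoted above; the symbols are ad hoc labels, not the authors' notation.

```latex
\[
\text{prevalence} = \frac{397}{18{,}877} \approx 0.021,
\qquad
\frac{\lvert \widehat{\mathrm{RF}} \rvert}{\text{prevalence}}
  = \frac{0.011}{0.021} \approx 0.52 .
\]
```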
In contrast, for ninth-grade students (IQB 2015), the results show no statistically significant effect of relative age on ADHD diagnosis (LATE and RF estimates ranging from −0.002 to −0.003; all p > .10). The point estimates remain negative, but the coefficients are smaller and the confidence intervals are wider, likely reflecting both a reduction in the strength of the age effect over time and a smaller number of identified ADHD cases (N = 248 in Grade 9).
For both samples, the RDD RF estimates for ADHD are comparable to the corresponding LATE and RF estimates (IQB 2016: RDD: −0.010; p < .01; IQB 2015: RDD: −0.003; p > .10).
Discussion
While numerous studies have examined RAE in areas such as academic and psychosocial development, research on its impact on SEN classification remains limited. Our study provides causal evidence that relative age at school entry affects the likelihood of students being identified with SEN and diagnosed with ADHD in Germany. Using data from two National Assessment Studies and a 2SLS approach, we show that relatively younger students within a grade cohort are substantially more likely to receive these classifications in certain domains.
Summary and Interpretation of the Findings
Special Educational Needs
For all SEN students and especially in the domain of LDD, the probability of being identified increases by up to 2 percentage points for relatively younger students, equivalent to a relative increase of approximately 21% in Grade 4 and 25%–27% in Grade 9. In other words, being one year younger at school entry increases the probability of being classified with SEN by up to 27%. These magnitudes are in line with prior causal estimates from the U.S. context (Dhuey & Lipscomb, 2010), where teacher evaluations also play a central role in the SEN referral process.
When disaggregating the results by SEN subcategories, the estimated effects remain directionally consistent but lose statistical significance in the individual domains. For SEN-Learning in Grade 9, the effect reaches significance only at the 10% level. However, the RDD RF estimates around the cutoff are substantial and statistically significant, indicating a pronounced local effect among students born close to the cutoff.
The statistical significance of effects observed in Grade 9 is likely driven by the larger available sample size. Given the similarity in the magnitude and direction of the point estimates across grades, as well as the substantial overlap in confidence intervals, the estimates for Grades 4 and 9 should be considered statistically comparable. At the same time, the consistent pattern of precisely estimated effects in Grade 9 warrants closer attention, as it suggests that RAE do not fade out after primary school but may persist—and potentially intensify—over the course of secondary education.
Although most maturational differences diminish over time as younger students catch up developmentally (Urruticoechea et al., 2021), the SEN status often persists and shapes students’ educational trajectories through lowered academic expectations, differentiated learning goals, and even placement in segregated settings. What begins as a temporary maturational gap may thus solidify into a structural disadvantage—highlighting the enduring impact of early diagnostic decisions that do not account for relative age (Algraigray & Boyle, 2017; Norwich, 2014).
Several interrelated mechanisms may explain this amplification over time. First, students who are already perceived as having difficulties in primary school may not receive a formal SEN diagnosis until after the transition to secondary school, where institutional demands increase and diagnostic classifications become more prevalent (Klemm et al., 2023). Second, early developmental differences may trigger a Matthew effect (Walberg & Tsai, 1983; Sideridis, 2011): students who start school with maturity-related advantages tend to accumulate academic and social gains over time, whereas relatively younger peers struggle to compensate for the relative difference. These disparities may be reinforced by structural selection mechanisms, particularly through the assignment of an SEN status. This is especially evident in the category of SEN-Learning, where classification often entails reduced curricular demands. While these adaptations are intended to provide targeted support, they may inadvertently lower academic demands as well as teacher expectations, thereby limiting students’ opportunities for meaningful progress (De Boer et al., 2010; Kashikar et al., 2025). In the context of SEN classification, lowered expectations can lead to self-fulfilling prophecies, whereby students adjust their performance to align with perceived expectations. This, in turn, reinforces teachers’ assumptions about students’ abilities and developmental potential, a dynamic commonly referred to as the Pygmalion effect (Szumski & Karwowski, 2019; Wang et al., 2018).
Consequently, relatively younger students who are initially disadvantaged due to developmental immaturity may become locked into educational trajectories that perpetuate—rather than compensate for—early gaps. For the context of SEN diagnoses, these findings challenge the assumption that RAE naturally fade as students mature. Instead, they appear to become institutionally entrenched, calling for a critical reassessment of early diagnostic and placement decisions—particularly in systems characterized by early tracking and rigid SEN classification procedures.
ADHD
For ADHD, the results reveal a pronounced RAE in Grade 4, indicating that younger students within a cohort are substantially more likely to receive a diagnosis than their older peers. Specifically, students who are 12 months younger than their classmates are 52% more likely to be diagnosed with ADHD (an increase of approximately 1.0 to 1.4 percentage points), closely aligning with previous international findings (e.g., Elder, 2010; Holland & Sayal, 2018; Schwandt & Wuppermann, 2016). This suggests that age-related developmental immaturity plays a significant role in early behavioral assessments. By Grade 9, however, the effect diminishes and is no longer statistically significant, although the estimate remains negative.
Although the sample size of our study is relatively small, the findings closely relate to the results reported by Schwandt and Wuppermann (2016). They found that children born just before the school entry cutoff are more than 30% more likely to be diagnosed with ADHD and to receive medication. Moreover, their findings suggest that diagnostic quality helps to explain RAEs: in regions with more specialized physicians, misdiagnosis rates tend to be lower, echoing evidence from Denmark, where only specialized physicians are allowed to conduct ADHD assessments (Dalsgaard et al., 2012). Importantly, Schwandt and Wuppermann (2016) emphasize teacher-related factors regarding misdiagnoses. Increases in class size or student diversity correlate with higher ADHD rates, supporting the view that teacher referrals may inadvertently drive RAE. Consistent with this, previous research suggests that teachers’ evaluations of behavior are more strongly affected by relative age than parental assessments (Childs & Wooten, 2022; Holland & Sayal, 2018).
Age-related differences in self-regulation, attention, and impulse control may be pathologized when teacher expectations are implicitly shaped by intra-cohort comparisons within classrooms. Hence, the strong RAE observed for ADHD in early schooling underscores the need for developmentally sensitive and professionally guided assessment practices. It raises concerns not only about labeling effects, as outlined above in the context of SEN, but also about the potentially harmful side effects of medication use resulting from ADHD misdiagnosis.
Implications for Educational Practice
Studying RAE in education is essential for uncovering systemic labeling dynamics and biases in categorical special education systems (Connor et al., 2008; Norwich, 2014). The finding that younger students are disproportionately identified with SEN and ADHD due to age-related developmental differences raises serious concerns about the fairness and validity of current diagnostic practices. These patterns should not be attributed to individual bias alone but understood as the result of structurally embedded processes—where decisions about special education needs are shaped by institutional routines, administrative constraints, and stratified school systems (Norwich, 2014). A structural reliance on teacher judgments, in the absence of standardized or externally moderated diagnostic tools, may lead to an increased risk of misclassification. Supporting this view, evidence from Switzerland and Denmark shows no significant RAE in the diagnosis of ADHD or learning disabilities—likely due to the stronger involvement of external specialists and standardized procedures, which reduce reliance on subjective classroom-based referrals (Balestra et al., 2020; Dalsgaard et al., 2012). This underscores the importance of implementing bias-mitigating strategies in the evaluation of academic and behavioral development. Such strategies may include delaying high-stakes classification decisions, applying age-normed reference standards, and strengthening the role of independent, multidisciplinary teams or professionals in the diagnostic process. On a systemic level, these findings call into question the continued reliance on rigid status diagnoses as the main gateway to support—highlighting the need for more flexible, needs-based approaches that do not depend on categorical labels (Norwich, 2014). 
Instead of assigning support based on classifications, educational systems should prioritize formative, individualized assessments that guide learning and development through personalized support planning. Such inclusive approaches offer a more equitable and developmentally appropriate pathway to ensure that all students’ needs are addressed without resorting to potentially stigmatizing status labels. To this end, teacher education should emphasize inclusive practices that support individual development and foster awareness with regard to labeling practices.
Strengths and Limitations of the Study
A major strength of this study lies in its ability to estimate causal effects of relative age at school entry on the likelihood of SEN and ADHD classification—an important advancement in a field where much of the existing research remains correlational. The use of an IV approach based on state-specific cutoff dates allows for a more rigorous identification of age-related effects, mitigating concerns of endogeneity. In addition, the study draws on data from two large-scale assessments in Germany, enhancing both the robustness and generalizability of the findings within the national context.
Nonetheless, several limitations must be acknowledged. First, the analyses are based on two cross-sectional datasets, which restricts causal insights into the long-term stability of SEN or ADHD classification and the developmental trajectories of affected students. Access to longitudinal, individual-level population data would allow for more nuanced analyses of cumulative disadvantage over time. Second, the dataset includes substantial item nonresponse, especially on key sociodemographic variables, which may introduce bias and reduce statistical power. Also, the sample size for the specific subgroups is particularly limited, making subgroup analyses more tentative. A further limitation relates to potential spillover effects at the classroom or school level. If delayed school entry occurs systematically in certain schools, individual outcomes may be influenced not only by a student’s own relative age but also by the relative age composition of the classroom, for example through comparison processes in teacher evaluations.
Finally, while the IV strategy effectively addresses endogeneity in school starting age, it does not differentiate between types of non-compliance. It remains unclear whether non-compliers are predominantly students who were retained in grade or deliberately redshirted. This distinction is relevant for understanding heterogeneity in treatment effects and should be explored in future research.
Ethical Considerations
This study uses secondary data from the Institute for Educational Quality Improvement (IQB) education trend (IQB-Bildungstrend). The data were collected and provided by IQB in accordance with applicable data protection regulations and ethical standards. All data are fully anonymized and do not allow identification of individual participants. The use of the data for secondary analysis complies with the terms of data access provided by IQB. According to institutional and national guidelines, no additional ethical approval was required for the present study, as it is based on anonymized secondary data.
Supplemental Material
Supplemental material, sj-docx-1-ecx-10.1177_00144029261441922 for Does the Early Bird Catch the Label? Relative Age Effects in the Assessment of Special Educational Needs and ADHD: Evidence from Germany by Janka Goldan and Franz G. Westermaier in Exceptional Children
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability
The data were made available by the Research Data Centre at the Institute for Educational Quality Improvement (FDZ at IQB).
Supplemental Material
Supplemental material for this article is available online.
References
