Abstract
A negative correlation between schools’ migrant share and students’ educational outcomes has been described in multiple contexts, including Spain. In this article, we concentrate on testing the implications of one of the main mechanisms explaining this relationship, which pays attention to the share of migrants who are not proficient in the language of instruction. Spain represents an interesting case due to the significant presence of migrants born in Latin American countries, who are Spanish native speakers. By exploiting the different shares of Spanish-speaking and non-Spanish-speaking migrants across schools in Spain, we are able to test whether the share of non-Spanish native speakers (rather than the share of migrant students) affects students’ test scores in math. Our results show that the concentration of non–Latin American migrant students is significantly and negatively associated with students’ math test scores, although the effect is very small.
One of the most researched topics in the literature on school effects is the relationship between the composition of schools’ student body in terms of social class, ethnicity or migrant origin, and students’ academic outcomes. A negative correlation between schools’ migrant share and achievement has been described in multiple contexts, including Spain (Cebolla-Boado & Garrido-Medina, 2011), although this correlation has been partially or entirely explained by factors that shape both the sorting of students across schools and students’ learning outcomes, most notably families’ economic resources. In general, most studies have tried to identify whether there is a causal link between schools’ migrant share and students’ academic outcomes, controlling for other relevant individual characteristics. The findings are mixed and vary across national contexts, with some studies showing negative effects (Brunello & Rocco, 2013; Gould et al., 2009; Hardoy & Schøne, 2013; Jensen & Rasmussen, 2011), no effects (Cebolla-Boado, 2007; Conger, 2015; Geay et al., 2013; Hardoy et al., 2018), or even positive effects (Hermansen & Birkelund, 2015). While this literature has mostly concentrated on finding identification strategies to measure the causal effect of schools’ migrant share on achievement, little attention has been paid to testing the underlying mechanisms explaining this relationship, with some exceptions; see, for example, Cebolla-Boado and Garrido-Medina (2011).
In this article, we focus on a specific mechanism that could explain the (typically negative) causal relationship between schools’ migrant concentration and achievement described in previous studies: schools with a high proportion of students who do not speak the language of instruction might slow classroom learning and place an additional burden on teachers, who would adapt their pedagogical practices to students with low language proficiency. By lowering the level of instruction, teachers’ behavior might in turn affect the academic performance of both native and migrant students in the same classroom. This causal mechanism is tested indirectly, given that indicators of teachers’ level of instruction across schools are rarely available. We therefore examine the implications of this mechanism by exploiting the different distributions of Spanish-speaking (Latin American) and non-Spanish-speaking (non–Latin American) migrant students across schools in Spain.
Spain represents an interesting case to test the empirical validity of this argument due to the significant presence of migrants born in Latin American countries, the large majority of whom are native Spanish speakers. If being less proficient in the language of instruction is the main mechanism explaining the negative relationship between schools’ migrant concentration and students’ achievement, we expect this negative relationship to be associated only with the concentration of non-Spanish-speaking migrants (i.e., non–Latin American migrant students), but not with the concentration of Spanish-speaking migrants (i.e., Latin American students). Previous research has measured the impact of the concentration of students with low proficiency in the language of instruction on achievement. For example, in the American context, Cho (2012) finds negative effects of having English language learner (ELL) classmates on non-ELL students’ reading scores, but not on their math scores; Diette and Uwaifo Oyelere (2017) identify small negative effects of the share of non-English-speaking students on high-achieving natives. In contrast, Geay et al. (2013) do not find any significant causal effects of having nonnative English-speaking peers on students’ educational outcomes in England; in this case, the initial negative correlation observed in the raw data was explained by the fact that nonnative English speakers were more likely to attend schools with English native speakers from disadvantaged backgrounds.
Previous research has also investigated heterogeneous peer effects in education (see Sacerdote, 2011, for a review), acknowledging that students’ response to their school environment is not homogeneous and can vary depending on students’ own characteristics. While scholars have typically paid attention to the differential impact of schools’ migrant composition on migrant and native students, fewer studies have examined how students with different levels of ability react to the same school environment. A notable exception is the work by Diette and Uwaifo Oyelere (2017), who identify negative effects (albeit small) of the share of non-English-speaking students on high-achieving natives, but positive effects for native students with average or low academic achievement. In this regard, our article pays attention to the heterogeneous effects that different shares of Spanish-speaking and non-Spanish-speaking migrants have on low-, average-, and high-performing students.
In sum, this article contributes to the literature on the effects of migrant school concentration by exploring whether being less proficient in the language of instruction is the main mechanism behind the typically negative relationship between schools’ migrant share and students’ achievement. In addition, we examine whether there is a differential impact of schools’ migrant composition on students with different levels of ability.
School Effects and Migrant Concentration
The literature on school effects has sought to explain why students’ learning outcomes vary across schools and the extent to which this variation can be explained by (1) the clustering of students with different backgrounds across schools, which in turn generates different peer group effects; (2) schools’ characteristics, including schools’ resources, organizational practices, and climate; and (3) teachers’ characteristics, including their skills, effort, and pedagogical styles. The results on the impact of school effects on learning outcomes have been mixed, but there is now consensus that a large part of the between-schools variation in attainment is due to the uneven sorting of students across schools.
Although the debates regarding the causal impact of school-level factors on attainment have been vibrant, their empirical testing has most often been limited to quantifying the residuals after controlling for the composition of the student body in terms of class, ethnicity, or family background (Hanushek et al., 2003), with little theory-driven research (Scheerens, 2013). Experimental research designs are still scarce, with some notable exceptions, for example, Duflo et al. (2011) and Zimmerman (2003).
In trying to untangle the complexity of school effects, the seminal distinction between so-called “Type A school effects” and “Type B school effects” (Raudenbush & Willms, 1995) is of key importance. Type A school effects refer to the role of peer effects, which are conditioned by the composition of the student body. Isolating the influence of the peer group composition on students’ outcomes is, however, challenging. The main two strategies followed by researchers have been either including student-level or school-level fixed effects or relying on an exogenous shock to the social composition of peer groups and/or schools (Sacerdote, 2011). The different empirical strategies used to identify the role of school composition on students’ outcomes partially explain the lack of consensus regarding the findings. Type B school effects refer to structural and institutional differences across educational systems and schools, including material and human resources as well as schools’ organizational practices. These explanations have largely been the focus of more policy-oriented research (Organisation for Economic Co-operation and Development, 2012).
The research on the impacts of schools’ migrant concentration on learning outcomes has a long tradition of exploring nonlinear effects. In this regard, multiple studies have shown that changes in school composition do not equally affect all students; for example, high-achieving students tend to experience the largest peer effects from other high-achieving peers (Sacerdote, 2011). Some studies have also shown that migrant students tend to experience larger negative effects of migrant concentration compared with nonmigrant students (Ballatore et al., 2018), although other scholars find opposite results (e.g., Jensen & Rasmussen, 2011).
Schools’ Migrant Concentration and Students’ Learning Outcomes: Mechanisms Explaining This Relationship
The negative correlation between schools’ migrant concentration and students’ academic outcomes has been partially or entirely explained by factors that influence both the sorting of students across schools and students’ learning outcomes, such as families’ economic resources (Cebolla-Boado, 2007). Yet several studies have shown that migrant concentration can have a causal impact on students’ cognitive and noncognitive outcomes above and beyond socioeconomic factors (Brunello & Rocco, 2013). Crucially, migrant students might influence other students’ academic performance (1) through negative peer interactions that affect the classroom environment or (2) by influencing teachers’ behavior (Hermansen & Birkelund, 2015).
During their adaptation process to a new educational setting and social environment, migrant students can lag behind their nonmigrant peers, particularly when they are not proficient in the language of instruction and arrive at older ages (Lemmermann & Riphahn, 2018; Rumbaut, 2004). If they are not able to catch up with their peers and feel excluded as a consequence, they might behave disruptively and worsen the classroom’s climate. This mechanism is, however, difficult to reconcile with the large literature showing that migrant children tend to be more ambitious and have higher educational aspirations than their nonmigrant peers; see, for example, Kao and Tienda (1998) for the United States and Gil-Hernández and Gracia (2018) for Spain.
An alternative mechanism explaining the relationship between schools’ migrant share and students’ academic outcomes would operate through changes in teachers’ behavior. A high share of migrant students might have an impact on learning outcomes by inducing changes in teachers’ level of instruction. Teachers might adjust their level of instruction and teaching practices to those students experiencing more learning difficulties due to factors such as low language proficiency (Gándara et al., 2003; Hunt, 2017). While disadvantaged students might benefit from this adjustment to their needs, it can be detrimental for more advanced students, who might not be able to fulfill their academic potential. At the same time, several studies have shown that teachers tend to exit schools with a high concentration of disadvantaged and lower achieving children (Hanushek et al., 2003), which in turn can widen the differences in attainment between schools with high and low concentrations of migrant students due to differences in teachers’ quality.
Hypotheses
A significant share of migrants in Spain originate from Latin American countries (Hierro, 2016) and hence are native Spanish speakers. If lack of proficiency in the destination country language is indeed a relevant mechanism explaining the causal relationship between schools’ migrant concentration and students’ academic outcomes, we expect to find no relationship between the school concentration of Spanish-speaking migrants and students’ performance. From this argument, the following two hypotheses follow:
Hypothesis 1a: Schools’ share of Latin American migrant students is not associated with students’ math test scores.
Hypothesis 1b: Schools’ share of non–Latin American migrant students is negatively associated with students’ math test scores.
The concentration of non-Spanish-speaking migrant students might have a larger detrimental effect on high-ability students than on average and low-ability students. If teachers adjust their level of instruction to those children experiencing more difficulties due to their low language proficiency, those with higher levels of ability might be less likely to fulfil their potential due to a less stimulating learning environment. The following two hypotheses derive from the previous argument:
Hypothesis 2a: Schools’ share of Latin American migrant students is not associated with students’ math test scores, irrespective of whether they are high-, average-, or low-achieving students.
Hypothesis 2b: The negative association between schools’ share of non–Latin American migrant students and students’ math test scores is stronger for high-achieving students than for average- and low-achieving students.
Finally, as a robustness check, we examine whether the relationship between schools’ migrant concentration and students’ academic outcomes differs for Latin American and native students. If lack of proficiency in Spanish language is the main mechanism driving the negative correlation between schools’ migrant share and students’ performance, we expect the schools’ share of non–Latin American migrants to have the same effect on both native and Latin American students’ outcomes.
Data and Method
We use the Encuesta General de Diagnóstico (General Education Survey, EGD), a survey of a representative sample of students in secondary schools conducted by the Spanish Ministry of Education in 2011. The EGD 2011 measures students’ performance in math, Spanish, physics, and social sciences and is the only nationally representative survey in Spain allowing the identification of students’ country of birth. The sample is made up of 27,961 students clustered in 933 schools.
The dependent variable measures students’ standardized test score in math. Since math classes and tests are likely to be less dependent on students’ Spanish language proficiency, we focus on this specific subject. We use quantile regressions (QR) to estimate the impact of migrant concentration across students with different levels of performance in the math test. The main advantage of QR over standard least squares regression models is that QR does not assume that the association between the explanatory and dependent variables is the same at all points of the outcome distribution, so it is better suited to capturing relationships that differ between low- and high-achieving students. We use bootstrapping and cluster-robust standard errors at the school level to improve the reliability of significance tests.
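For reference, the quantile regression estimator can be written in standard notation. The symbols below are ours rather than the article's: $y_i$ is the math score of student $i$, $x_i$ the vector of covariates, and $\tau$ the quantile of interest (for example, $\tau = 0.5$ for the median).

```latex
\hat{\beta}(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_{\tau}\!\left(y_i - x_i^{\top}\beta\right),
\qquad
\rho_{\tau}(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr)
```

Because the check function $\rho_{\tau}$ weights positive and negative residuals asymmetrically, a separate coefficient vector is obtained for each $\tau$, which is what allows the association with migrant share to differ across the achievement distribution.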
Students’ ability is proxied by their performance in the standardized math tests. The results of the QR models are presented for students in the 10th and 20th percentiles of the math test achievement distribution (low-achieving students), for those in the median, and for those in the 80th and 90th percentiles of the achievement distribution (high-achieving students). The model also includes the following control variables: type of school (public or private), as the relative weight of immigrants and natives in the public and private sectors has become increasingly different in Spain (Cebolla-Boado & Garrido-Medina, 2011); schools’ average socioeconomic status (school-level variable indicating the average socioeconomic status of students attending a given school); and students’ socioeconomic status (synthetic index of household resources, a composite variable provided by the survey that combines information on parental education and occupation, household resources, and number of books in the household). Alternative QR models, including students’ gender, family structure, and school resources as additional control variables, have also been estimated, but the results are consistent with those yielded by the simpler models (results available on request).
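The estimation strategy described in this section (quantile fits at several percentiles, with standard errors obtained by resampling whole schools) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than drawn from the EGD, there is a single regressor (school migrant share) instead of the full set of controls, the function names are hypothetical, and the exhaustive solver is a stand-in for a dedicated routine (for example, `QuantReg` in statsmodels) that would be used on real data.

```python
import random
from itertools import combinations


def pinball_loss(residual, tau):
    """Check loss: tau * u when u >= 0, (tau - 1) * u when u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def fit_quantile_line(points, tau):
    """Return (intercept, slope) of the tau-th quantile line y = a + b * x.

    With a single regressor, some line through two observations with
    distinct x minimizes the total check loss, so an exhaustive search over
    pairs is exact, though only practical for small samples.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # same share (e.g. same school): no line is defined
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(pinball_loss(y - (intercept + slope * x), tau)
                   for x, y in points)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


def cluster_bootstrap_se(schools, tau, reps=30, seed=0):
    """Standard deviation of the slope across school-level resamples."""
    rng = random.Random(seed)
    ids = sorted(schools)
    slopes = []
    while len(slopes) < reps:
        drawn = [rng.choice(ids) for _ in ids]  # resample whole schools
        if len({schools[s][0][0] for s in drawn}) < 2:
            continue  # degenerate draw: every pupil has the same share
        sample = [pt for s in drawn for pt in schools[s]]
        slopes.append(fit_quantile_line(sample, tau)[1])
    mean = sum(slopes) / reps
    return (sum((b - mean) ** 2 for b in slopes) / (reps - 1)) ** 0.5


def make_schools(shifts=(0,) * 8):
    """Synthetic schools: share x = 0, 2, ..., 14 (%), five pupils each.

    Without shifts, the 10th, 50th and 90th percentile lines have slopes
    -1, -1.5 and -2 by construction.
    """
    schools = {}
    for s, shift in enumerate(shifts):
        x = 2 * s
        schools[s] = [(x, 500 - 1.5 * x + e * (10 - 0.25 * x) + shift)
                      for e in (-2, -1, 0, 1, 2)]
    return schools


clean = make_schools()
pupils = [pt for school in clean.values() for pt in school]
for tau in (0.1, 0.5, 0.9):
    a, b = fit_quantile_line(pupils, tau)
    print(f"tau={tau}: intercept={a:.1f}, slope={b:.2f}")

# School-specific shifts make the slope vary across bootstrap resamples.
noisy = make_schools(shifts=(3, -2, 1, -4, 2, -1, 4, -3))
print("bootstrap SE of the median slope:", cluster_bootstrap_se(noisy, 0.5))
```

Resampling schools rather than individual students keeps classmates together in each replicate, which is the reason for clustering the standard errors at the school level: students in the same school share the same migrant share and are unlikely to be independent observations.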
The Distribution of Migrant Students Across Spanish Schools
Figure 1 shows the distribution of migrants across the schools included in the sample, differentiating between Latin American and non–Latin American migrant students. Just below a third of schools had no Latin American students and 38% of schools had no non–Latin American migrant students. There is a positive correlation between schools’ (Latin American and non–Latin American) migrant share and the average socioeconomic status of schools’ student body, which is slightly above 0.13 in both cases.

Figure 1. Distribution of migrants (Latin Americans and others) across schools in the sample.
Table 1 provides a descriptive summary of the variables used in the empirical analyses.
Table 1. Summary Statistics.
Source. Encuesta General de Diagnóstico, Spanish Ministry of Education.
Results
The first step of the empirical analysis involves the estimation of a QR model with no control variables that only includes the school share of non–Latin American and Latin American migrant students as covariates (Figure 2). The two charts depicted in Figure 2 show the negative relationship between students’ math test scores and schools’ share of Latin American and non–Latin American migrants when no control variables are included. It is clear that the negative correlation between students’ math test scores and the share of non–Latin American migrants is substantially larger than the correlation between math test scores and the share of Latin American students. In addition, this negative relationship is more pronounced among high-achieving students than among average and low-achieving students. Among low-achieving students, a 1 percentage point increase in the share of non–Latin American migrants is associated with a 1-point decrease in math test scores (note, however, that the estimated coefficient is quite small, given that math test results range from 141 to 841 points). Among high-achieving students, a 1 percentage point increase in the share of non–Latin American migrants is associated with a 2-point decrease in math performance.
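To put these unadjusted coefficients in perspective, they can be expressed as a fraction of the reported score range. The last line uses a 10 percentage point rise in migrant share, which is a hypothetical value chosen only for illustration and not a figure from the article.

```latex
\begin{aligned}
\text{score range} &= 841 - 141 = 700 \\
\text{low achievers: } \frac{1}{700} &\approx 0.14\% \text{ of the range per percentage point} \\
\text{high achievers: } \frac{2}{700} &\approx 0.29\% \text{ of the range per percentage point} \\
\text{hypothetical 10-point rise in share: } \frac{10}{700} &\approx 1.4\%, \quad \frac{20}{700} \approx 2.9\%
\end{aligned}
```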

Figure 2. Relationship between students’ math test scores and schools’ share of Latin American and non–Latin American migrant students.
Figure 3 estimates the relationship between schools’ migrant share and students’ math test scores after adding to the model the control variables measuring students’ socioeconomic status, schools’ average socioeconomic status, and school type (public or private). The inclusion of these covariates allows us to control for relevant factors that are known to influence students’ academic achievement and their uneven distribution across schools. As shown in Figure 3, controlling for these three covariates has a dramatic effect on the coefficients estimating the effect of the share of Latin American and non–Latin American migrant students on students’ math test scores, which are substantially reduced in size compared with Figure 2.

Figure 3. Relationship between students’ math test scores and schools’ share of Latin American and non–Latin American migrant students.
Crucially, the school share of Latin American students is no longer significant for students’ performance in math; that is, once we take into account the type of school, students’ socioeconomic status, and the average socioeconomic status of schools’ student body, the presence of Latin American migrants is not significantly associated with students’ math test scores. In contrast, the concentration of non–Latin American migrant students remains negatively associated with students’ math performance, although the effect is notably reduced in size compared with the model with no control variables depicted in Figure 2. In addition, we do not find evidence that this negative effect varies in size across students in different positions of the achievement distribution. In consequence, based on results presented in Figure 3, we find evidence in favor of Hypothesis 1a (schools’ share of Latin American students is not associated with students’ test scores), Hypothesis 1b (schools’ share of non–Latin American migrant students is negatively associated with students’ test scores, even though the effect is very small) and Hypothesis 2a (schools’ share of Latin American migrant students is not associated with students’ math test scores, irrespective of whether they are high-, average-, or low-achieving students). In contrast, Hypothesis 2b is rejected (the effect of the schools’ share of non–Latin American migrant students is as negative for the performance of high-achieving students as for average- and low-achieving students). Even though we find statistically significant evidence in favor of Hypothesis 1b, it is important to bear in mind that the effect of the concentration of non–Latin American migrants on students’ math test scores is very small.
Robustness Check
Finally, as a robustness check, we examine whether the relationship between the school concentration of non–Latin American migrant students and students’ math test scores holds for both native and Latin American students, who are both Spanish native speakers regardless of their country of origin. To do so, we include interaction terms between our two categories of native Spanish speakers (Latin American migrants and natives) and the share of non–Latin American migrants at the school level (see Table A.3 in the appendix).
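One way to write the interaction model, in our own notation and assuming that the "Latino" row of Table A.3 is a student-level indicator for Latin American origin, is the following, where $S_s$ is the share of non–Latin American migrants in school $s$, $L_i$ the indicator for student $i$, and $Z_{is}$ the control variables:

```latex
Q_{\tau}\!\left(y_{is} \mid S_s, L_i, Z_{is}\right)
  = \alpha(\tau) + \beta_1(\tau)\, S_s + \beta_2(\tau)\, L_i
  + \beta_3(\tau)\, \bigl(L_i \times S_s\bigr) + \gamma(\tau)^{\top} Z_{is}
```

Under this specification, the association between $S_s$ and test scores is $\beta_1(\tau)$ for native students and $\beta_1(\tau) + \beta_3(\tau)$ for Latin American students, so a $\beta_3(\tau)$ that cannot be distinguished from zero indicates that the association is the same for both groups.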
In line with our expectations, the concentration of non–Latin American migrants has the same small effect on the test scores of natives and of Latin American students, given that the interaction terms are not statistically significant. Thus, a 1 percentage point increase in the school share of non–Latin American migrants is associated with a 0.3-point decrease in the test scores of both average- and low-achieving (50th and 20th percentile of the achievement distribution, respectively) natives and Latinos. Note, however, that this effect, though significant, is very small.
Conclusion
The relationship between schools’ migrant concentration and students’ academic and nonacademic outcomes has been one of the most prominent research topics in both sociology and economics. The debate regarding the effects of migrants’ concentration on outcomes has also gained relevance beyond academic circles, partly because of the alleged negative impact that the overrepresentation of migrant minorities in schools has on student performance. While the literature on school concentration has identified different potential explanations for the negative correlation between students’ performance and schools’ migrant share, in this article we concentrate on one of these mechanisms, that is, a high share of students with lower proficiency in the language of the destination country might lower teachers’ level of instruction, thus negatively affecting the academic performance of students with high language proficiency.
Spain represents an interesting case to test the empirical validity of this argument due to the significant presence of migrants born in Latin American countries, the large majority of whom are native Spanish speakers. The causal mechanism is tested indirectly, as we do not have direct indicators of teachers’ level of instruction across schools. Therefore, we rely on the schools’ share of Latin American and non–Latin American migrant students to examine the implications of the previous argument. After controlling for students’ socioeconomic background, the socioeconomic composition of the student body, and type of school, we find no significant effect of Latin Americans’ school concentration on students’ math performance. In contrast, the presence of non–Latin American migrant students (most of whom are nonnative Spanish speakers) is negatively associated with students’ performance, although the effect is very small. While we cannot ignore the negative effect that a large share of non-Spanish-speaking students has on students’ performance, we argue that the concentration of migrant students with poor Spanish language skills is not a major predictor of students’ performance, as its effect on test scores is, albeit significant, very small. Immigrant concentration is generally depicted as a source of disadvantage, particularly in public debates, but this negative relationship is mostly related to the fact that both migrants and economically disadvantaged students tend to cluster in the same schools.
Appendix
Table A.3. Quantile regression: Interactions.
| | β | SE |
|---|---|---|
| Q10 | ||
| Non-Latino migrants (%) | −0.19 | 0.26 |
| Latino (%) | −9.18 | 0.08 |
| Latino × non-Latino migrants (%) | −0.13 | 0.74 |
| Schools’ average socioeconomic status | 0.20*** | 0.00 |
| Students’ socioeconomic status | 14.89*** | 0.00 |
| Public school | −2.94 | 0.24 |
| Constant | 408.97*** | 0.00 |
| Q20 | ||
| Non-Latino migrants (%) | −0.32* | 0.04 |
| Latino (%) | −10.56** | 0.00 |
| Latino × non-Latino migrants (%) | −0.09 | 0.76 |
| Schools’ average socioeconomic status | 0.21*** | 0.00 |
| Students’ socioeconomic status | 16.58*** | 0.00 |
| Public school | −2.89 | 0.18 |
| Constant | 441.90*** | 0.00 |
| Q50 | ||
| Non-Latino migrants (%) | −0.34* | 0.05 |
| Latino (%) | −11.01*** | 0.00 |
| Latino × non-Latino migrants (%) | −0.12 | 0.70 |
| Schools’ average socioeconomic status | 0.24*** | 0.00 |
| Students’ socioeconomic status | 21.79*** | 0.00 |
| Public school | −1.96 | 0.42 |
| Constant | 504.98*** | 0.00 |
| Q80 | ||
| Non-Latino migrants (%) | −0.15 | 0.53 |
| Latino (%) | −12.60** | 0.00 |
| Latino × non-Latino migrants (%) | −0.64 | 0.06 |
| Schools’ average socioeconomic status | 0.28*** | 0.00 |
| Students’ socioeconomic status | 26.01*** | 0.00 |
| Public school | −2.15 | 0.48 |
| Constant | 575.99*** | 0.00 |
| Q90 | ||
| Non-Latino migrants (%) | −0.31 | 0.20 |
| Latino (%) | −11.94* | 0.04 |
| Latino × non-Latino migrants (%) | −0.82 | 0.10 |
| Schools’ average socioeconomic status | 0.29*** | 0.00 |
| Students’ socioeconomic status | 29.35*** | 0.00 |
| Public school | −0.30 | 0.94 |
| Constant | 615.64*** | 0.00 |
| χ2 | 771.3 | |
| N | 25,096 | |
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This article and its open access publication are based upon work funded by COST Action 16111 EthmigSurveyData, supported by COST (European Cooperation in Science and Technology) and funded by the Horizon 2020 Framework Programme of the European Union; the University of Oxford; the Universidad Autónoma de Madrid; and the Casa de Velázquez in Madrid.
