Abstract
The present article investigates the relationship between the degree of tracking and inequalities in reading literacy of second-generation and non-immigrant students in 28 Western countries. The article takes into account that next to between-school tracking, there are also more subtle forms of tracking, such as tracking within schools or classes. By elaborating how the distinct mechanisms of different tracking characteristics generate achievement inequalities, I assume that any negative effects of tracking on second-generation immigrant students’ achievements are primarily driven by differences in the quality of school environments. Data from the Programme for International Student Assessment 2018 are used and multilevel regression analyses with country-fixed effects are applied. The findings reveal that a higher tracking degree is related to substantial disadvantages in reading literacy for immigrant children. Furthermore, a higher immigrant concentration in schools is associated with immigrant inequalities in reading performance as the degree of tracking increases, whereas unequal distributions of teacher and instructional quality were found to generate inequalities in countries with less tracking. Even though the results are only partly in line with the theory of tracking influences on immigrant achievement disadvantages, they suggest that the interplay between institutional tracking and school characteristics is crucial for learning inequalities.
Introduction
Second-generation immigrant students¹ still experience systematic performance disadvantages in many European countries (Borgna, 2017; Heath and Brinbaum, 2007; Heath et al., 2008; Schneeweis, 2011; Schnepf, 2007), despite passing through the education system in the same manner as their non-immigrant classmates. Education systems in host societies are an important tool for integration: they are expected to promote educational opportunities and to provide young people with the capabilities they need to cope in their future lives. Recent studies comparing immigrant educational inequalities between countries have increasingly sought to explain the immigrant disadvantage by referring to the structure of host country education systems (Borgna and Contini, 2014; Buchmann and Parrado, 2006; Cobb-Clark et al., 2012; Dronkers et al., 2014).
In that sense, educational tracking, that is, the ‘practice of assigning students into instructional groups on the basis of ability’ (Hallinan, 2000: 218), is regarded as a relevant education system characteristic that fosters unequal educational opportunities, especially for immigrant students (Alba et al., 2011; Crul and Vermeulen, 2003; Van de Werfhorst and Mijs, 2010). Even though there are numerous studies investigating the impact of tracking on immigrant achievement disadvantages in a comparative manner, the existing empirical evidence is far from clear. For example, some studies find a negative impact of early tracking on immigrant achievement inequalities (Cobb-Clark et al., 2012; Entorf and Lauk, 2008), whereas the number of tracks hardly affects immigrant performance inequalities (Fossati, 2011; Riederer and Verwiebe, 2015; Verwiebe and Riederer, 2013). In contrast, positive effects are found for immigrant students when ability grouping is applied in some subjects (Cobb-Clark et al., 2012) or when education systems are moderately stratified (Dronkers et al., 2014).
These diverse findings might be due to the fact that the conceptualization and related measurement of tracking differ between studies. The majority of studies refer to between-school tracking (BS-tracking), which involves the explicit and institutionalized selection of students into different school types or overarching academic or vocational programmes, and is commonly accompanied by separate curricula for all subjects. In countries where tracking occurs within schools, students are either sorted into academic or vocational ‘streams’, where distinct educational programmes are taught in the same building (WS-tracking); or students are tracked via more subtle forms of ‘ability grouping’ or ‘course-by-course-tracking’ (CBC-tracking), where students are sorted into different ability classes only for some subjects or within classes and subjects (Bol et al., 2014; Chmielewski, 2014; Chmielewski et al., 2013; Gamoran, 2000; Schofield, 2010; Trautwein et al., 2006). Fewer international studies examining the impact of tracking on immigrant achievement inequalities have explicitly addressed these less institutionalized forms of tracking, even though they are prevalent in many comprehensive education systems and provoke similar levels of social disparities in educational achievement (Chmielewski, 2014; Schmidt et al., 2015). Moreover, these findings challenge the idea of lower achievement inequalities in so-called comprehensive systems.
Given the mixed empirical evidence and the neglect of different tracking forms in research on immigrant achievement inequalities, the present article asks how the reading achievement gap between non-immigrant and second-generation immigrant students changes in response to the degree of tracking. Hence, the first contribution of the article is that it extends the common understanding of tracking by theoretically and empirically considering a more fine-grained classification of tracking, thereby including all forms of tracking that can occur. This approach enables the direct testing of the theoretically proposed mechanisms behind the harmful effects of tracking on immigrant achievement inequalities.
As a second contribution, the article explicitly includes school-level characteristics in order to examine the distinct features and consequences of each tracking type for the school environment. Research on tracking finds that between-school tracking reinforces social segregation in schools (Alegre and Ferrer, 2010) and leads to both fewer teaching resources (Brunello and Checchi, 2007) and poorer learning conditions in the lower tracks (Baumert et al., 2006). Yet, there is still limited evidence of tracking effects on immigrant achievement inequalities that takes the role of school characteristics into account (Dronkers et al., 2012). The article therefore aims to illuminate how the system and the school level are interrelated and whether this relationship varies between different tracking degrees.
This article uses data from the latest wave of the Programme for International Student Assessment (PISA) 2018 and assesses reading literacy of second-generation and non-immigrant students in 28 European and OECD countries. To assess the degree of tracking, the comprehensive tracking index based on Bol and Van de Werfhorst (2013) is further developed so that it additionally takes into account the extent to which tracking occurs between or within schools. To measure the extent to which school characteristics account for the tracking degree effects, I investigate the role of immigrant concentration, teacher qualification, instructional quality and school resources. The findings from two-level random intercept models with country-fixed effects suggest that on a general level, the degree of tracking is substantially related to reading performance disadvantages among second-generation immigrant students. Moreover, compared to non-immigrants, second-generation immigrant students’ reading literacy suffers more from a higher immigrant concentration in schools, whereas teaching and instructional quality widen immigrant achievement inequalities in countries with less tracking. These findings hence shed new light on the role of schools in generating immigrant achievement inequalities alongside the tracking degree.
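A two-level random intercept model with country-fixed effects of the kind mentioned above could, as a rough sketch, be written as follows. The notation is an assumption made for illustration and is not taken from the article: $i$ indexes students, $s$ schools and $c$ countries; $\mathrm{SG}_{isc}$ is the second-generation dummy, $T_c$ the country-level tracking degree, $\mathbf{x}_{isc}$ the student-level controls and $\mathbf{z}_{sc}$ the school-level characteristics. The cross-level interaction term is likewise assumed here as the way a country-level tracking measure can enter a model whose country dummies $\alpha_c$ absorb the main effect of $T_c$.

```latex
% Hypothetical specification; symbols are illustrative assumptions.
\begin{align}
y_{isc} &= \beta_0 + \beta_1\,\mathrm{SG}_{isc}
          + \beta_2\,(\mathrm{SG}_{isc} \times T_c)
          + \boldsymbol{\gamma}^{\top}\mathbf{x}_{isc}
          + \boldsymbol{\delta}^{\top}\mathbf{z}_{sc}
          + \alpha_c + u_{sc} + \varepsilon_{isc}, \\
u_{sc} &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{isc} \sim \mathcal{N}(0, \sigma_\varepsilon^2).
\end{align}
```

Under these assumptions, $\beta_1$ would describe the reading gap between second-generation and non-immigrant students at $T_c = 0$, and $\beta_2$ how that gap changes as the tracking degree increases; $u_{sc}$ is the school-level random intercept.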
Previous findings on the effect of tracking on immigrant achievement inequalities in cross-national research
There are numerous studies on the effect of tracking on immigrant achievement inequalities. However, these studies differ in their conceptualization and operationalization of tracking and most of these studies only consider one single indicator to assess the degree of tracking in a country. With regard to the timing of tracking, Cobb-Clark et al. (2012) found that a later selection age into school tracks is positively associated with reading, math and science achievements for some selected groups of second- and first-generation immigrant students. Similarly, Entorf and Lauk (2008) found that early tracking magnifies the reading gap between immigrant students from less privileged families and non-immigrant students. By contrast, a study using a difference-in-difference design did not find an overall effect of early tracking on the migrant performance gap; it only identified a negative impact for first-generation immigrant students and students with poor test-language proficiency (Ruhose and Schwerdt, 2016). With regard to the number of tracks, some studies have found that the effects on the immigrant achievement gap are either non-significant (Fossati, 2011) or not stable over time (Riederer and Verwiebe, 2015; Verwiebe and Riederer, 2013). Finally, a study that measured tracking via a comprehensive index to determine the joint effect of age, number and duration of between-school tracking revealed disadvantages in test scores for immigrant students in secondary school and other educational outcomes (Van de Werfhorst et al., 2014).
The few cross-national studies on achievement inequalities that systematically account for different tracking forms are mainly interested in socio-economic achievement disparities. These studies show that socio-economic inequalities in achievement are more pronounced at the track level in countries that use BS-tracking, whereas in WS-tracking countries, they occur on the individual or family level (Chmielewski, 2014; Schmidt et al., 2015). Only a few studies on immigrant achievement inequalities have dealt with different tracking forms. These studies show that tracking between or within classes diminishes immigrant students’ achievements when applied to all classes, whereas it enhances immigrant students’ performance when only used in some subjects (Cobb-Clark et al., 2012). The non-immigrant–immigrant gap, however, varies considerably for different migrant groups depending on generational status, age of arrival and language proficiency, and between the three PISA domains. Indirect measures of tracking within schools based on country dummy variables using the number of tracks, the age of tracking and the presence of internal tracking forms have revealed negative effects of highly tracked systems, but they have also shown positive effects of moderately stratified systems on immigrants’ achievement levels compared to comprehensive systems (Dronkers et al., 2014).
Finally, some studies have additionally considered school-level predictors, such as immigrant and socio-economic segregation and its relation to tracking. One study solely focusing on immigrant students revealed that immigrant students’ performance is more positively affected by being in a school with a higher average socio-economic status in highly tracked systems than in a moderately stratified or comprehensive education system (Dronkers et al., 2012). Similarly, the combination of between-school tracking with a high immigrant concentration widens the immigrant performance gap in reading whereas the opposite is the case for tracking between or within classes (Teltemann and Schunck, 2016).
The present article contributes to the literature on tracking effects on immigrant achievement disparities in two ways. First, as is apparent from previous cross-national research on immigrant achievement inequalities, there seems to be no unambiguous evidence on the influence of tracking occurring between versus within schools. This may be partly due to the diverging understandings and operationalizations of the concept under study. Additionally, most studies investigate only one single aspect of tracking, and these indicators often refer to tracking between schools. However, as previous findings on social disparities reveal, tracking within schools (i.e. between or within classes) also contributes to educational inequalities. Hence, it is necessary to include all aspects of tracking at once in order to fully capture the whole spectrum of tracking effects.
Second, as it is likely that tracking effects are mediated by the school environment, the inclusion of school characteristics is crucial in order to detect the genuine mechanisms of tracking (Dronkers et al., 2012). This article thus extends the work of Dronkers et al. (2012, 2014) by examining the relationship between tracking and school-level characteristics on achievement inequalities between non-immigrant and second-generation immigrant students.² It also follows up on Teltemann and Schunck (2016), who distinguish between tracking forms but use socio-economic segregation as a proxy for between-school tracking. Yet, as segregation may be prevalent in both tracking types, this article uses an approach that disentangles tracking and segregation and is thus able to explicitly test whether segregation is responsible for achievement inequalities as a consequence of between-school tracking.
Theoretical framework: Explaining immigrant achievement inequalities from a cross-national perspective
As laid out in the previous section, two institutional characteristics of tracking seem to be relevant for immigrant achievement differences. The first characteristic that determines the tracking degree is the time point at which students are selected into different ability levels. The second characteristic relates to the rigidity of tracking and refers to the extent to which tracking takes place between or within schools. Here, I make use of a characterization of tracking forms by Chmielewski et al. (2013), where tracking occurs between schools (i.e. BS-tracking, the most rigid form), within schools (i.e. WS-tracking), or between or within classes (i.e. CBC-tracking, the most flexible form of tracking). Even though these distinct tracking types exist, the tracking degree should be understood as a continuum, where different tracking types and characteristics can also coexist in one country. The degree of tracking is higher the more rigidly tracking is exercised (BS-tracking vs. CBC-tracking, with WS-tracking in between) and the earlier it takes place. Based on this definition, I argue in the following section that it is the combined effect of early and rigid tracking that leads to immigrant disadvantages in achievement.
General tracking degree effects on immigrant achievement inequalities
Starting with the first factor, early tracking is associated with immigrant achievement inequalities due to an initial ability disadvantage, which stems from ‘primary ethnic effects’ (Kristen et al., 2011: 124). Lower host-language proficiency (Esser, 2006; Müller and Stanat, 2006), lower parental knowledge of the host education systems (Bauer and Riphahn, 2013: 113) or a lack of transferability of parental human capital (Verwiebe and Riederer, 2013) may result in fewer opportunities for immigrant parents to support their children, leading to more initial achievement disadvantages among second-generation immigrant students (Cebolla-Boado, 2011; Heath and Brinbaum, 2007: 298). Early tracking may also be problematic for immigrant children due to ‘secondary ethnic effects’ (Kristen et al., 2011: 125). Immigrant students more often lack parental strategic knowledge, which is required for parents to effectively navigate their children through the host education systems (Dronkers et al., 2012: 14; Kao and Tienda, 1998; Kristen, 2008). As a result of primary and secondary ethnic effects, immigrant children are more often sorted into lower school tracks in countries with earlier tracking (Contini and Azzolini, 2016; Schofield, 2010; Van de Werfhorst et al., 2014).
In BS-tracking, where sorting leads to a differentiation between school types, this information bias may be more detrimental, because an early track choice requires more parental guidance (Pfeffer, 2008: 12), decision options may be more complex (Tjaden and Hunkler, 2017: 2) and decisions may require more information due to their long-term impact. Yet, choosing the right ability course in CBC-tracking contexts may be just as difficult due to the more informal character of these contexts (Chmielewski, 2014: 296; Werum et al., 2011: 388). Knowledge disadvantages may, however, be more detrimental in BS-tracking and to some extent in WS-tracking contexts, especially when combined with early selection, because parental interference in track decisions is much more consequential (Bauer and Riphahn, 2013: 111).
In contrast to this more deficit-oriented perspective, there are also positive secondary effects, where higher educational aspirations among immigrant families are assumed to be responsible for more ambitious educational choices conditional on prior performance and social origin (Brinbaum and Cebolla-Boado, 2007; Jackson et al., 2012; Jonsson and Rudolphi, 2011). The positive immigrant transition patterns into academic tracks are, however, more pronounced in systems where tracking occurs at upper secondary level compared to systems where tracking occurs earlier (Baysu et al., 2018; Lessard-Phillips et al., 2014b; Van de Werfhorst et al., 2014). Thus, immigrant children may be less able to translate their high aspirations into favourable transitions in BS-tracking systems because they are less able to anticipate institutional regulations (Becker, 2010) or because transitions are more dependent on prior performance (Heath and Brinbaum, 2007). Selection seems to be less detrimental in CBC-tracking contexts due to later ability track placement (Chmielewski, 2014: 295), giving second-generation immigrant students more time to catch up and improve performance before choosing a suitable educational route (Entorf and Lauk, 2008: 634).
Compared to early selection in BS-tracking contexts, there is less systematic information on the time point of selection in CBC- or WS-tracking contexts. In the United States, ability grouping is a practice that already occurs in elementary school (LeTendre et al., 2003), whereas in less-tracked European education systems there is more variation in the prevalence of any kind of differentiation in lower secondary education (Traini et al., 2021). However, it may not be the time point per se which is detrimental; rather, the combination of early tracking and the consequences of track placement seems to be decisive in generating achievement inequalities. This leads to the second factor in determining the tracking degree, which is related to the rigidity of track placement. Once students are placed in a specific track, a lower permeability prevents students’ between-track mobility in more-tracked systems (Hallinan, 1996; Kerckhoff, 2000; Lessard-Phillips et al., 2014a; LeTendre et al., 2003; Schofield, 2010: 1498). Together with early tracking, this leads to a situation where students’ initial disadvantages are perpetuated because they have fewer opportunities to escape their initial (lower) track placement (Crul and Vermeulen, 2003: 979; Van de Werfhorst and Heath, 2019: 352). Hence, a lack of track mobility may further strengthen the initial achievement gap between second-generation and non-immigrant students.
Given the higher institutional borders implicit in the separation into school types, changing tracks is assumed to be most difficult in BS-tracking contexts (see also Buchmann and Park, 2009: 247; Felouzis and Charmillot, 2013: 191; Pfeffer, 2015: 353). In WS-tracking contexts, track mobility may be less difficult as it does not involve changing school types. The most flexible form is represented by CBC-tracking, where track mobility is fluid especially at earlier educational stages. However, since adjusted track placement is based on prior performance and the curricular content the student has covered, transitioning into another course level can also become more difficult at later educational stages (Chmielewski, 2014: 296; LeTendre et al., 2003).
To conclude, in light of the obstacles presented by early tracking and low track mobility, we can assume that the immigrant achievement gap is mainly driven by the lower track placement of second-generation immigrant students in countries where earlier tracking occurs and more rigid sorting takes place. I thus expect that
As previous literature has shown, immigrant achievement inequalities may not only be influenced by tracking per se, but might additionally be a consequence of the between-school variations of school-level factors along with the country-level tracking degree. The following section will focus on those school-level characteristics deemed to constitute a relevant mechanism of tracking in shaping immigrant achievement inequalities.
The interplay of the tracking degree and school characteristics
The extent to which tracking is exercised at the system level shapes how students are allocated to schools (Bol et al., 2014). For example, BS-tracking groups students with similar ability levels and socio-economic characteristics, creating more homogenous learning environments compared to WS- or CBC-tracking (Baumert et al., 2006). As a consequence, scholars argue that the impact of the school characteristics in terms of peer interactions, distributed resources and instructional differences depends on the country-level tracking degree (Bol et al., 2014; Dronkers et al., 2012). Following this argumentation, I expect the main driving factors behind immigrant achievement inequalities in countries with a higher tracking degree to be school-level differences in immigrant concentration, teacher qualification and instructional quality as well as the distribution of resources between schools.
Immigrant concentration
As argued, in contexts with more extensive tracking, track allocation is often associated with immigrant background, and more immigrant children are found in lower BS-track schools. Researchers therefore assume that schools in countries with early BS-tracking are more socially and ethnically homogenous (Baumert et al., 2006; Maaz et al., 2008: 102). This institutional characteristic of early sorting is thus often accompanied by an additional enhancement of segregation tendencies in BS-tracking systems (Dronkers et al., 2012; Entorf and Lauk, 2008: 635; Maaz et al., 2008: 102). Since students from different tracks attend the same school in CBC- and WS-tracking contexts, and are grouped only temporarily in CBC-tracking, these systems can be expected to be less segregated (Chmielewski, 2014). Hence, exposure to peers from different immigrant and social backgrounds and from higher tracks should be especially high in countries using CBC-tracking (and to some extent WS-tracking), leading to more favourable peer interactions. Particularly disadvantaged students can profit from mixed learning environments due to the assistance and encouragement of peers with higher ability levels, who in turn profit as well by ‘giving the elaborated help’ (Wilkinson, 2002: 441; Zimmer and Toma, 2000). Even if homogeneous ability and socio-economic levels simply occur on the stream or course level (Trautwein et al., 2006: 789), the less institutionalized and more fluid character of ability grouping should inhibit the negative long-term effects of low track affiliation (Hallinan, 2000; Slavin, 1987). This argument is supported by empirical findings of stronger (negative) segregation effects for immigrant students’ achievement in BS-tracking systems (Baysu and de Valk, 2012; Dronkers et al., 2012; Entorf and Lauk, 2008; Park and Kyei, 2010).
Regarding differential effects for immigrant and non-immigrant children, immigrant students’ performance may depend more on the school environment due to immigrant families’ lower resource endowments (Ewijk and Sleegers, 2010: 257). In this regard, second-generation immigrant students can be particularly negatively affected by a higher immigrant concentration because they get fewer opportunities to practise the host language (Esser, 2006) and gain less from host-specific capital since they may have fewer friendships with the majority population (Kalter, 2006). If immigrant concentration is accompanied by socio-economic segregation and selection into lower ability tracks, students experience a less mixed learning environment and less exposure to higher ability peers. This additionally reduces learning opportunities for second-generation students (Thrupp et al., 2002; Zimmer and Toma, 2000).
Hence, due to the more pronounced low track assignment of second-generation immigrant students in countries with a higher tracking degree, I expect them to be more exposed to (and thus more negatively affected by) a higher immigrant concentration, whereas in CBC-tracking and to a lesser extent in WS-tracking contexts they should benefit from more positive peer interactions with students from other tracks and backgrounds. I thus expect that
Teacher qualification and instructional quality
The quality of teaching and instruction is another crucial school-level determinant, and its uneven distribution across schools likely varies with the tracking degree. In more-tracked systems, inequalities are magnified through the uneven allocation of teacher quality, teacher training and curricular demands between school tracks (Maaz et al., 2008). Research has found that higher ability classes or schools attract more qualified teachers (Brunello and Checchi, 2007; Hallinan, 1994; Hattie, 2002; Oakes, 1985), increasing achievement levels in higher tracks due to the allocation of better trained teachers with higher pedagogical skills (Baumert et al., 2010). Similarly, the instructional quality varies between track levels, with a more demanding curriculum and more instructional time in higher tracks (Carbonaro and Gamoran, 2002; Dreeben and Barr, 1988; Hallinan, 2000; Hattie, 2002; Oakes et al., 1990). In low-ability tracks, teachers may lower their expectations regarding students’ school success (Van Houtte, 2004) and adapt teaching practices accordingly (Hallam and Ireson, 2005).
Research has not, to date, paid much attention to the varying effects of teacher qualification and instruction dependent on the country-level tracking degree, but we can expect unfavourable learning conditions to be especially severe in BS-tracking countries, as the institutional differentiation results in cumulative disadvantage for students in the lower tracks (Baumert et al., 2006). Moreover, tracking makes students’ status and academic achievements more salient (Legette, 2020) and this awareness seems to be more prevalent in BS-tracking contexts (Dupriez and Dumay, 2006). As a result, teachers’ expectations towards students may be more closely linked to the attended track level when tracking is exercised in a more widespread manner (Domina et al., 2017). Research has also found teacher expectations to be lower in more segregated schools (Agirdag et al., 2013), which might be more prevalent in BS-tracking due to greater segregation. If teacher expectations are the basis for instructional adaptation processes, it can be assumed that less demanding instruction is more common in lower tracks the more extensive tracking is. Similarly, the unequal attraction of qualified teachers across tracks is assumed to be more detrimental in BS-tracking contexts, since the allocation of teachers according to their qualification takes place between school types (Baumert et al., 2010; Brunello and Checchi, 2007).
On the one hand, second-generation immigrant students may need more qualified and targeted teaching due to their more disadvantaged starting position, especially with regard to their host country language proficiency (European Commission, 2015). On the other hand, low demands and more criticism from teachers may affect immigrant students’ learning motivation more strongly when these expectations are additionally stereotyped (De Paola and Brunello, 2016: 22).
As a result of a more pronounced selection into lower ability schools in more-tracked systems, second-generation immigrant students should be more affected by less qualified teaching staff and lower instructional quality (Dronkers et al., 2014). I thus expect that
School resources
A last line of argumentation states that school resources, such as textbooks, computers or other instructional materials and facilities, are distributed unequally across schools, meaning that fewer resources are available in socio-economically disadvantaged schools and in schools with a high immigrant concentration (Alba et al., 2011; Dronkers and Levels, 2007: 441; Oakes et al., 1990; Teltemann and Schunck, 2016). At the same time, resources may also be unequally distributed between tracks, with more school resources allocated to academic school tracks in BS-tracking countries (Brunello and Checchi, 2007), which might explain the stronger resource effects on inequalities in those countries (Betts, 2011: 378). Empirically, school resource effects have been explored in relation to school composition, and the evidence is mixed. Whereas Dronkers and Levels (2007) do not find that school resources lower the effect of segregation on mathematics achievement among immigrant students, Teltemann and Schunck (2016) show that unequal resource distribution between schools combined with high segregation lowers reading performance among second-generation immigrant students.
Since there is no direct evidence on the relationship between resource allocation and the country-level tracking degree, I test this assumption by following the argument that second-generation immigrant students are more often sorted into lower tracks in more-tracked countries and may hence additionally suffer due to the lack of schooling resources in such disadvantaged environments. The shortage of such resources in schools may be especially harmful for immigrant students, who are more likely to lack the specific resources from home to compensate for such deficiencies (Verwiebe and Riederer, 2013: 204). I thus expect that
Data
For the current analyses, I use the representative, international large-scale dataset of PISA. Since 2000, the triennial survey has measured competencies in reading, mathematics, science and problem solving among 15-year-old students at the end of compulsory schooling, measuring the extent to which young people are skilled and able to fully participate in society. This article makes use of the PISA 2018 wave, which focused on reading literacy. Standardized variables allow for a direct comparison across a wide number of OECD and partner countries. In addition to information on students’ skills, demographics and attitudes, the dataset provides a wide range of information on the school environment, which makes it especially interesting for the present research question. Due to the data’s cross-sectional nature, which does not allow us to make causal claims, the results must be interpreted as correlations (Dronkers et al., 2014).
Operationalization
Dependent and individual-level variables
The dependent variable is reading literacy, which is defined as being competent in ‘understanding, using, evaluating, reflecting on and engaging with texts’ (OECD, 2019: 15). Since these skills have become more important in times of content digitalization, immigrant students’ reading disadvantage has severe consequences for their active and autonomous participation in society (OECD, 2019). The central independent variable, students’ immigrant background, was coded as a dummy variable indicating whether a student is non-immigrant (student and both parents born in the respective survey country) or second-generation immigrant (student born in the survey country, but both parents born abroad).³ First-generation immigrant students were excluded from the sample.
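The coding rule for immigrant background described above can be illustrated with a minimal Python sketch. The function name and its boolean arguments are hypothetical and do not correspond to PISA variable names; returning `None` for students with one native-born and one foreign-born parent is an assumption, since the text above does not state how such cases are handled.

```python
from typing import Optional


def code_immigrant_background(student_native: bool,
                              mother_native: bool,
                              father_native: bool) -> Optional[int]:
    """Return 0 (non-immigrant), 1 (second generation) or None (not in sample).

    Each argument is True if the person was born in the survey country.
    """
    if student_native and mother_native and father_native:
        return 0  # non-immigrant: student and both parents born in the country
    if student_native and not mother_native and not father_native:
        return 1  # second generation: student native-born, both parents born abroad
    # First-generation students (born abroad) are excluded from the sample.
    # Mixed-parentage cases are also dropped here, which is an assumption.
    return None


# Usage: keep only students that fall into one of the two groups.
students = [(True, True, True), (True, False, False), (False, False, False)]
dummies = [code_immigrant_background(*s) for s in students]
sample = [d for d in dummies if d is not None]
```

In this example, `sample` contains one non-immigrant and one second-generation student; the foreign-born student is dropped.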
Further individual-level controls were created to account for students’ background characteristics. The modal grade in country gives a rough indication of students’ current educational situation. It measures the distance of the actual grade to the grade that a student should usually be in, depending on the country context. Foreign language at home is a proxy for language skills and measures whether the language of the survey country or another language is spoken at home. Finally, I control for gender and for students’ social background by using the PISA index of economic, social and cultural status (ESCS).
School-level variables
On the school level, I measure the immigrant concentration as the percentage of immigrant pupils (first and second generation) in a school. To control for correlations with the student body’s social composition (Stanat, 2006), I also consider the mean ESCS on the school level. Furthermore, I assess teacher qualification via the school proportion of teachers who are equipped with ISCED 5a qualifications (i.e. a university degree). Instructional quality is based on a mean index of eight student-reported items on whether teachers provide students with additional reading tasks helping them to relate and evaluate the content they have studied. The indicator is calculated as a school mean and captures whether students receive an encouraging instruction in which they are, e.g. challenged to apply their acquired knowledge. For a direct interpretation of the indicators according to the hypotheses, the scales of teacher qualification and instructional quality are reversed. Hence, for example, higher values on the instructional quality index represent lower curricular demands and vice versa. To measure school resources, I use the generated index on the shortage in educational material resources to determine the extent to which schools lack important resources necessary to provide a quality education. The index is derived from four items on how school principals perceive potential factors that hinder instruction at their schools (e.g. shortages of instructional material, computers or library materials). For a better comparison of the indicators, I standardized all school-level variables 4 (see Tables 1 and 2 for a descriptive overview of variables).
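As an illustration only, the construction of the school-level indicators described above (aggregating to school means, reversing a scale and standardizing) could be sketched as follows. The records, the variable names and the choice of the population standard deviation are assumptions made for this example; they are not taken from the PISA files or from the original analysis.

```python
from collections import defaultdict
from statistics import mean, pstdev

def school_means(records, key):
    """Average a student-level variable within each school."""
    by_school = defaultdict(list)
    for r in records:
        by_school[r["school"]].append(r[key])
    return {s: mean(v) for s, v in by_school.items()}

def reverse_and_standardize(values_by_school, reverse=False):
    """Optionally flip the scale, then z-standardize across schools."""
    sign = -1.0 if reverse else 1.0
    flipped = {s: sign * v for s, v in values_by_school.items()}
    mu = mean(flipped.values())
    sd = pstdev(flipped.values())  # population SD; an assumption of this sketch
    return {s: (v - mu) / sd for s, v in flipped.items()}

# Toy student records (hypothetical values, not PISA data).
students = [
    {"school": "A", "immigrant": 1, "instruction": 3.0},
    {"school": "A", "immigrant": 0, "instruction": 4.0},
    {"school": "B", "immigrant": 0, "instruction": 2.0},
    {"school": "B", "immigrant": 0, "instruction": 1.0},
    {"school": "C", "immigrant": 1, "instruction": 2.5},
    {"school": "C", "immigrant": 1, "instruction": 2.5},
]

# Immigrant concentration: share of immigrant students per school, in percent.
concentration = {s: 100 * v for s, v in school_means(students, "immigrant").items()}

# Reversed instructional quality: higher values mean lower curricular demands.
low_instruction = reverse_and_standardize(
    school_means(students, "instruction"), reverse=True
)
```

After the reversal, the school with the lowest mean instruction score (school B in the toy data) receives the highest standardized value, which matches the reading of the reversed scales given above.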
Table 1. Descriptive statistics (N=124,712).
Source: PISA 2018. Note: Mean/Prop. adjusted for senate weights.
Table 2. Weighted mean statistics by country (N=124,712).
Source: PISA 2018. Note: Adjusted for senate weights.
Country selection and the measurement of the tracking degree
To achieve a comparable design of the similar education systems of Western societies, only European and OECD countries were selected (Borgna and Contini, 2014). Moreover, to avoid biases due to few cases on the school level, I dropped schools that consisted of fewer than five students from the final sample (cf. Dronkers et al., 2012: 19). After the list-wise deletion of cases with missing values, I only considered countries with at least 50 second-generation or non-immigrant pupils. After applying these considerations, the final sample consists of 124,712 individuals in 6,147 schools in 28 countries.
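The sample restrictions can be read as a sequence of filters. The sketch below is one possible reading, not the original procedure: the order of the steps, the field names and the interpretation that each of the two groups must reach the minimum size within a country are assumptions made for illustration.

```python
from collections import Counter

def restrict_sample(rows, min_school=5, min_group=50):
    """Apply list-wise deletion, then school-size and country-size filters."""
    # 1. List-wise deletion: drop any row with a missing value.
    complete = [r for r in rows if all(v is not None for v in r.values())]
    # 2. Drop schools with fewer than `min_school` remaining students.
    school_n = Counter((r["country"], r["school"]) for r in complete)
    kept = [r for r in complete
            if school_n[(r["country"], r["school"])] >= min_school]
    # 3. Keep countries in which both groups reach `min_group` students
    #    (assumed interpretation of the country-level threshold).
    group_n = Counter((r["country"], r["second_gen"]) for r in kept)
    ok = {c for c in {r["country"] for r in kept}
          if group_n[(c, 0)] >= min_group and group_n[(c, 1)] >= min_group}
    return [r for r in kept if r["country"] in ok]

# Toy rows (hypothetical); small thresholds are used so the filters are visible.
rows = [
    {"country": "X", "school": "s1", "second_gen": 0, "read": 500},
    {"country": "X", "school": "s1", "second_gen": 1, "read": 480},
    {"country": "X", "school": "s2", "second_gen": 1, "read": None},
    {"country": "X", "school": "s2", "second_gen": 0, "read": 510},
    {"country": "Y", "school": "s3", "second_gen": 0, "read": 495},
    {"country": "Y", "school": "s3", "second_gen": 0, "read": 505},
]
result = restrict_sample(rows, min_school=2, min_group=1)
```

In the toy data, school s2 falls below the school-size threshold once its incomplete row is deleted, and country Y is dropped because it contains no second-generation students.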
In order to measure the tracking degree for each country, an encompassing index was built based on four indicators similar to Bol and Van de Werfhorst (2013). The first two indicators measure the number of distinct educational programmes that students are taught at the age of 15 and the age at which the selection into these programmes took place. Since distinct educational programmes are either taught within one school or between schools (OECD, 2020: 70–85), I calculated a third indicator, displaying the country average number of distinct educational programmes that are offered within schools (see Table 3). This indicator is built on a PISA variable which identifies the educational programme in which students are enrolled. To give an example from Table 3, in Italy there are four distinct educational programmes available, but the average number of tracks taught within schools is 1, which means that each programme is taught in a different school; this is the definition of BS-tracking. By contrast, the Netherlands provides four tracks as well, but the average number of programmes taught within schools is 3.13, meaning that students are overwhelmingly streamed within schools. The fourth indicator represents the share of students situated in schools that practise ability grouping within or between classes, either in some or in all classes. This indicator displays how strongly CBC-tracking occurs and helps to further differentiate the degree of tracking, especially in countries that provide only one educational programme. The index was created using a principal component analysis, with an eigenvalue of 1.95 for the first component.
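To make the index construction more concrete, the following minimal sketch extracts the first principal component from four standardized indicators. It is an illustration under assumptions: the indicator values are invented, all four columns are assumed to be oriented so that higher values mean more tracking, and the first component is found by simple power iteration on the correlation matrix rather than by the routine used in the original analysis.

```python
from statistics import mean, pstdev

def zscore(col):
    """Standardize one indicator column (population SD)."""
    mu, sd = mean(col), pstdev(col)
    return [(x - mu) / sd for x in col]

def first_component(columns, iters=500):
    """Return (eigenvalue, loadings, scores) of the first principal component.

    `columns` is a list of equally long indicator columns. The PCA runs on the
    correlation matrix; the leading eigenvector is found by power iteration,
    which assumes the start vector is not orthogonal to it.
    """
    z = [zscore(c) for c in columns]
    n, k = len(z[0]), len(z)
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(k)]
            for i in range(k)]
    v = [1.0] * k
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v'Cv gives the eigenvalue of the converged vector.
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(k))
                     for i in range(k))
    # Country scores on the component: the index values.
    scores = [sum(v[i] * z[i][r] for i in range(k)) for r in range(n)]
    return eigenvalue, v, scores

# Four hypothetical indicators for six hypothetical countries.
indicator_1 = [4, 4, 1, 1, 3, 2]
indicator_2 = [6, 4, 0, 0, 5, 2]
indicator_3 = [3, 2, 0, 0, 3, 1]
indicator_4 = [0.6, 0.5, 0.2, 0.1, 0.4, 0.3]

eigenvalue, loadings, scores = first_component(
    [indicator_1, indicator_2, indicator_3, indicator_4]
)
```

The scores have mean zero by construction, which is why an index built this way can take both positive and negative values across countries.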
Table 3. Tracking index by country.
Sources: a: PISA 2018 Database, Table B3.3.3., b: PISA 2018, own calculations.
Note: Number of educational programmes on school level is set to 1 in USA, NZL and CAN, since they refer to educational or grade level and are therefore not distinct educational programmes.
From Table 3 we can see that the tracking degree varies from 2.63 (Luxembourg) to −1.69 (New Zealand). A higher tracking index represents earlier tracking, a higher number of tracks and a higher prevalence of between-school tracking. Countries at the lower end of the tracking index show a later tracking age but still use some kind of ability grouping within schools. This supports the recent observation that the majority of countries use some kind of tracking (Chmielewski, 2014), either between schools, within schools or between classes. Countries that are usually referred to as comprehensive systems, such as the Anglo-Saxon or Scandinavian countries, show a high prevalence of ability grouping within or between classes (e.g. 95 percent of students are grouped in the UK).
Analytical approach
The PISA data come with a two-stage stratified sampling design, in which first schools and then individual students within them are selected randomly. To account for this sampling structure (students are nested in schools), I conduct two-level linear random slope models and apply country-fixed effects by using country dummy variables. The country-fixed effects approach holds constant all other country-specific characteristics (such as educational spending or immigrant selectivity), so that differences in the coefficients between the country samples can be attributed to the tracking degree (Cobb-Clark et al., 2012; Möhring, 2012; Teltemann and Schunck, 2016). Because the models include cross-level interaction effects (see below), I apply random slope models and allow the effect of immigrant background to vary between schools (Heisig and Schaeffer, 2019).
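One way to write down a model of this kind is given below. The notation is introduced here for illustration and does not appear in the original text: i indexes students, s schools and c countries; Imm is the immigrant background dummy, T_c the country-level tracking index, x and z the individual- and school-level controls, α_c the country dummies, and u the school-level random intercept and random slope.

```latex
\[
\begin{aligned}
y_{isc} ={}& \beta_0 + \beta_1\,\mathrm{Imm}_{isc}
           + \beta_2\,(\mathrm{Imm}_{isc} \times T_c)
           + \boldsymbol{\gamma}^{\top}\mathbf{x}_{isc}
           + \boldsymbol{\delta}^{\top}\mathbf{z}_{sc}
           + \alpha_c \\
         & + u_{0sc} + u_{1sc}\,\mathrm{Imm}_{isc} + \varepsilon_{isc}
\end{aligned}
\]
```

In this sketch the main effect of T_c does not appear as a separate term because it is absorbed by the country dummies α_c; only its interaction with immigrant background (β_2) is identified.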
The analyses are carried out separately for each of the 10 plausible values in reading competencies. An imputation routine with the Stata ado-file mim (Galati et al., 2013) reports mean estimates over the single analyses for each plausible value, which allows for consistent estimators of the population parameters, e.g. by correcting for biases in students’ proficiency estimates and standard errors (Borgna and Contini, 2014; Caro Vasquez and Biecek, 2016; OECD, 2014: 147). I use cluster-robust standard errors, which produce the same point estimates as the recommended PISA approach of using 80 replicate weights to account for the replacement strategy of schools within the sampling procedure (Lopez-Agudo et al., 2017; OECD, 2014: 139). Using cluster-robust standard errors instead has been shown to affect the standard error estimates only marginally (Lopez-Agudo et al., 2017). Additionally, normalized student weights (i.e. senate weights) are applied to account for different sampling procedures across countries (OECD, 2014: 396).
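The pooling of results across plausible values can be illustrated with the standard combination rules for multiply imputed data (often called Rubin's rules). The sketch below assumes that this is the kind of pooling applied; the function name and the input numbers are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean, variance

def combine_plausible_values(estimates, std_errors):
    """Pool one coefficient across plausible values using Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)                        # average point estimate
    within = mean([se ** 2 for se in std_errors])   # average sampling variance
    between = variance(estimates)                   # variance across PVs (n - 1)
    total = within + (1 + 1 / m) * between          # total variance
    return pooled, total ** 0.5

# Hypothetical coefficient estimates from five separate analyses.
pooled, pooled_se = combine_plausible_values(
    [-8.0, -8.2, -7.9, -8.1, -8.3], [1.0, 1.0, 1.0, 1.0, 1.0]
)
```

The pooled standard error exceeds the average single-analysis standard error whenever the estimates differ across plausible values, which reflects the measurement uncertainty in students' proficiency.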
After referring to the bivariate results, I will present results from multilevel analyses and examine the extent to which immigrant achievement inequalities are accounted for by individual- versus school-level variables. In a second step, I will test the hypotheses on the effect of school variables by immigrant background depending on the tracking degree and present three-way cross-level interactions, visualized with the help of margin plots. Cross-level interactions are possible in country-fixed effects models because the interest is not in the main effect of the country-level variable but in how the immigrant background effect varies with school- and country-level characteristics (Bol et al., 2014: 8).
Results
Bivariate results
Figure 1 displays the bivariate correlations of the country-level reading performance gap between second-generation and non-immigrant students with the degree of tracking. Positive values indicate that second-generation immigrant students have higher reading skills compared to non-immigrant students, whereas the opposite is the case for negative values on the reading performance gap. As assumed in Hypothesis 1, a higher tracking index is associated with second-generation immigrant students’ worse performance compared to non-immigrant students (p<0.10).
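A bivariate country-level relationship of this kind can be summarized with a Pearson correlation. The sketch below uses invented values for five hypothetical countries, not the 28 countries analysed here, and is meant only to show the computation.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical tracking index values and reading gaps
# (second-generation minus non-immigrant mean score).
tracking = [-1.5, -0.5, 0.0, 1.0, 2.5]
gap = [5.0, -10.0, -15.0, -30.0, -40.0]

r = pearson_r(tracking, gap)
```

A negative coefficient here corresponds to the pattern described above: the higher the tracking index, the lower second-generation students score relative to non-immigrant students.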

Figure 1. Bivariate correlation of the tracking index and the immigrant reading performance gap (N=28).
Figure 2 displays the correlation of country mean school variables with the degree of tracking. The figure shows that a higher tracking degree is correlated with a higher immigrant concentration, a lower share of qualified teachers and a slightly higher educational material shortage. In countries with a lower tracking degree, lower instructional demands are on average more common. As expected, we indeed see on a descriptive level that immigrant concentration in schools is more prevalent in countries with a higher tracking degree; however, this relationship is non-significant. Significant relationships with the tracking degree occur only for low teacher qualification and low instructional quality. These unconditional country correlations may give some first insights into the expected relationships, but may as well be confounded by other (unobserved) country characteristics.

Figure 2. Bivariate correlation of the tracking index and school-level indicators (N=28).
Results from two-level random slope models with country-fixed effects
To test the first hypothesis on the relationship between reading performance and the tracking degree, I estimate multilevel random slope models with country-fixed effects and introduce an interaction between immigrant background and the tracking degree (Table 4). In the first unconditional model M1, the interaction coefficient is negative and significant at −8.07, meaning that the reading score of second-generation immigrant students relative to their non-immigrant peers falls by around 8.1 points when the tracking index increases by one unit. Calculating the scores for the highest tracking score (2.7) reveals that the gap in reading scores between second-generation immigrant and non-immigrant students is around −41.7 points in those highly tracked countries. This is substantial, because in practical terms it means a deficit of more than one school year.
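The arithmetic behind such a predicted gap can be made explicit. In a model with an interaction term, the gap at tracking level T is the main effect of immigrant background plus the interaction coefficient times T. The notation below is introduced for illustration; only the interaction coefficient (−8.07) and the tracking score (2.7) are taken from the text.

```latex
\[
\widehat{\mathrm{gap}}(T) = \hat{\beta}_{\mathrm{Imm}}
  + \hat{\beta}_{\mathrm{Imm}\times T}\, T,
\qquad
\hat{\beta}_{\mathrm{Imm}\times T}\, T = -8.07 \times 2.7 \approx -21.8
\]
```

Under this reading, the interaction term contributes about −21.8 points at the highest tracking score, and the remainder of the reported gap of −41.7 points (roughly −19.9) would stem from the main effect of immigrant background. That main effect is not stated in the text here, so this decomposition is an inference; Table 4 remains the authoritative source.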
Table 4. PISA reading literacy in 28 countries. Two-level random slope models with country-fixed effects.
Source: PISA 2018. Note: Cross-level interaction with school variables (M4–M7) are separated models. Country-fixed effects are applied in each model. Standard errors in parentheses ***p<0.001, **p<0.01, *p<0.05, +p<0.10.
Model 2 accounts for students’ individual-level characteristics and the interaction coefficient is now at −4.2, roughly half the size of that in Model 1. This implies that second-generation immigrant students’ disadvantage in reading literacy is to some extent due to their individual endowments. After introducing the school-level variables, the interaction coefficient decreases slightly in magnitude to −3.6, which means that the allocation of students to schools with different characteristics only marginally accounts for achievement differences between second-generation and non-immigrant students. In this final model, reading scores for both groups in countries with the lowest tracking degree are now almost equal, whereas in countries with the highest tracking score, the disadvantage of second-generation immigrant students is still at −15.5 points. In sum, these models show that tracking is associated with performance disadvantages among second-generation immigrant students, lending support to Hypothesis 1.
Models 4–7 display the results for the models including the three-way cross-level interactions of immigrant background, tracking index and the respective school indicators. These models test whether the influence of school characteristics on immigrant reading inequalities varies with the degree of tracking in a country (H2 to H4). As shown in these models, all interaction effects are significant. Figures 3 to 6 visualize these interaction effects with the help of margin plots.
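What a margin plot of a three-way interaction displays can be sketched numerically: the predicted gap between the two groups is evaluated over a grid of tracking and school-indicator values. The coefficient names and values below are hypothetical and chosen only to illustrate the calculation; they are not estimates from Models 4–7.

```python
def immigrant_gap(coef, tracking, school):
    """Predicted second-generation minus non-immigrant gap at given values."""
    return (coef["imm"]
            + coef["imm_x_track"] * tracking
            + coef["imm_x_school"] * school
            + coef["imm_x_track_x_school"] * tracking * school)

# Hypothetical coefficients, chosen only to illustrate the arithmetic.
coef = {
    "imm": -5.0,
    "imm_x_track": -3.0,
    "imm_x_school": 2.0,
    "imm_x_track_x_school": -4.0,
}

# Gap at low, average and high tracking for a low and a high school indicator
# (both variables assumed to be standardized).
grid = {(t, s): immigrant_gap(coef, t, s)
        for t in (-1.0, 0.0, 1.0)
        for s in (-1.0, 1.0)}
```

With a negative three-way coefficient, as in this toy example, a high value of the school indicator widens the gap in more-tracked systems and narrows it in less-tracked ones.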

Figure 3. Margin plot of three-way cross-level interaction between the tracking index, immigrant background and immigrant concentration.

Figure 4. Margin plot of three-way cross-level interaction between the tracking index, immigrant background and low instructional quality.

Figure 5. Margin plot of three-way cross-level interaction between the tracking index, immigrant background and low teacher qualification.

Figure 6. Margin plot of three-way cross-level interaction between the tracking index, immigrant background and educational material shortage.
With regard to immigrant concentration, Figure 3 shows that in less-tracked systems, the higher the concentration of immigrants in schools, the more second-generation immigrant students outperform non-immigrant students. The higher the tracking degree, the more negative the effect of immigrant concentration becomes, with lower reading scores for second-generation immigrant students in countries with a higher tracking degree. This is in line with the assumption that segregation leads to greater performance disadvantages the higher the degree of tracking (H2), net of the socio-economic school composition.
Turning to the instructional quality, Figure 4 shows that for second-generation immigrant students, the effect of a less demanding curriculum on reading performance is slightly more negative in countries with more tracking. The variation of this effect across the tracking index is, however, very small. Instead, non-immigrant children seem to be more affected by a lower curriculum demand across tracking regimes, i.e. the higher the degree of tracking, the less harmful a less demanding curriculum is for their reading performance. The comparison of the two groups reveals that with a higher tracking degree, a less demanding curriculum indeed goes along with lower reading scores for second-generation compared to non-immigrant students. Contrary to the assumption, however, this effect is driven by non-immigrants’ higher sensitivity to lower curriculum demands. Therefore, the assumption that second-generation immigrant students are more negatively affected by a less demanding curriculum in more-tracked systems is not supported (H3a).
With regard to the differential effects of teacher qualification, we see in Figure 5 that in less-tracked systems, second-generation immigrants’ reading skills are more negatively affected by a lower share of qualified teachers in a school than the skills of their non-immigrant school peers. The higher the tracking degree, the less negative the effect of a lower share of qualified teachers becomes for both groups and the more the reading performances of second-generation and non-immigrant students align. Hence, the results reveal that a lower teacher qualification is more harmful for students’ reading performances in countries with less tracking, particularly for second-generation immigrant students. This clearly speaks against Hypothesis 3b.
The last assumption was that second-generation immigrant students suffer more from a shortage of educational resources. Figure 6 shows that second-generation immigrant students’ reading performance increases with a higher shortage of educational resources when countries are more tracked. However, the confidence intervals overlap substantially, so this pattern is not statistically supported. Further, results on separate two-way interactions (not shown here) reveal that the effect of resource shortage does not vary with the tracking degree, nor does the (almost non-existent) shortage effect vary significantly between non-immigrant and second-generation immigrant students. Therefore, Hypothesis 4 must be rejected as well.
Taken together, the analyses show that, in most cases, the influence of school characteristics on immigrant reading achievement inequalities is indeed dependent on the tracking degree. The implications of these findings will be discussed in the following section.
Discussion
The present article has examined the influence of the educational tracking degree on reading achievement inequalities between second-generation and non-immigrant students in Western secondary education systems. The aim was to address the shortcomings and contradictions in the previous literature on tracking effects, which mostly focuses on one single aspect of tracking and thereby overlooks the more complex nature of sorting students into different ability levels. One contribution of this study therefore lies in extending the common understanding of tracking, moving from tracking between schools to internal forms of streaming and grouping as another source of inequality. By following up on the previous findings on organizational differences between different tracking forms (Chmielewski, 2014; Chmielewski et al., 2013; Maaz et al., 2008), I developed theoretical arguments on the possible mechanisms on the individual and school level that help to scrutinize in what way tracking influences immigrant reading performance inequalities. To answer this question, I conducted multilevel country-fixed effects analyses in 28 European and OECD countries with data from PISA 2018.
First, the findings reveal that a higher tracking degree is associated with substantial reading performance disadvantages for second-generation compared to non-immigrant students. The study therefore supports findings of immigrant performance disadvantages in other cross-comparative studies using different measurements of tracking (Cobb-Clark et al., 2012; Dronkers et al., 2012; Teltemann and Schunck, 2016; Van de Werfhorst et al., 2014). The results, however, also show that a substantial part of the reading performance gap is explained by individual-level characteristics, which in light of previous studies may not be an unexpected finding. Countries differ in the profiles of their immigrant children, for example in immigrant students’ national origins, socio-economic background and language proficiency (Volante et al., 2018), and these diverse profiles seem to be partly responsible for cross-national variations in achievement inequalities. The finding that school characteristics account only to a minor extent for the reading performance gap may indicate that the sorting into schools with a varying level of resources per se (i.e. segregated schools, instructional quality, teaching qualification and the availability of educational materials) is not responsible for second-generation immigrant students’ disadvantages. However, the key assumptions were that performance disadvantages are due to differential school effects stemming from second-generation immigrant students’ sorting into lower ability tracks in countries with a higher degree of tracking.
Therefore, the main set of analyses elaborated on how school characteristics influence immigrant achievement differences depending on the tracking degree. Evidence was found for the relationship between a higher immigrant concentration and lower reading competencies of second-generation immigrant students when the tracking degree is higher, supporting findings from earlier studies on more detrimental segregation effects in countries with BS-tracking (Baysu and de Valk, 2012; Dronkers et al., 2012; Entorf and Lauk, 2008; Park and Kyei, 2010). Hence, immigrant students in particular seem to be hindered in such contexts, either due to a lack of access to host-specific capital from majority peer interactions, or due to insufficient opportunities to practise the host language, which are essential for reading skills (Esser, 2006; Kalter, 2006).
Moreover, a greater shortage of school resources is not related to second-generation immigrant students’ disadvantage in more-tracked systems. One explanation may be that only a small set of OECD countries was investigated. Those countries may be better equipped with financial resources, and unequal material distributions between schools may simply be less prevalent than among all countries participating in PISA (OECD, 2020: 125).
Finally, no evidence was found that second-generation immigrant students’ reading performance is reduced due to a lower share of qualified teachers or lower instructional demands, when more tracking is exercised. Hence, there is no evidence that in countries with a higher tracking degree, the attraction of less qualified teachers or the provision of a less demanding curriculum in lower tracks reduces school performances (Baumert et al., 2010; Brunello and Checchi, 2007), at least when it comes to the influence on immigrant children’s reading literacy. These findings imply that even though ample research has shown that instructional quality varies between high and low tracks (e.g. Carbonaro and Gamoran, 2002; Hallam and Ireson, 2005; Hattie, 2002), this effect was found to be stronger in countries with a lower tracking degree. Yet, teaching resources are still one of the most important sources of students’ educational success today (OECD, 2018). Curriculum demands and teacher qualifications may constitute only part of the teacher–student interactions within the classroom. Future research on tracking degree effects might focus on the interlinkage between school contexts and attitudinal dimensions, which are likely to underlie more hidden teaching practices. Important factors to study are the aforementioned teacher expectations (Kelly and Carbonaro, 2012; Van Houtte et al., 2013; Van Houtte and Demanet, 2016), but also the climate of emotional and instructional support (Donaldson et al., 2017) and teachers’ attitudes and perceptions towards immigrant students (Stevens and Vermeersch, 2010; Van den Bergh et al., 2010), which in some cases have been shown to vary between track levels.
In summary, the present findings suggest several things. First of all, it was shown that the quality of schools indeed affects second-generation and non-immigrant students’ school performance differently, depending on the degree of tracking. This is in line with evidence by Dronkers et al. (2012, 2014), who show that tracking effects are to some extent dependent on or are moderated by school characteristics, implying that the consequences of tracking may be reinforced or weakened by the school environment.
Still, a better understanding is needed of how the school level interacts with institutional characteristics. This study indicates that the extent to which more-tracked education systems produce less favourable learning conditions for second-generation immigrant students depends on the school characteristics considered. Immigrant concentration is the only school characteristic that supports the theoretical argument on generating disadvantages in immigrant children’s reading performance in higher-tracked countries. With regard to teaching, instructional and educational resource quality, the presented findings challenge the idea that more-tracked systems produce more unfavourable learning conditions for second-generation immigrant compared to non-immigrant students. In fact, this corroborates findings by Borgna and Contini (2014), who find that the negative effect of the sorting of immigrant students into low-quality schools appears in countries with low and high tracking, suggesting that the unfavourable learning conditions may unfold independent of the tracking degree. Thus, tracking may not lead to a school quality dispersion as theory would suggest and other institutional sorting characteristics may be more important. Research on socio-economic achievement inequalities already reveals that countries with more subtle forms of tracking show similar achievement disparities (Chmielewski, 2014; Traini et al., 2021). Furthermore, in countries with less tracking, such as the United States, access to good quality education more heavily depends on the school students attend, which in turn is unequally distributed geographically (Jerrim, 2014: 199). This again stresses the need for further research focusing on the school context in order to enhance our understanding of the mechanisms by which education systems provide or hinder the learning success of vulnerable students.
A last aspect to be considered when interpreting the findings of this study relates to the complexity and multi-layered notion of educational integration in general and its variation across nations, which may or may not be intertwined with the tracking structure of education systems. For example, some countries provide mother-tongue instruction or other culturally inclusive practices to improve immigrant children’s achievement. In contrast, fostering school autonomy and free parental school choice can lead to a further strengthening of sorting with more disadvantaged students in schools and thus may have negative side effects, particularly for immigrant students (Volante et al., 2018). Certainly, a successful integration can be promoted by educational policies specifically targeting minority students by taking into account their disadvantaged starting position and their specific needs (Klinger et al., 2018). Future research might therefore study how specific educational policies interact with the tracking degree in order to gain a better understanding of possible reinforcing or mitigating effects on immigrant achievement inequalities.
The present study faces several limitations that should be addressed in future research. First of all, due to the cross-sectional nature of the analysis, the effects must be cautiously interpreted in terms of causality. Thus, even when controlling for unobserved heterogeneity on the country level, there is still some uncertainty left when attempting to explain immigrant achievement inequalities across school systems. Comparative longitudinal study designs are a better way to directly measure the consequences of different sorting mechanisms and explicitly address early school career influences on the transition to secondary education. Second, the tracking degree was measured in a limited way as it relied on rough country-level indicators. With the data it was not possible to include individual track level because students’ school track level is only available for countries using BS- or WS-tracking, which did not allow me to consider track level for the whole set of countries. To enable an encompassing cross-national investigation of tracking effects, data on other distinct features, such as students’ ability group level, but also track mobility, are essential. Last but not least, the study was able to explain part of the immigrant achievement difference, but achievement disadvantages across countries still remain for second-generation immigrant students due to unexplained factors. This should be a motivation for future research to not lose sight of immigrant children’s educational disadvantages in our societies.
Footnotes
Acknowledgements
I would like to thank the anonymous reviewers and the editor for their helpful suggestions. I am also very grateful to Kathrin Leuze for her consistent support and valuable advice on this article.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
