Methodological lessons on measuring quality teaching in Southern contexts, with a focus on India and Pakistan

Abstract

Quantifying the impact of teaching quality on pupil learning, and understanding what teacher characteristics or practices are likely to improve student achievement, are pressing research questions in all countries. Empirical evidence also needs to be context-specific since different education systems are likely to have different facilitators and barriers to good teaching. Existing evidence, largely from the US, suggests a number of strong research designs that enable researchers to model the impact of teaching on pupil achievement. However, operationalising these models in more resource-constrained contexts is challenging. In this paper we describe our attempt to model the impact of teachers and their practices on pupil achievement using the quantitative data generated for this research (household and school surveys with a teacher survey and an attempt to assess teacher knowledge). We describe the challenges when trying to implement this approach in the Indian and Pakistani contexts and the methodological adaptations needed. We reflect on the strengths and weaknesses of our approach. We note that existing literature tends to provide relatively minimal descriptions of the specific research design and instruments used to model teacher quality and hence provides a partial picture of methodological considerations. In this paper we contribute a detailed and frank account of developing a workable research design and the challenges we encountered.

Keywords

teacher effectiveness; learning assessments; quantitative surveys; teacher knowledge; India; Pakistan

Introduction

As part of the Sustainable Development Goal agenda, governments across the world have committed to providing inclusive and equitable high-quality education for all children (United Nations, 2015). Yet, despite great progress in getting more children into school over the past decade, education quality remains a serious concern. In many countries, children – particularly those from the most disadvantaged backgrounds – are still experiencing poor quality education which limits their chances of fulfilling their learning potential (UNESCO, 2014; World Bank, 2018). Given this, there is an urgent need for robust research to understand why learning levels are so low and more specifically why learning is so unevenly distributed within countries. Existing literature points to teachers as the most crucial institutional input into a child’s educational experience (Hanushek and Woessmann, 2011; Nonoyama-Tarumi and Willms, 2014). Hence unequal access to good teaching is likely to be a key route by which inequalities in learning arise.

In order to identify the extent to which the quality of teaching is at the heart of improving learning equitably, and what features of teaching are most effective, we need to be sure of the robustness of the evidence base. Specifically, we need to consider the methodological concerns and context-specific challenges faced when measuring quality education in low and lower-middle income countries.

Existing literature on the topic of teacher quality in low and lower-middle income contexts (as well as more generally in the field of education) often provides cursory information on the instruments and methods used, thus providing a partial picture of the methodological considerations. In this paper we summarise previous research that has aimed to generate a picture of quality teaching through adopting quantitative methods.¹ We make a contribution to the literature by describing the conceptual, methodological and practical challenges of measuring teaching quality in these contexts.

In addition to the existing literature on the topic, this paper draws on experience from our Economic and Social Research Council/Department for International Development funded Teaching Effectively All Children (TEACh) project which focuses on the role of teaching quality in explaining low levels of learning in India and Pakistan.² Particular issues raised in previous literature with regard to teaching quality in India and Pakistan include: teachers who lack basic subject knowledge themselves; inadequate teacher training; insufficient focus on children from poor backgrounds; weak incentives and poor governance; low motivation; and high levels of teacher absenteeism (Bau and Das, 2017; Bennell and Akyeampong, 2007; Kingdon et al., 2014; Moon, 2013; UNESCO, 2014; Westbrook et al., 2013). Less well understood are the reasons for variation in teaching quality across India and Pakistan and the extent to which this variation explains some of the variation we see in pupil achievement. To fill this gap in the literature, our research seeks to measure teaching quality in two locations in India and Pakistan (the state of Haryana and province of Punjab respectively) and the extent to which marginalised groups of students access high quality teaching in these contexts.

We start by detailing how teacher effectiveness has been measured in the existing literature before describing the specific country context for the research and the challenges this poses. We then describe how we measure a) student learning as one measure of teacher effectiveness and b) teacher characteristics and behaviours as another. An important aspect of our approach is that we also attempt to capture teacher knowledge directly. We discuss the implications of our approach for research in this field and conclude.

Measuring teacher effectiveness

The past few decades have seen a burgeoning interest in identifying which factors – individual, family and school – are the most critical determinants for raising pupil attainment. As a result, a robust evidence base from developed countries has now emerged, indicating that teacher effectiveness is a critical determinant of student achievement. Hanushek (2011: 467) goes as far as to say that no other attribute of schools has as much influence on student achievement as teacher effectiveness. Estimates from the United States suggest that a difference of one standard deviation in teacher effectiveness can yield between 10 and 20% of a standard deviation change in pupil achievement (Hanushek, 2011; Hanushek and Woessmann, 2011). Whether these estimates apply in low and lower-middle income country contexts is, of course, an empirical question. In Table 1, we summarise relevant studies of teacher effectiveness which use quantitative methods and have been conducted in the context in which we are working, namely India and Pakistan. These have been selected as the focus as they are both contexts where studies recognise the persistence of low levels of basic competency in literacy and numeracy and wide inequalities in learning. The existing evidence seems to confirm the potential importance of teacher quality for student achievement (Azam and Kingdon, 2015; Bau and Das, 2017; De Talancé, 2017).

Table 1.

Studies examining teacher effectiveness and its relationship with student outcomes, India and Pakistan.

	Study	Title	Context	Methodology	Results
	Studies measuring the impact of teachers on test score gains: Specific teacher characteristics are not considered in the body of research below. Instead the gains in test scores of different students taught by the same teacher are used as a measure of the effect of the teacher on students’ test scores. In our work, we overcome this by focusing on the relationship between specific teacher characteristics and good student learning. This body of research does not use recent standardised tools that are inclusive of children from diverse backgrounds. Therefore, samples generally exclude children with disabilities.
1	Bau & Das (2017)	The misallocation of pay and productivity in the public sector: Evidence from the labor market for teachers	Punjab, Pakistan	Natural experiment with matched student-teacher pairs, teacher value-added approach. N=69615 (1158 schools) Grades 3–5. School-based survey (Learning and Educational Achievement in Punjab Schools Survey; LEAPS)	Teacher quality matters for student achievement. Moving a student from a teacher in the fifth percentile to the 95th percentile leads to a 0.64 SD increase in test scores. Despite observed teacher characteristics being closely linked to compensation, they explain no more than 5% of the variation in teacher value-added.
2	De Talancé (2017)	Better teachers, better results? Evidence from rural Pakistan	Punjab, Pakistan	Teacher characteristics as teacher fixed effect, value-added model (gain model). N=15470 Grades 3–5. School-based survey (Learning and Educational Achievement in Pakistan Schools).	Teachers are one of the main drivers of learning. Some observable teacher characteristics are associated with student achievement (such as nature of contract and locally recruited or not, teacher wages). Experience and education of teachers have relatively low impact on student achievement.
3	Azam & Kingdon (2015)	Assessing teacher quality in India	Uttar Pradesh, India	Teacher quality as teacher fixed effect, value-added model. N=8382 Grade 12. Private school exam results for numeracy, literacy and science.	One standard deviation improvement in teacher quality adds 0.38 SD points in student score. Observed characteristics explain little of the variability in teacher quality.
4	Rawal, Aslam & Jamil (2013)	Teacher characteristics, actions and perceptions: what matters for student achievement in Pakistan?	Punjab, Pakistan	Education production function (school fixed effect)	Teacher observable characteristics have limited impact on student outcomes. Teacher’s subject matter knowledge and attitudes to teaching seem to matter more in determining student outcomes.
5	Aslam & Kingdon (2011)	What can teachers do to raise pupil achievement?	Lahore, Pakistan	Education production function (pupil fixed effect). N=1887 per school (65 schools) Grade 8. School-based and home-based surveys, IQ (Ravens Progressive Matrices) and numeracy and literacy ability (pupil fixed effect).	Teachers’ standard resumé characteristics do not matter but teachers’ own test scores and ‘process’ variables such as ability to plan lessons, time spent quizzing children on school work, etc. matter more in determining student outcomes.
	Studies investigating specific teacher characteristics: These studies do not consider that pupils are not randomly allocated to teachers and thus effects of teacher characteristics on student gains may be a result of sorting of pupils into different classes on the basis of ability and, for example, senior teachers might prefer to teach more able students.
6	Muralidharan & Sundararaman (2013)	Contract teachers: Experimental evidence from India	Andhra Pradesh, India	Randomised control trial, education production function. N=not specified (200 schools) Grade 1–5. School-based assessment of numeracy and literacy.	At the end of two years, students in schools with an extra contract teacher performed significantly better than those in comparison schools by 0.16 and 0.15 SDs, in maths and language tests respectively. The study shows that contract teachers are not only effective at improving student learning outcomes, but that they are no less effective at doing so than regular civil-service teachers who are more qualified, better trained, and paid five times higher salaries.
7	Muralidharan & Sundararaman (2011)	Teacher performance pay: Experimental evidence from India	Andhra Pradesh, India	Randomised control trial, education production function. N=not specified (500 schools) Grade 1–5. School-based survey, numeracy and literacy assessments.	Students in incentivised schools (teachers given a monetary incentive) performed significantly better than those in control schools by 0.27 and 0.17 SDs in maths and language tests respectively. Programme was highly cost effective and incentive schools performed significantly better than other randomly chosen schools that received additional schooling inputs of a similar value.
8	Goyal and Pandey (2011)	Contract teachers in India	Madhya Pradesh, Andhra Pradesh, India	Education production function (school fixed effects). N=7923 Grades 1–5. School-based assessment, numeracy and literacy assessments.	Contract teachers are associated with higher levels of effort than civil service teachers with permanent tenure. Higher teacher effort is associated with better student performance. Contract teachers ‘as they are’, however, appear weak as their effort levels on an absolute basis are low and appear to decline through the contract period. Contract teachers are also more cost-effective due to lower salaries.
9	Atherton & Kingdon (2010)	The relative effectiveness and costs of contract and regular teachers in India	Uttar Pradesh & Bihar, India	Education production function (school fixed effect & value-added model). N=4000 Grade 2 and 4. School-based survey of numeracy and literacy (SchoolTells)	Despite being paid just a third of the salary of regular teachers with similar observed characteristics, contract teachers produce higher student learning. Pupils with contract teachers score 0.21 SDs higher in Uttar Pradesh and 0.063 SDs higher in Bihar.
10	Kingdon & Teal (2010)	Teacher unions, teacher pay and student performance in India: A pupil fixed effects approach	Across India	Education production function (school fixed effect & pupil fixed effect). N=not specified (186 schools) Grade 10. School-based and home-based survey, IQ (Ravens Progressive Matrices), literacy, numeracy, science and history/geography assessments.	Union membership of the teacher is found to be associated with reduced pupil achievement. A school fixed effect shows that union membership raises teacher pay thereby suggesting clear evidence of unions raising costs and reducing achievement.
11	Rawal & Kingdon (2010)	Akin to my teacher: Does caste, religious or gender distance between student and teacher matter? Some evidence from India	Uttar Pradesh & Bihar, India	Education production function (pupil fixed effect). N=5028 Grade 2 and 4. School-based survey of numeracy and literacy (SchoolTells).	Study finds significant positive effects of matching student and teacher characteristics. A student’s achievement in a subject in which the teacher shares the same gender, caste and religion as the child is on average nearly a quarter of a standard deviation higher than the same child’s achievement in a subject taught by a teacher who does not share similar characteristics.
12	Chudgar & Sankar (2008)	The relationship between teacher gender and student achievement: evidence from five Indian states	Andhra Pradesh, Gujarat, Uttarakhand, Rajasthan, Chhattisgarh	Education production function. N=4041 (300 schools) Grades 4 and 6. School-based assessment measuring numeracy and literacy (Education Initiative, 2007).	Study shows that male and female teachers differ in terms of their classroom management practices and their belief in students’ learning ability. In partial support of the policy of hiring more female teachers, it also shows that being in a female teacher’s classroom is advantageous for language learning but teacher gender has no effect on mathematics learning.
Contribution of our work: To measure the relationship between specific teacher characteristics and student learning in our work we adopt a mixed-methods approach (questionnaires measuring teacher characteristics and attitudes and classroom observations and semi-structured interviews measuring teacher motivations and beliefs). Our mixed methods approach also enables us to examine the role of classroom practices in promoting student learning. We use assessments that enable many disabled children to be included in the sample.

Despite agreement on the important role of teachers, measuring teacher quality is notoriously difficult. This is because teacher quality encompasses an immense range of competencies, skills, motivations, behaviours and attitudes, many of which cannot be easily observed. Additionally, the interaction of these factors and the nature of the relationships that teachers maintain with their students remain invisible to those outside the classroom in which the teaching is taking place. Yet identifying the characteristics of a ‘good quality teacher’ is not only important for recruitment but also for developing the skills and competencies of those already within the workforce. Beyond this, the key empirical challenge in identifying a ‘teacher effect’ is the potential non-random matching of students to schools and, within schools, to particular teachers. These empirical problems plague researchers from more developed countries as well. However, better availability of linked and administrative data sets allows researchers in developed countries to overcome these challenges in a far more robust manner than those from developing countries.

In a narrow sense, teacher quality can be defined as a ‘teacher’s ability to produce growth in student achievement’ (Eide et al., 2004). Even with an arguably very narrow definition of teacher effectiveness which focuses exclusively on test score gains, identification of the specific characteristics and practices of teachers that contribute most towards improving pupil achievement has proved problematic. Whether focusing on test score outcomes or wider outcomes from education, the methodological challenge of isolating the impact of the teacher per se is considerable.

Teacher quality and student achievement

The quantitative literature that has attempted to measure teacher quality has generally adopted one of three approaches. The first approach calculates teacher quality as the value added in test scores that can be attributed to a specific teacher. These estimates of teacher effectiveness have generally been derived from what is known as a teacher fixed effect model of student achievement gain (sometimes described as a fixed effect value-added model). This method requires information on the test score gains of different students taught by the same teacher. The resulting ‘total teacher effect’ enables the researcher to define a good teacher as one who consistently produces high achievement gains for pupils. This approach, in estimating the total effect of the teacher on test scores, does not require identification of specific teacher characteristics that generate good student learning. In developing country contexts, where data constraints can prevent very sophisticated analyses, value-added measures are usually estimated by assessing children at the beginning and the end of a specified grade. The TEACh project has adopted a similar approach. Whilst value-added models provide one methodological approach to measuring teacher effectiveness, they are not free from criticism: Andrabi et al. (2011) argue that restricted value-added estimates that are based on longitudinal data with unobserved student heterogeneity and measurement error may sometimes be worse than naïve cross-sectional ordinary least squares estimates.
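To make the logic of the first approach concrete, the following is a minimal sketch of a gain-score calculation, not the estimation used in the TEACh project or in any of the studies cited. It assumes that each pupil is assessed at the start and end of a grade and is taught by a single teacher, and it includes no other covariates, in which case a teacher fixed effect reduces to that teacher's mean pupil gain relative to the overall mean gain. The records, identifiers and the function name `teacher_value_added` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records (illustrative values only):
# (pupil_id, teacher_id, start_of_grade_score, end_of_grade_score)
records = [
    ("p1", "T1", 40.0, 52.0),
    ("p2", "T1", 55.0, 66.0),
    ("p3", "T1", 30.0, 43.0),
    ("p4", "T2", 42.0, 47.0),
    ("p5", "T2", 58.0, 62.0),
    ("p6", "T2", 35.0, 41.0),
]


def teacher_value_added(rows):
    """Return each teacher's mean pupil gain minus the overall mean gain."""
    gains_by_teacher = defaultdict(list)
    for _pupil, teacher, pre, post in rows:
        gains_by_teacher[teacher].append(post - pre)
    # Overall mean gain across all pupils, used as the reference point.
    all_gains = [g for gains in gains_by_teacher.values() for g in gains]
    grand_mean = mean(all_gains)
    return {t: mean(g) - grand_mean for t, g in gains_by_teacher.items()}


effects = teacher_value_added(records)
for teacher, effect in sorted(effects.items()):
    print(f"{teacher}: {effect:+.2f} points relative to the average gain")
```

In practice such estimates would also need to allow for pupil characteristics and for measurement error in the test scores, which is the concern raised by Andrabi et al. (2011).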

The second approach uses an educational production function which relates measurable teacher characteristics, and a range of other school and family inputs (such as resource inputs), to pupil achievement. For example, studies using this approach have examined the correlation between student test scores and the number of years of experience that a teacher has. A third approach measures the impact of an intervention, for example on teacher training or inputs.
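As a rough illustration of the second approach, an education production function is often written as a linear regression of the following form. The notation is ours, introduced only for exposition, and is not taken from any of the studies cited.

```latex
A_{ijs} = \beta_0 + \beta_1' T_{j} + \beta_2' S_{s} + \beta_3' F_{i} + \varepsilon_{ijs}
```

Here $A_{ijs}$ is the achievement of pupil $i$ taught by teacher $j$ in school $s$; $T_j$ is a vector of measurable teacher characteristics (for example, years of experience); $S_s$ and $F_i$ are school and family inputs; and $\varepsilon_{ijs}$ is an error term. The coefficients in $\beta_1$ describe associations between teacher characteristics and achievement, and can be read as causal only if pupils are not sorted to teachers on characteristics left in the error term.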

In the Pakistani and Indian contexts, such models are becoming more widely used (see, for example, studies in Table 1: Azam and Kingdon, 2015; Bau and Das, 2017; De Talancé, 2017). The results from these kinds of model suggest that teachers vary substantially in their effectiveness. However, these studies confirm that many of the standard teacher characteristics such as certification, training and experience level do not seem to matter much to pupil achievement (e.g. Bau and Das, 2017; De Talancé, 2017). As these resumé characteristics often underpin teacher compensation policies, these findings are controversial and widely debated. This implies that whilst we know that teachers matter to student achievement, identifying what makes for a good teacher, a priori, is difficult.

More recent research in the developing world has tried to identify the underpinning characteristics of teaching practice that make one teacher more effective than another. Studies focusing on India and Pakistan have, for example, found that factors such as the social distance between the teacher and the students, teachers’ opinions, attitudes and perceptions, and indeed classroom practices, matter more for student achievement than teachers’ observed resumé characteristics (Aslam and Kingdon, 2011; Rawal and Kingdon, 2010; Rawal et al., 2013). A number of studies have found that the nature of the contract that the teacher is on and whether the teacher was locally recruited also predict teacher quality (Atherton and Kingdon, 2010; De Talancé, 2017; Goyal and Pandey, 2013; Muralidharan and Sundararaman, 2011, 2013). Teachers on contracts that are more closely linked to performance appear to be more effective and, more specifically, lower-paid, less well-trained contract teachers appear no less effective than more qualified, better trained and higher paid civil service teachers on permanent contracts (e.g. Muralidharan and Sundararaman, 2013).

A methodological issue that is often ignored is that pupils are not randomly allocated to teachers. In some contexts, the low achieving and most challenging students might have less effective teachers, particularly if more senior teachers prefer to teach higher achieving pupils. Equally, the most talented teachers may end up in the most socio-economically advantaged schools, teaching students who are likely to make more rapid academic progress, thereby leaving less effective teachers teaching pupils in more disadvantaged schools. A teacher could be incorrectly categorised as highly ‘effective’ merely because s/he happens to be assigned to a group of pupils who are more motivated or able. This teacher could also be more effective at ‘getting assigned’ certain types of students, further exacerbating this bias. Additionally, as many teachers tend to be assigned teaching based on their previous performance as a teacher, this initial non-random sorting could lead to further non-random sorting into classrooms. This makes it difficult to separate the effect of the teacher on pupils’ test scores from the effect of pupils’ often unobserved characteristics (e.g. socio-economic disadvantage or the nature of their behavioural problems). The literature has attempted to allow for these ‘selection effects’ using a variety of methodologies such as instrumental variables, panel data and randomised experiments (e.g. Glewwe and Kremer, 2006; Hanushek et al., 2005; Kingdon and Teal, 2010; Lavy, 2002; see also Table 1). However, most estimates of teacher effectiveness rely simply on allowing for differences in students’ prior level of achievement as a way of dealing with this issue. This approach has some empirical support in the literature, though, since non-randomised studies often provide similar estimates of teacher effects as randomised studies – as long as prior student achievement, student characteristics and teacher variables are sufficiently controlled for (Burgess, 2015).
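The sorting problem described above can be illustrated with a small simulation. This is a sketch under deliberately simple assumptions, all of them ours: two hypothetical teachers add exactly the same gain, pupils are sorted to them purely on prior achievement, and prior achievement is measured without error. All values are simulated, not drawn from any data set.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 5.0  # assumption: both hypothetical teachers add the same gain

# Simulated prior achievement for 2000 pupils (illustrative values only).
prior = sorted(rng.gauss(50, 10) for _ in range(2000))

# Non-random sorting: the lower-scoring half is assigned to teacher B,
# the higher-scoring half to teacher A.
classes = {"B": prior[:1000], "A": prior[1000:]}


def end_scores(priors):
    """End-of-year score = prior score + common teacher effect + noise."""
    return [p + TRUE_EFFECT + rng.gauss(0, 3) for p in priors]


final = {t: end_scores(p) for t, p in classes.items()}

# A naive comparison of end-of-year levels credits teacher A with the
# advantage that comes from pupil sorting alone.
raw_gap = mean(final["A"]) - mean(final["B"])

# Comparing gains, i.e. allowing for prior achievement, removes the gap.
gain = {t: mean(f - p for f, p in zip(final[t], classes[t])) for t in classes}
gain_gap = gain["A"] - gain["B"]

print(f"Gap in end-of-year levels: {raw_gap:.1f}")
print(f"Gap in gains:              {gain_gap:.1f}")
```

The correction works here only because sorting depends entirely on an observed prior score. If pupils were also sorted on unobserved characteristics that affect progress, a gap would remain even after allowing for prior achievement.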

In summary, an increasing body of global evidence on teacher effectiveness has found that teachers are differentially effective in producing student achievement; what they do and how they do it is of crucial importance for student outcomes. However, this cannot be predicted by teachers’ formal qualifications or other aspects of observable teacher characteristics (Bruns et al., 2016). By contrast, the contractual arrangements for teachers and measures of teacher effort do predict student achievement in some contexts.

Classroom observation approaches to measuring teacher quality

Given the difficulty in identifying the specific (a priori) characteristics of effective teachers, another branch of research has gone in a somewhat different direction. This work has focused on observing teachers within their classrooms to unpack what happens within the black box of teaching. These classroom observations have then been linked to student learning and other outcomes to provide evidence on teacher quality.

A large-scale and renowned study in the US – Measures of Effective Teaching – found that by observing teachers within their classrooms and using a variety of observation rubrics, it is possible to estimate differential teacher effectiveness (Kane and Staiger, 2012).³ The project compared five different approaches to classroom observations (including Classroom Assessment Scoring System – CLASS – discussed in the next paragraph) and found all five observation instruments to be positively associated with student achievement gains. Further, combining these teacher observation measures with data on student achievement gains improves the power of these measures to identify more effective teachers. This suggests that researchers seeking to identify high quality teaching, and indeed evaluation systems trying to do the same, might need to combine teacher observation, teacher self-report data and student outcomes, rather than using observation or test score measures alone.

Studies using the CLASS instrument suggest that students taught by more ‘effective’ teachers (as measured by CLASS scores) have higher learning gains and better outcomes in relation to behaviour and self-regulation (Howes et al., 2008; Grossman et al., 2010). In the developing country context, such instruments have been used to a much more limited extent. However, Araujo et al. (2016) use the CLASS instrument in Ecuador to estimate teacher effectiveness. Based on a random assignment of students to teachers, Araujo et al. find that an increase in teachers’ classroom quality (measured using the CLASS observation instrument) resulted in higher student test scores in language, maths, and executive function.

Bruns et al. (2016), comparing CLASS with another widely-used observation instrument (Stallings), find that when used within the same developing country context (Chile), both instruments produce consistent assessments of teachers’ effectiveness in managing their classrooms and ensuring student learning.

A review of teacher observation literature in low and middle income countries has sought to identify opportunities to systematise observations to monitor quality at both the school and system levels (Pouezevara et al., 2016). The authors conclude that, of the admittedly limited research that attempts to relate observational measures of teaching practice to student learning outcomes, most studies have demonstrated a positive and significant association between them. However, they note that few studies actually attempt to unpack the complex relationship between specific classroom practices and student learning. They also highlight the technical and practical issues that constrain researchers from conducting classroom observations at scale as a means to obtain information on teaching practice. In addition to these limitations, classroom observations have also been found to be costly to administer.

The use of classroom observations to examine teacher effectiveness in the specific contexts of India and Pakistan has been very limited. A 2014 study in Madhya Pradesh and Tamil Nadu, India, found correlations between teacher classroom practices and student outcomes and provided some insights for policy in areas of pre-service and in-service training, as well as pedagogy (Dundar et al., 2014). Specifically, the study found that, whilst teachers spend a substantial amount of time on instructional activities, this was not reflected in positive student learning outcomes. Their observations also suggested that teachers were not addressing the different learning and ability levels of students in the classroom, which is particularly important when considering teaching quality for children from disadvantaged backgrounds. The SchoolTells initiative in India and Pakistan also suggested that classroom observations can provide useful insights into the relationship between teachers’ classroom practices, teacher effectiveness and student outcomes (Rawal and Kingdon, 2010; Rawal et al., 2013). Recent research using classroom observations in Indian classrooms suggests that the majority of classroom interactions are characterised by teacher-centred activities and rote learning (e.g. Sankar and Linden, 2014). While these studies pay attention to the overall proportion of children engaged in the learning activities, as with other studies, they do not examine patterns in (dis)engagement or the inclusion and opportunities for disadvantaged children more specifically.

Similarly, Singh and Sarkar (2012) examine the effect of teaching quality on children’s test scores in India and use classroom observations in their analysis. They find that whilst standard characteristics of teachers like experience, gender, content knowledge and subject specialisation do not have any significant influence on children’s learning outcomes, teaching practices such as regularity in checking homework and factors such as the proximity of the teacher’s residence to the school and attitude towards the children, as well as teachers’ perceptions of their schools, have emerged as important determinants of students’ test scores. The authors state that ‘it is what the teacher “believes and does” in the classroom that has the maximum impact on children’s learning outcomes’ (ii). However, it must be noted that whilst this research did involve classroom observations, limited data on actual classroom practices was collected. The only two areas examined related to teaching style (lecture-style, group work, etc.) and whether the teacher provided feedback through marking: the study was less informative on which other classroom practices and teaching processes are more effective.

The move towards the use of classroom observation data in quantitative research on teacher quality is welcome. However, there are important limitations to the use of observations as currently designed. Disadvantages of existing structured observations that have been identified include low levels of inter-rater reliability, an issue that necessitates observations being carried out by multiple assessors in order to reduce bias (Darling and Hammond, 2012; Kane and Staiger, 2012). This leads to significant financial and time costs, and even then, their reliability is not guaranteed. Sampling error, which denotes variation between different observations, and measurement error, which refers to inaccuracy relating to what is measured in teaching, are further problems which can arise. Validity concerns may also occur when evaluators with different agendas focus on different factors in their evaluation (Cortez Ochoa et al., 2018).
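One common way to quantify the inter-rater reliability mentioned above is Cohen's kappa, which compares the observed agreement between two observers with the agreement expected by chance. The sketch below is illustrative only: it is not a procedure reported by the studies cited, and the activity codes and the two observers' records are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same sequence of snapshots."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty sequence.")
    n = len(rater_a)
    # Proportion of snapshots on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two observers to ten classroom snapshots.
observer_1 = ["instruction", "instruction", "management", "off_task",
              "instruction", "management", "instruction", "off_task",
              "instruction", "management"]
observer_2 = ["instruction", "management", "management", "off_task",
              "instruction", "instruction", "instruction", "off_task",
              "instruction", "management"]

kappa = cohens_kappa(observer_1, observer_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this example the observers agree on eight of ten snapshots, but kappa is lower than 0.8 because some of that agreement would be expected by chance given how often each code is used.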

A further problem relates to the limitations of the data that are collected through the existing instruments. The Stallings method is limited to four categories (academic activities/instruction; classroom management; teacher off-task; students off-task) with a predefined set of indicators within each category for determining teacher performance. Though this feature is efficient in the sense that it is curriculum and language neutral and minimal observer discretion is required (Bruns et al., 2016), much is lost in terms of contextual difference between schools, subjects and teachers’ pedagogical approaches. Whilst the suitability of this method for larger scale studies has been suggested, it has been found ‘too crude to be used for individual teacher performance evaluations’, as well as ineffective for capturing ‘teachers’ ability to deliver high quality instruction and support students emotionally’ (Bruns et al., 2016: 28). Moreover, of particular relevance to our study, existing approaches do not sufficiently identify how teachers tackle diversity within their classrooms.

Another potential problem is that in low and lower-middle income countries, classroom observation is a fairly new phenomenon. In Punjab, Pakistan, teachers are used to being observed by district teacher educators. However, where teachers are not used to being observed or where observation is linked to assessment of teacher performance, this may pose difficulties for research. Initial concerns about observer bias can be mitigated to some degree by assuring teachers that the research will in no way impact their assessments or rankings. A final practical issue with observational approaches is that the average student-teacher ratio in schools can be high in these contexts. In some of the classrooms in our sample, it reached 50 to 1. Classrooms of this kind tend to be crowded and often loud, which makes it challenging to carry out classroom observations that aim to document time spent on specific activities at any one point in time.⁴

Our approach to measuring teaching quality as described in this paper builds on this existing literature but, for the reasons outlined above, does not include structured classroom observation as part of the quantitative data analysis.⁵ Before discussing the individual elements of our proposed approach, we start by documenting the challenges of the context in which we are researching.

The context

There are some key contextual factors that need to be highlighted as they impact the design of our research. First, it is well documented that learning levels are low in both India and Pakistan. Further, the extent of the variation in children’s learning and achievement is considerable. According to recent data from the Annual Status of Education Report (ASER) in rural Pakistan, around half of poor girls are not in school and many of these girls have never attended school, whereas the vast majority of rich boys and girls are in school. By contrast, most children in India are in school, regardless of their background (Alcott and Rose, 2015). While patterns of educational access vary between India and Pakistan, in both there are wide disparities in achievement levels across students with different background characteristics. ASER data show large differences in achievement levels between richer and poorer states in both rural India and Pakistan, and even within the richest states, poor students perform worse on assessments than their richer counterparts (Alcott and Rose, 2015; Alcott and Rose, 2017; Aslam, 2012). The main implication of this for research on teaching quality and learning in these contexts is that learning assessments need to be able to identify very low levels of achievement and also to accommodate considerable variation.

The scale and diversity of the education systems in many countries present issues with respect to measuring teaching quality. For example, in India and Pakistan the scale of the public-sector teaching workforce is enormous. Punjab province in Pakistan employs close to 135,000 teachers in the government sector, and added another 80,000 in 2017 alone. There is also considerable variation in the conditions of work facing different teachers in different parts of both Pakistan and India, as well as variation in pay. For example, government teachers in Pakistan can be paid anywhere between PKR 9860 (GBP 98) and PKR 109,000 (GBP 1090). Designing the research to adequately capture the heterogeneity of the teacher workforce is essential.

In terms of the diversity of schools, many low and lower-middle income countries have a hybrid education system, with a large private sector (predominantly charging relatively low fees) existing alongside state-run schools. This is true in both India and Pakistan. However, despite the fast and significant growth in private schools, the government sector remains the main provider in both countries. Further, government schools in both countries also cater to the most socio-economically disadvantaged populations. Since our research is assessing the teaching experienced by low socio-economic status children, a focus on government schools is essential.

Teaching quality is known to be poor in both government and private schools in both India and Pakistan as evidenced by research investigating issues such as teacher competence, subject-knowledge, efficacy of training, recruitment and deployment, motivation and absenteeism (Dundar et al., 2014; Ilm Ideas, 2014; Andrabi et al., 2015). In an effort to improve teaching in government schools, Pakistan raised the minimum academic qualification requirements for teachers, and instituted continuous professional development programmes. Despite a spate of reforms, the policy perception is that teaching in government schools in India and Pakistan has not improved. Robust evidence on this is limited. This too was a pressing reason to focus our research on government schools. However, this does mean that we need to understand the school choices made by different students, since these will potentially cause selection bias in our models if those choosing to attend the local government school are very different from students attending other schools. To understand this choice, we needed to collect data from households, irrespective of which particular school the child is enrolled in.

The contexts also present some very specific issues that affect the research design. To estimate a teacher fixed effects model, it is necessary to match the achievement gains of students to one particular teacher. In practice, this proves quite difficult to implement as a design in countries such as India and Pakistan. Information from schools suggests that the same teacher does not always teach a particular class. A class may have multiple teachers at one point in time, schools may have a high turnover of teachers through the year and different teachers may teach different subjects even in primary school. For our study, we managed to get schools to confirm teaching assignments and teacher in-year moves (to different classes or schools). This enabled us to be clearer about whether we could attribute the learning gain to a particular teacher. Over and above the research challenges this poses, there is, of course, a larger policy question arising from this issue. If the government is to link teacher incentives to student performance, reliability of data on teacher class and school assignment needs to be ensured.
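For readers less familiar with this class of model, one common way of writing a teacher fixed effects specification is as a lagged-score value-added equation. The notation below is illustrative only and is an assumption on our part about a typical form, not the exact specification estimated in this study:

```latex
A_{ijt} = \lambda A_{ij,t-1} + X_{i}'\beta + \mu_{j} + \varepsilon_{ijt}
```

Here $A_{ijt}$ is the end-of-year test score of student $i$ taught by teacher $j$, $A_{ij,t-1}$ is the same student's baseline score, $X_{i}$ is a vector of student and household characteristics, $\mu_{j}$ is the teacher fixed effect and $\varepsilon_{ijt}$ is an error term. The estimate of $\mu_{j}$ is only interpretable if each student can be linked to a single teacher $j$ for the period between the two tests, which is why confirming teaching assignments and in-year moves matters.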

More generally, the information held on students and schools in low and lower-middle income countries such as India and Pakistan can be highly variable in quantity and quality. A key feature of our study design is that we need to collect data from both schools on student learning and from households on the socio-economic circumstances of the child. We then need to bring these two sources of data together. With patchy class lists and incomplete data from schools, this requires some creative solutions for matching children in schools to those in our household survey. Even when full names are available, matching could still be challenging. It proved important to use as many family identifiers as possible, including father’s name, mother’s name, number of siblings, parents’ occupation. Further, in the areas we were working in Pakistan, teachers tended to know communities. In our Indian districts, teachers did not always live in the communities and knew less about their students and their families. All this affects the quality of the data produced, in common with other large-scale approaches to data collection of this kind.
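The matching problem described above can be thought of as scoring candidate pairs of records on several identifiers at once rather than relying on the child's name alone. The following is a minimal sketch using approximate string similarity from the Python standard library; the field names, weights and threshold are hypothetical choices for illustration and are not the matching rule used in this study.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-1 similarity between two names, ignoring case and extra spaces."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


# Hypothetical identifier fields and weights (assumed for illustration).
WEIGHTS = {"child_name": 0.4, "father_name": 0.3, "mother_name": 0.3}


def match_score(household_rec, school_rec):
    """Weighted similarity across all identifiers for one candidate pair."""
    return sum(
        weight * similarity(household_rec[field], school_rec[field])
        for field, weight in WEIGHTS.items()
    )


def best_match(household_rec, school_recs, threshold=0.85):
    """Return the best-scoring school record, or None if no score reaches the threshold."""
    if not school_recs:
        return None
    best = max(school_recs, key=lambda rec: match_score(household_rec, rec))
    return best if match_score(household_rec, best) >= threshold else None


# Invented records for illustration only.
household_child = {"child_name": "Ayesha Khan", "father_name": "Imran Khan", "mother_name": "Sana Bibi"}
class_list = [
    {"id": 1, "child_name": "ayesha  khan", "father_name": "Imran Khan", "mother_name": "Sana Bibi"},
    {"id": 2, "child_name": "Bilal Ahmed", "father_name": "Tariq Ahmed", "mother_name": "Nadia"},
]
matched = best_match(household_child, class_list)
```

In practice, pairs that fall just below the threshold would normally be reviewed by hand, for example with the help of teachers who know the community, rather than discarded automatically.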

Measuring student learning

As noted above, one method of measuring teacher quality is to assess the learning gain of students.⁶ To do this we measure the learning gain of all children aged 8 to 12 within the households we surveyed in each village. We chose the age range of 8 to 12 years as we would expect these children to have had some exposure to schooling. In both countries, this age range relates to school grades 3 to 5, assuming children start on time and progress smoothly through the system.

Assessments of the children’s learning in mathematics and language (Urdu and Hindi) used measures which had been used previously in the Indian and Pakistani contexts for a similar age range to those in our study. Specifically, we used instruments based on ASER and Young Lives numeracy and literacy tests. Our team had experience of using secondary data from these sources for other purposes so were familiar with the characteristics of the tests. Using both assessments allowed us to collect discrete (ASER) and continuous (Young Lives) data that could be used for different purposes in our analysis.⁷ Importantly, the combination of measures was intended to ensure a distribution of learning levels for different groups of children, recognising that ASER might focus in particular on lower levels of learning hence the need to complement this with some more challenging questions from the Young Lives instrument.⁸ We were also particularly keen to include children from diverse backgrounds, including those with disabilities, as far as we could in the assessment process. The combination of ASER, which is administered verbally, and Young Lives, which is administered both verbally and using paper and pen, facilitated this.

Children were first tested in their own households. Specifically, we surveyed around 1000 households in 30 villages in each of Haryana, India, and Punjab, Pakistan. This resulted in a sample of 1241 and 1600 children from households in India and Pakistan, respectively, for whom we have assessment data (Figure 1). The collection of data from households was important to ensure we had information on all children, regardless of whether they were in school or not, and it also enabled us to collect detailed information on the households, including on measures of wealth to assess a child’s socioeconomic background and other characteristics,⁹ as well as the attitudes of mothers towards their children’s schooling experience.

Figure 1.

Baseline sample of children assessed in households and schools in Haryana, India, and Punjab, Pakistan.

Given our particular interest in identifying the extent to which learning gains can be associated with particular school and teacher characteristics, children were also assessed using the same tools within selected government schools in each village. We have assessment data for approximately 2071 and 2125 children in schools in India and Pakistan respectively. All children in grades 3 to 5 took the Young Lives paper and pen test, with a sample of these children (15 per school) also completing the ASER verbal assessment test. This provides us with measures of learning gain for a large sample of students in grades 3 to 5. We also have rich household and school information for some of the children. Despite the challenges, it was possible to match more than 400 of the children from the household survey with the school survey in both countries, and hence we have particularly rich information on these children, their households, their schools and their particular teachers.¹⁰

Amongst those in school, we tested the children using the same assessments around ten months apart, namely at the beginning and end of the school year. In Pakistan, 90% of the children from the baseline school survey were identified at the end line school survey, and were present in school. Approximately 150 children were absent, and 202 had dropped out over the course of the year (recorded as not enrolled anymore).

We made very small adjustments to the tests in the second round to avoid the possibility of copying or rote learning responses from the original instruments. These two rounds of data provided us with information on test score gains from these assessments over the school year.
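The gain measure described here can be sketched as a simple difference between the two rounds, computed only for children assessed in both. The following is a minimal illustration, assuming scores are held in dictionaries keyed by a child identifier; the identifiers, scores and function names are hypothetical.

```python
def score_gains(baseline, endline):
    """Endline minus baseline score for children assessed in both rounds."""
    return {
        child: endline[child] - baseline[child]
        for child in baseline
        if child in endline
    }


def mean_gain(gains):
    """Average gain across matched children; None if no child was matched."""
    return sum(gains.values()) / len(gains) if gains else None


# Invented scores: child "c3" was absent or had dropped out at end line.
baseline_scores = {"c1": 10, "c2": 14, "c3": 8}
endline_scores = {"c1": 15, "c2": 13}
gains = score_gains(baseline_scores, endline_scores)
average = mean_gain(gains)
```

Because children who were absent or had dropped out at end line contribute no gain score, any such calculation describes only those who remained in school, which is one reason attrition between rounds needs to be reported alongside the gains.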

Measuring the characteristics and behaviours of teachers

Teacher survey

To capture information about the teaching that each child in our sample experienced, detailed teacher questionnaires were administered at baseline and end line in our sample schools. We illustrate our approach in this section with some findings from Pakistan. In Pakistan, 190 teachers in total were surveyed at baseline. A longer questionnaire was then administered during the end line to all 96 teachers who were teaching grades 3, 4 and 5, since this was the age group for which we had student test score data. A questionnaire was also administered to head teachers who were doing some teaching in grades 3 to 5 (37 at baseline, 33 at end line). Overall, about 75% of the teachers and head teachers in the baseline could be identified easily in the end line survey. The remaining 25% were new appointments. Contrary to expectations from previous research, teacher absenteeism was not a challenge in our Pakistani sample at the time of the survey: 8 of 190 teachers were absent on the first day during the baseline surveys at the beginning of the school year, but could be interviewed when the teams returned on the second day. In the end line survey conducted towards the end of the school year, 9 of 146 were absent, but found on the second day.¹¹

Teachers were asked about their characteristics, attitudes and practices, recognising that the previous literature has clearly indicated that teachers’ experience and qualifications alone are insufficient to explain variation in students’ learning outcomes. The survey we developed was therefore designed to capture measures of: teachers’ expectations and beliefs (about the students and their own competencies); aspects of the school environment; the quality and nature of leadership in the school; resource levels; and teacher satisfaction with their jobs. We drew on existing instruments such as those used by SchoolTELLS, Young Lives and Learning While You Teach¹² which have been used in the Indian and Pakistani contexts to look at school and teacher effectiveness. Our instrument focused on aspects that relate to teaching of children from diverse backgrounds. Appropriate questions were not readily available in existing tools and we developed questions based on our experience of working in these contexts. A summary of the contents of the survey is found in Table 2. We do not discuss each question included in the survey here but we do highlight the reasoning behind key elements.¹³

Table 2.

Summary of teacher questionnaires.

Personal / background info	Professional background (experience / qualification)	Information about classes taught	Strategies adopted / practices	Job satisfaction	Attitudes re factors influencing learning	Accountability
Age; Gender; Marital status; Religion; Caste; Mother tongue; Location of residence; Distance travelled to school	Position in school; Education; Professional qualification; Experience of teaching (in years); Salary; Pre- and in-service training including: institution, mode of study, topics covered (including disability, teaching slow learners, etc.), number of days on topics	Knowledge of no. of children: with disabilities, from poor backgrounds, slow learners	For identifying & supporting: slow learners, disabilities; Length of periods / sessions; What happens in a period (teacher reported); Assessments – frequency, use of; Non-teaching responsibilities / activities; Opinions about teaching practices; Time use – a school day	Reason for becoming a teacher; Satisfaction with current post; Changes in and satisfaction with workload; Reasons for changes in workload; Attitude to various aspects of the job / school (Likert scale 5 point)	Conditions / factors necessary for learning and those that impede learning; Faith in own ability to handle diversity; Attitudes to students’ background and their learning (gender, poverty, etc.); Attitudes to inclusion	Frequency of evaluations of work; Evaluating authority; Number of days teacher was absent and reasons cited for it

We included a set of questions on teachers’ education (initial education, pre-service training, etc.). In the Indian and Pakistani contexts, these questions are particularly important given the diversity of teacher education routes, and the policy discourse on underqualified teachers. In our sample, 80% of teachers in Pakistan had bachelor degrees or above. There is, however, a range of pre-service degrees that teachers can attain to qualify for teaching in both countries. In Pakistan, for example, there is a requirement for teachers to have a Bachelor of Education degree but the course can range from one year to four (about one third of the sample had the one-year degree and 60% had acquired their pre-service qualification through distance learning). Further, a number of private and government universities and colleges are officially recognised as certificate awarding bodies for teachers in Punjab, and the perception is that they vary in quality. Securing granular information about the nature of the person’s teacher training was important, not least to determine whether there is indeed variation in student learning across teachers who have different types of training.

Given our focus on equity, we collected detailed information on the pre-service training teachers have received on teaching low achieving learners, children with disabilities, and children in multi-lingual and multi-grade classrooms. A very large proportion of primary schools in India and Pakistan are multi-grade environments (Aslam and Rawal, 2015), and the preparedness of teachers to handle this is likely to make a significant difference to the quality of instruction.

Teacher experience is also important in this context. This is partly because the minimum standards for teachers have changed dramatically in recent years. As discussed, Pakistan changed the minimum educational requirement from a high school certificate to a bachelor degree in 2004. Teachers hired before and after this change are therefore quite different in educational terms and hence combining information on years of experience and the nature of the training received is likely to be important.

On teacher attitudes, we included questions from established instruments. We used the Attitudes to Inclusion scale (AIS) (Sharma and Jacobs, 2016), which measures teachers’ attitudes to student diversity with a particular focus on children with special educational needs. It defines inclusion as educating students with diverse learning needs in classrooms alongside their peers, with necessary additional support. Standard AIS questions were adapted for the Pakistani and Indian context. Our instrument elicited teachers’ beliefs about children’s ability to learn and how that related to poverty, gender and disability. We also asked teachers about their own abilities to handle challenging situations. Questions on job satisfaction were also included, particularly satisfaction with extrinsic and intrinsic factors, including salary, school environment and infrastructure, community relations and relations with the school leader and colleagues.

The survey instrument also includes a number of questions about teachers’ practices and strategies. These include teachers’ self-report on issues ranging from what their typical mathematics or language lesson might contain, through to what their typical day looks like and the extent to which children with special needs and low achievement are identified and supported. On the issue of supporting specific groups of children, teachers were presented with a range of different practices and asked to select those that they use in their classrooms. Given that over-testing has emerged as a concern in both India and Pakistan, teachers were also asked about the frequency of assessment and how it is used to improve student learning.

Whilst we have drawn on existing instruments and literature to develop our survey, we are mindful that simply asking teachers about their practices may be insufficient. We therefore also collected qualitative data on a subset of schools to triangulate findings and present a more in-depth picture of the processes within classrooms. In the qualitative component, we focused on (a) teacher practices in the classroom and (b) teachers’ beliefs and motivation with respect to teaching children from diverse backgrounds, paying particular attention to children with disabilities. We felt this was important given that these children are often most excluded during the teaching process, but can be difficult to capture through quantitative surveys given very small sample sizes (although we did attempt to do so by using established approaches to identify them in our household survey). In order to capture data on teacher practices, we employed classroom observations, while insights into teachers’ motivations and beliefs were collected through semi-structured interviews.

Measuring teacher knowledge

Thus far, we have described our use of student test scores, a teacher report survey and classroom observations to try to determine the quality of the teaching in the classrooms in our sample. However, one of the most direct ways of measuring a teacher’s skill level (and hence potentially one aspect of her quality) is through assessing the teacher’s own knowledge directly, i.e. in a test. Glewwe et al. (2011) found in their review of the literature that whilst very few school and teacher variables have significant effects on learning outcomes, one of the teacher variables that does positively and significantly determine student learning is a teacher’s knowledge of the subject they teach. Teacher competence when proxied by teacher test scores (rather than by their educational qualifications and experience) strongly ‘supports the common-sense notion that teachers who better understand the subjects they teach are better at improving their students learning’ (Glewwe et al., 2011: 22). More recent literature also supports this view (Altinok, 2013; Chudgar, 2013; Metzler and Woessmann, 2012; Mulkeen, 2013).

We therefore collected data that would allow us to identify the correlation between good teacher knowledge in a specific subject and students’ academic performance in the same subject. Because of the nature of the sample (with many rural schools having the same teacher for all subjects), this limits the possibility that parents have chosen a specific teacher for their child based on this teacher’s specific knowledge in any given subject, and reduces the chance of bias arising from non-random classroom assignment more generally. This data is potentially important not least because it is novel in the contexts in which we are working and may present ways forward for policymakers in terms of implementing interventions to improve teacher subject knowledge. In an ideal situation, newly recruited teachers should enter the profession with adequate knowledge of the subjects they are intending to teach. However, previous research has suggested this may not be true in India and Pakistan for all teachers (Dundar et al., 2014). Therefore, evidence on the value of subject knowledge for student learning, and hence the potential for in-service training to play a crucial role in filling such gaps, is especially useful.

Although there are clear advantages to collecting information on teachers’ subject knowledge, doing this in contexts such as India and Pakistan is particularly challenging. For example, in Punjab, Pakistan, while teachers are tested monthly by provincial authorities and tests are not high stakes (in that salaries and promotions are not linked with teacher knowledge tests), teachers are very reluctant to take tests. Refusal rates are high when teachers are asked directly to ‘take exams’ or answer questions. An alternative approach is to gauge teachers’ subject knowledge indirectly by observing them undertaking a common teaching task, namely marking and correcting student tests. Since marking and correcting work is part of the job, teachers respond far more positively to requests to do this, and from observing their marking and corrections we can gauge their own level of knowledge.

For our study, only those teachers who reported teaching the subject in question were asked to mark tests. Each teacher was provided with one student test for the subject they were teaching. If a teacher taught both mathematics and Urdu/Hindi, then they were asked to mark one of each. They were asked to mark tests in the following way: correct answers were simply checked; for answers they would mark as incorrect, teachers were required to provide workings and an answer. All students in grades 3, 4 and 5 were administered the same test, so regardless of grade, teachers marked the same sample test. The sample test selected for the teacher to mark was the one with the most questions attempted and the most wrong answers.
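The selection rule and the resulting measure can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say how the two selection criteria were combined, so the sketch ranks by questions attempted first and uses the number of wrong answers to break ties, and it summarises marking quality as the share of items on which the teacher's right/wrong judgement agrees with an answer key. All names and data are hypothetical.

```python
def select_test_for_marking(student_tests):
    """Pick the script with the most attempted questions, then the most wrong answers.

    The ordering of the two criteria is an assumption made for this sketch.
    """
    return max(student_tests, key=lambda t: (t["attempted"], t["wrong"]))


def marking_accuracy(teacher_judgements, key_judgements):
    """Share of items where the teacher's right/wrong call agrees with the answer key."""
    if len(teacher_judgements) != len(key_judgements) or not key_judgements:
        raise ValueError("Judgement lists must be non-empty and of equal length")
    agreed = sum(t == k for t, k in zip(teacher_judgements, key_judgements))
    return agreed / len(key_judgements)


# Invented scripts from one class.
scripts = [
    {"id": "s1", "attempted": 18, "wrong": 4},
    {"id": "s2", "attempted": 20, "wrong": 6},
    {"id": "s3", "attempted": 20, "wrong": 9},
]
chosen = select_test_for_marking(scripts)

# True means the item was judged correct; the teacher misjudges one item of five.
teacher = [True, False, True, True, False]
answer_key = [True, False, False, True, False]
accuracy = marking_accuracy(teacher, answer_key)
```

A fuller measure would also score the workings and corrected answers that teachers supplied for items they marked as incorrect, which this sketch does not attempt.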

Using a similar approach, SchoolTELLS assessments of teachers in literacy and numeracy in the states of Bihar and Uttar Pradesh, India (Kingdon et al., 2008) showed extremely low levels of teacher competence, with teachers scoring 47.2% in maths and 64.9% in language assessments at grade 5 curriculum level. In Pakistan, SchoolTELLS data (Atherton and Kingdon, 2010) were collected in Punjab, with teachers scoring 69.5% and 73.9% in language and maths respectively, illustrating that teachers’ content knowledge was higher than their pupils’, as one would expect (Dundar et al., 2014). These (and our) studies indicate that it is important to identify whether teachers have mastered their subject, as weak subject knowledge can be associated with low pupil attainment. If their subject knowledge is poor, it is important to identify the reasons for this and what can be done to address it for policy purposes. In many cases, teachers are themselves affected by the low quality of the education system from which they have graduated, so it is key to find ways to tackle this potentially vicious cycle of low achievement.

Discussion and conclusions

The quantitative data collection approaches outlined in this paper are intended to enable us to identify the within and across school differences in the quality of teaching in selected schools in India and Pakistan. The data will enable estimation of the value added to pupils’ test scores by different schools, and we can then relate these estimates to the characteristics of teachers. Further, the teacher survey and classroom observations should enable us to explore how inequalities in learning across different types of student relate to teacher characteristics, attitudes and behaviours. These data are therefore important from a policy perspective since they can inform thinking on the causes of low levels of learning among some students and answer questions such as: is the variation in pupil achievement largely due to higher achieving children being clustered in “good” schools, or are there considerable differences in achievement levels across pupils within the same school? In this article, we seek to make perhaps an unusual contribution to the literature by providing a detailed account of the reasoning behind and practical difficulties encountered when trying to research the impact of teaching quality on student learning. Our aim is not only to illuminate our own study but also to provide a resource which other researchers might draw upon when embarking on this kind of research.
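The policy question about within-school versus across-school differences can be illustrated with a simple descriptive decomposition of the variation in test scores. The following is a minimal sketch that reports the share of the total sum of squares lying between schools; it is not a multilevel model and not the estimation used in this study, and the function name and scores are hypothetical.

```python
from statistics import mean


def between_school_share(scores_by_school):
    """Share of total variation in scores that lies between school means (0 to 1)."""
    all_scores = [s for scores in scores_by_school.values() for s in scores]
    grand_mean = mean(all_scores)
    # Total variation of every pupil's score around the overall mean.
    total_ss = sum((s - grand_mean) ** 2 for s in all_scores)
    # Variation of school means around the overall mean, weighted by school size.
    between_ss = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in scores_by_school.values()
    )
    return between_ss / total_ss if total_ss else 0.0


# Invented scores for two schools with two pupils each.
example = {"school_a": [1, 3], "school_b": [5, 7]}
share = between_school_share(example)
```

A share close to 1 would suggest that higher achieving children are clustered in particular schools, whereas a share close to 0 would suggest that most of the variation in achievement is found among pupils within the same school.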

We have documented the nature of the data collection needed to measure teacher quality effectively at scale. What can this account tell us about how to undertake this kind of research in the future? First, the sheer scale of data collection means that inevitably small and lower cost projects will tend to be missing parts of the story. For example, studies may have teacher survey information but lack test score data or vice versa. We also note that whilst value added measures of student learning are central to any serious investigation of teaching quality, they are not sufficient to provide insights into the sources of variation in quality that are observed. Hence collecting rich information on teachers’ characteristics, attitudes and behaviours is essential if we are to develop our understanding of what factors predict high pupil value added and variation in teaching quality, and hence how we can actually improve teaching.

In the contexts in which we are working, there is also a distinct lack of high-quality administrative data. This too increases the data collection requirements for individual studies and whilst the data collected from classroom observations and semi-structured interviews is incredibly rich, we do need to recognise the high cost of such data collection methods. If studies also always develop their own survey instruments which are not necessarily comparable with those used in other studies, there is no possibility of building a cumulative body of evidence and combining studies to get a better picture of what is happening in schools over time. One suggestion therefore is that studies embarking on this kind of work should endeavour to use similar instruments to those that have gone before them which include items that can be compared (and for this reason we are publishing both our protocols and our data). In that way, an evidence base can be built from comparable measures used over time.

Footnotes

Acknowledgements

Comments made by participants at the 2017 United Kingdom Forum for International Education and Training (UKFIET) conference improved the paper immeasurably. We would also like to thank Faisal Bari, Anuradha De, Meera Samson, Nidhi Singal, and other members of the Teaching Effectively All Children (TEACh) team for their invaluable contributions to debates that have contributed to the arguments in this paper.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Economic and Social Research Council and the Department for International Development under the Raising Learning Outcomes programme (grant number ES/M005445/1).

Author Biographies

Monazza Aslam is a research fellow at the University of Oxford. She is an education economist and has worked extensively on teacher quality, student learning and the political economy of education systems.

Rabea Malik is an assistant professor at the School of Education, Lahore University of Management Sciences (LUMS), and a research fellow at the Institute of Development and Economic Alternatives (IDEAS).

Shenila Rawal is a quantitative researcher specialising in the economics of education. She is particularly interested in teacher quality and effectiveness, the relationship between poverty and educational outcomes, and the role of gender and social distance in reducing economic and educational gaps in developing as well as developed countries.

Pauline Rose joined Cambridge University in February 2014 as Professor of International Education, where she is the director of the Research for Equitable Access and Learning (REAL) Centre in the Faculty of Education. Prior to joining Cambridge, Pauline was Director of the Education for All Global Monitoring Report.

Anna Vignoles FBA is a professor of Education at the University of Cambridge. She has published widely on social mobility, the impact of school resources on pupil achievement and on the socio-economic gap in pupil achievement.

Lydia Whitaker has worked in education examining social and emotional abilities of typical and atypical children for over 8 years. Lydia’s PhD examined Emotion Category Boundaries in children with Autism Spectrum Disorder (ASD) in mainstream and special educational needs schools.

References

Alcott

Rose

(2015) Schools and learning in rural India and Pakistan: Who goes where, and how much are they learning? Prospects 45: 345–363.

Alcott

Rose

(2017) Learning in India’s primary schools: How do disparities widen across the grades? International Journal of Educational Development 56: 42–51.

Altinok (2013) The impact of teacher knowledge on student achievement in 14 sub-Saharan African countries. Background paper for Education for All Global Monitoring Report. Available at: https://unesdoc.unesco.org/ark:/48223/pf0000225832

Andrabi, Das and Khwaja (2015) Delivering education: A pragmatic framework for improving education in low-income countries. In: Dixon, Humble and Counihan (eds) Handbook of International Development and Education. Cheltenham: Elgar, pp. 85–130.

Andrabi, Das, Khwaja, et al. (2011) Do value-added estimates add value? Accounting for learning dynamics. American Economic Journal: Applied Economics 3(3): 29–54.

Araujo, Carneiro, Cruz-Aguayo, et al. (2016) Teacher quality and learning outcomes in kindergarten. The Quarterly Journal of Economics 131(3): 1415–1453.

Aslam (2012) Gender: A persistent source of inequality of opportunity in Pakistan. Annual Status of Education Report (ASER) Policy Brief. Available at: http://aserpakistan.org

Aslam and Kingdon (2011) What can teachers do to raise pupil achievement? Economics of Education Review 30(3): 559–574.

Aslam and Rawal (2015) Teachers – an indispensable asset: Examining teacher effectiveness in South Asia. In: Dixon, Humble and Counihan (eds) Handbook of International Development and Education. MA: Edward Elgar Publishing, Inc, pp. 256–276.

Atherton and Kingdon (2010) The Relative Effectiveness and Costs of Contract and Regular Teachers in India. Institute of Education, University of London, pp. 256–276.

Azam and Kingdon (2015) Assessing Teacher Quality in India. Oklahoma State University and IZA, Mimeo.

Bau and Das (2017) The Misallocation of Pay and Productivity in The Public Sector: Evidence from the Labor Market for Teachers. Washington DC: The World Bank.

Bennell and Akyeampong (2007) Teacher Motivation in Sub-Saharan Africa and South Asia. Educational Papers. Researching the Issues: 71. London: Department for International Development.

Bold, Filmer, Martin, et al. (2017) What Do Teachers Know and Do? Does It Matter? Evidence from Primary Schools in Africa. Policy Research Working Paper; No. 7956. Washington, DC: World Bank.

Bruns, De Gregorio and Taut (2016) Measures of Effective Teaching in Developing Countries. Research on Improving Systems of Education (RISE) Working Paper RISE-WP-16/009. Oxford, UK: RISE Programme.

Burgess (2015) Human Capital and Education: The State of the Art in Economics of Education. Available at: http://www.coeure.eu/wp-content/uploads/Human-Capital-andeducation.pdf

Chudgar (2013) Teacher labor force and teacher education in India: An analysis of a recent policy change and its potential implications. In: Akiba M (ed.) Teacher Reforms Around the World: Implementations and Outcomes. Emerald Group Publishing Limited, pp. 55–76.

Cortez Ochoa, Thomas, Tikly, et al. (2018) Scan of International Approaches to Teacher Assessment. Bristol Working Papers in Education No. 06/2018. Bristol: University of Bristol.

Darling-Hammond (2012) Creating a Comprehensive System for Evaluating and Supporting Effective Teaching. Stanford, CA: Stanford Center for Opportunity Policy in Education.

De Talancé (2017) Better Teachers, Better Results? Evidence from Rural Pakistan. The Journal of Development Studies 53(10): 1697–1713. DOI: 10.1080/00220388.2016.1265944.

Dundar, Beteille, Riboud, et al. (2014) Student learning in South Asia: Challenges, opportunities, and policy priorities. Washington DC, USA: World Bank Publications.

Eide, Goldhaber and Brewer (2004) The teacher labour market and teacher quality. Oxford Review of Economic Policy 20(2): 230–244.

Glewwe and Kremer (2006) Schools, teachers, and education outcomes in developing countries. In: Hanushek and Welch (eds) Handbook of the Economics of Education. vol. 20(2), Massachusetts, USA. 1 June 2004. Oxford: Elsevier, pp. 945–1017.

Glewwe, Hanushek, Humpage, et al. (2011) School Resources and Educational Outcomes in Developing Countries: A Review of the Literature from 1990 to 2010. National Bureau of Economic Research Working Paper No. 17554, October 2011.

Goyal and Pandey (2013) Contract teachers in India. Education Economics 21(5): 466–479.

Grossman, Loeb, Cohen, et al. (2010) Measure for Measure: The Relationship between Measures of Instructional Practice in Middle School English Language Arts and Teachers’ Value Added Scores. National Bureau of Economic Research Working Paper 16015.

Hanushek (2011) The economic value of higher teacher quality. Economics of Education Review 30: 466–479.

Hanushek and Woessmann (2011) Overview of the symposium for performance pay for teachers. Economics of Education Review 30: 391–393.

Hanushek, Kain, O’Brien, et al. (2005) The Market for Teacher Quality. Working Paper 11154. Cambridge, MA: National Bureau of Economic Research.

Howes, Burchinal, Pianta, et al. (2008) Ready to learn? Children’s pre-academic achievement in pre-kindergarten programs. Early Childhood Research Quarterly 23(1): 27–50.

Ilm Ideas (2014) Access to finance for low cost private schools in Pakistan. UK Aid, 2014. Available at: https://educationinnovations.org/sites/default/files/Access%20to%20finance%20for%20low%20cost%20private%20schools%20May%202014%20FINAL.pdf

Kane and Staiger (2012) Gathering Feedback for Teaching: Combining High-Quality Observations with Student Surveys and Achievement Gains. Measures of Effective Teaching Project Research Paper. Seattle, WA: Bill and Melinda Gates Foundation.

Kingdon and Teal (2010) Teacher unions, teacher pay and student performance in India: A pupil fixed effects approach. Journal of Development Economics 91(2): 278–288.

Kingdon, Banerji and Chaudhary (2008) SchoolTELLS Survey of rural primary schools in Bihar and Uttar Pradesh, 2007–08. London: Institute of Education, University of London.

Kingdon, Little, Aslam, et al. (2014) A rigorous review of the political economy of education systems in developing countries. Final report. Education rigorous literature review. Department for International Development. London: EPPI-Centre, Institute of Education, University of London.

Lavy (2002) Evaluating the effect of teachers’ group performance incentives on pupil achievement. Journal of Political Economy 110(6): 1286–1317.

Metzler and Woessmann (2012) The impact of teacher subject knowledge on student achievement: Evidence from within-teacher within-student variation. Journal of Development Economics 99(2): 486–496.

Moon (2013) Teacher Education and the Challenge of Development: A Global Analysis. Abingdon, UK: Routledge.

Mulkeen (2013) Teacher Policy in Primary and Secondary Education in Development Cooperation. Discussion Paper. Bonn, Germany: Ministry for Economic Cooperation and Development.

Muralidharan and Sundararaman (2011) Teacher performance pay: Experimental evidence from India. Journal of Political Economy 119(1): 39–77.

Muralidharan and Sundararaman (2013) Contract Teachers: Experimental Evidence from India. Mimeo, UC San Diego. Available at: https://www.povertyactionlab.org/sites/default/files/publications/147_313%20Contract%20Teachers%20Oct2013.pdf

Nonoyama-Tarumi and Willms (2014) Family Background Versus School Resources and Teacher Quality: Findings from the 2011 Trends in International Mathematics and Science Study. Background paper for Education For All (EFA) Global Monitoring Report 2013/4. Paris: UNESCO. Available at: https://unesdoc.unesco.org/ark:/48223/pf0000225953

Pouezevara, Pflepsen, Nordstrum, et al. (2016) Measures of quality through classroom observation for the Sustainable Development Goals: Lessons from low and middle income countries. Background paper for the 2016 Global Education Monitoring Report. Education for people and planet: Creating sustainable futures for all. Available at: https://unesdoc.unesco.org/ark:/48223/pf0000245841

Rawal and Kingdon (2010) Akin to my teacher: Does caste, religious or gender distance between student and teacher matter? Some evidence from India. Department of Quantitative and Social Sciences (DoQSS) Working Paper No 10–18. London: Institute of Education.

Rawal, Aslam and Jamil (2013) Teacher Characteristics, Actions and Perceptions: What matters for student achievement in Pakistan? Working Paper 19. Oxford: Centre for the Study of African Economies.

Sankar and Linden (2014) How much and what kind of teaching is there in elementary education in India? Evidence from three States. World Bank South Asia Region Human Development Sector Working Paper Series, No. 67, Washington DC, USA.

Seidman, Kim, Raza, et al. (2018) Assessment of pedagogical practices and processes in low and middle income countries: Findings from secondary school classrooms in Uganda. Teaching and Teacher Education 71: 283–296.

Sharma and Jacobs (2016) Predicting in-service educators’ intentions to teach in inclusive classrooms in India and Australia. Teaching and Teacher Education 55: 13–23.

Singh and Sarkar (2012) Teaching quality counts: how student outcomes relate to quality of teaching in private and public schools in India. Young Lives Working Paper 91. Young Lives 2012. Available at: http://www.younglives.org.uk/content/teaching-quality-counts-how-student-outcomes-relate-quality-teaching-private-and-public

UNESCO (2014) Education for All Global Monitoring Report - Teaching and Learning: Achieving quality for all. Paris: UNESCO.

United Nations (2015) Transforming our world: the 2030 Agenda for Sustainable Development. General Assembly resolution A/RES/70/1, 25 September. Available at: http://www.un.org/en/development/desa/population/migration/generalassembly/docs/globalcompact/A_RES_70_1_E.pdf

Westbrook, Durrani, Brown, et al. (2013) Pedagogy, curriculum, teaching practices and teacher education in developing countries. Education Rigorous Literature Review. London, UK: EPPI-Centre, Social Science Research Unit, Institute of Education.

World Bank (2018) World Development Report: Learning to Realize Education’s Promise. Washington DC: World Bank.