Abstract
Over the decades, it is evident that exceptional learners have been excluded from participating in international assessments such as OECD’s PISA (Programme for International Student Assessment) due to their disabilities. Drawing on the interdisciplinary theories and perspectives of educational assessment, measurement, and early childhood special education, the paper discusses the potential benefits young children with special needs may gain from the International Early Learning and Child Well-being Study (IELS), as well as considering caveats and challenges accompanying the use of IELS for these young special education populations. In particular, it raises a range of questions about what and how to collect, validly interpret, and use the IELS data to enhance early learning and development of exceptional learners in participating countries. Finally, the paper discusses accommodations that promote inclusionary assessment practices and level the playing field for young children with special needs.
Introduction
Research in neuroscience, psychology and education on young children before age five has lent strong support to the notion that quality pre-primary education and positive early learning experiences have short-term and long-lasting benefits for an individual throughout the life cycle (e.g. McCain et al., 2007; Mustard, 2006, 2010; National Research Council and Institute of Medicine, 2000; Shonkoff, 2010; Sternberg, 1985; Thompson and Nelson, 2001). In particular, high-quality early childhood education leads to improved outcomes at later stages of life, including cognitive, social-emotional, behavioral, language and numeracy skills as well as health and school achievements (Berlinski et al., 2009; Mustard, 2009; Nores and Barnett, 2010; OECD, 2017a; Tinajero and Loizillon, 2012). Influenced by such findings, international agencies such as UNESCO (United Nations Educational, Scientific and Cultural Organization), UNICEF (United Nations Children’s Fund), and the World Bank have identified early childhood education as a priority. For example, the first goal of UNESCO’s extensive international initiative Education for All is “[e]xpanding and improving comprehensive early childhood care and education, especially for the most vulnerable and disadvantaged children” (UNESCO, 2011: 29). Similarly, the OECD (Organization for Economic Co-operation and Development) has paid increasing attention to early childhood education, most recently launching the IELS (International Early Learning and Child Well-being Study). This paper will review, discuss, and critique this new international assessment based on the current literature on existing international assessments, in particular from the perspectives of exceptional learners, sometimes referred to as children with special needs or disabilities, and early childhood special education for these children. 
In so doing, the paper will acknowledge the potential benefits that such assessments may bring for these children and the services they use, while at the same time flagging up significant caveats and challenges that must be confronted if the potential is to be fully realized.
The purposes of the IELS: Intended versus unintended purposes and uses
As validity evidence for a given assessment must support the interpretation of test scores for their intended uses (American Educational Research Association et al., 2014), it is critical to distinguish between intended and unintended purposes and uses. The OECD launched the IELS because “[i]t will provide countries with a common language and framework, encompassing a collection of robust empirical information and in-depth insights on children’s learning development at a critical age. With this information, countries will be able to share best-practices, working towards the ultimate goal of improving children’s early learning outcomes and overall well-being” (OECD, 2017c: 14).
Moreover, as the IELS aims to measure four domains of early development (emergent literacy, numeracy, self-regulation, and social-emotional skills) through direct and indirect assessments, a key question is how this comprehensive large-scale data of the IELS will be analyzed, reported, interpreted, and used to maximize young children’s learning. Although the results of international comparative assessments continue to attract public interest and attention, complex technical information related to an assessment is often not transparent or translated into a common language that can help the public and media exactly understand what data is collected and how it should be used and interpreted.
Often, the average national scores in a given international assessment are used to rank countries, and these rankings have been widely used to make comparisons between countries. The media presents these initial releases of mean scores and country rankings to the public and policy makers, often as league tables similar to those commonly seen in sports news; it is only later that the international database and further critical analyses are released. Carnoy and Rothstein (2013) caution that “[t]his puzzling strategy ensured that policy-makers and commentators would draw quick and perhaps misleading interpretations from the results” (p. 3).
Due to the complexity of the international datasets, it is time-consuming to crunch the data and conduct advanced analyses and reporting that disaggregate test results by a variety of student-, school-, and country-level variables pertaining to social and economic backgrounds and characteristics. However, the results of further analyses may reveal interpretations that differ from the messages sent out to the public in the first place (Berliner, 2011; Carnoy and Rothstein, 2013). Furthermore, the media may ignore the cautions made by data analysts about the interpretations of the scores; their oversimplifications of assessment results may not, therefore, provide an accurate picture of test results over the decades. For instance, it is problematic to make inferences about small differences in average national scores without considering the measurement concepts (e.g. standard deviations, score distributions, consequential or decision validity) and statistical methodologies that have been used to calibrate the estimates and standard deviations (e.g. item response theory [IRT]) (Torney-Purta and Amadeo, 2013); in other words, a country’s ranking may be lower than other countries even though the differences in test scores between them do not achieve statistical significance. Such uses of the international assessments may lead to negative consequences with regard to policy decisions or education reforms (e.g. making reform decisions solely based on country rankings).
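The caution about small differences in national averages can be illustrated with a minimal sketch. The country means and standard errors below are hypothetical and are not drawn from any IELS or PISA release. The function name and the simple z-test for two independent samples are assumptions made for illustration; operational analyses of international assessments use more elaborate procedures (e.g. plausible values and replicate weights).

```python
import math


def mean_difference_z(mean_a, se_a, mean_b, se_b):
    """Return (difference, z statistic) for two independent national means.

    Assumes the two samples are independent, so the standard error of the
    difference is the square root of the sum of the squared standard errors.
    """
    diff = mean_a - mean_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, diff / se_diff


# Hypothetical figures: Country A ranks above Country B by 5 score points.
diff, z = mean_difference_z(503.0, 2.9, 498.0, 3.1)

# Two-sided test at the conventional 5% level (critical value 1.96).
significant = abs(z) > 1.96

print(f"difference = {diff:.1f}, z = {z:.2f}, significant = {significant}")
```

In this hypothetical case, Country A would sit above Country B in a league table even though the 5-point gap is not statistically distinguishable from zero.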
A key question that needs to be asked of the IELS, the latest international assessment, is what role it might play in early childhood education, policy, and service delivery in relation to a history of tension between formative (assessment for and as learning) and summative assessment (assessment of learning) (Birenbaum et al., 2006; Brindley, 2001; Remesal, 2011; Tan, 2011; Teasdale and Leung, 2000; Volante, 2010). Summative assessments, especially through examinations that have been constructed and used to meet different purposes, go back a long way. In the past, such standardized assessments of achievement were mainly used for accountability and administrative purposes (Crundwell, 2005; Klenowski, 2011; Smyth, 2008), dating back to the standardized civil service examinations of ancient China, from 2200
There continues to be an ongoing and heated debate on formative and summative assessment in individual countries and worldwide. Educators and researchers have raised serious concerns about the use of standardized, summative assessments when administered to students in order to measure abilities or content knowledge in relation to curriculum expectations or standards. However, it has been argued that a summative assessment can be used to improve teaching practices and help professionals make important decisions about the next steps in instruction; “a summative assessment should fulfill its primary purpose of documenting what students know and can do but, if carefully crafted, should also successfully meet a secondary purpose of support for learning” (Bennett, 2011: 7). Moreover, while the theory of formative assessment is widely accepted, previous studies have found in-service and pre-service teachers may hold different assessment beliefs (Brown, 2004; Brown, Lake, and Matters, 2009; Lin and Lin, 2015a, 2015b; Volante and Beckett, 2011).
Most importantly, many factors may add to the complexity of using summative assessments in early childhood. Researchers in this field have identified challenges arising from standardized testing for young learners due to the unique developmental characteristics of young children such as behavioral fluctuations, swift developmental changes, and variations in child development (Bracken and Nagle, 2007; Nagle, 2007; Squires et al., 2015). Consequently, the theory and concepts of formative assessments and assessment for and as learning are much more appealing than standardized tests to many educators working with this young age group, because the intent or spirit underlying this approach emphasizes assessment practices that keep track of student progress and inform teachers’ teaching practices and adaptations to advance student learning on a continuous basis (Black and Wiliam, 1998a, 1998b, 2009; Earl and Katz, 2006; Smith and Gorard, 2005; Wiliam, 2011). The proponents of this theory have urged educators to shift their assessment practices from summative or standardized assessments to assessment for and as learning.
Take a provincial testing program in Canada as an example. Saskatchewan students have participated in international assessments in Canada since 1996, and were also subject to a large-scale provincial summative assessment regime (‘Assessment for Learning’), which was administered every two years in math, reading, and writing. This regime experienced increasing pushback from local teachers, teacher educators, policy makers, and other stakeholders, contributing to the government of Saskatchewan introducing a new assessment initiative in 2012 (‘Education Sector Strategic Plan’). Similar concerns, resistance, and challenges may well build, over time, toward the IELS and its summative approach to assessment.
Equity in special education: Inclusion versus exclusion
The educational philosophy underlying inclusive education has been accepted by societies worldwide: that it is imperative to provide equitable education for all students irrespective of their varied learning needs (Florian, 2008; Skiba et al., 2008). Providing quality and equitable learning opportunities to exceptional learners, or students with special needs, can benefit both education systems and societies themselves. The OECD’s Programme for International Student Assessment (PISA) includes a wider range of disabilities in special education than other international assessments (Smith and Douglas, 2014), and this paper therefore focuses on PISA in its consideration of the inclusion of exceptional learners in such assessments.
Given that the goal of PISA is to promote education equity, it was recently criticized by Schuelka (2013) because its “exclusionary discourse establishes that students with disabilities do not belong in a culture of achievement and educational evaluation, which has an impact on policies concerning educational equity and maintains the oppression of low expectations” (p. 216). The current PISA policy allows students to be excluded either at the school level or the within-school level. In particular, individual students may be excluded if one of the following criteria has been met (OECD, 2017b: 67):“
They attend schools that only instruct students with (a) intellectual or (b) functional disabilities, or those with (c) limited proficiency in the assessment language can be excluded (i.e. school-level exclusion). The overall exclusion rate that combines both within-school and school-level exclusions should not exceed 5% of the PISA students in each participating country. The school-level exclusion that occurs due to the first three reasons (1(a), (b), and (c)) should not affect more than 2% of PISA students. The within-school exclusion for the first three reasons (1(a), (b), and (c)) should be lower than 2.5% of the PISA population” (OECD, 2017b).
The results of PISA have attracted public interest and heated debates on the quality of education systems in most participating countries since it was first introduced in 2000 by the OECD. PISA has driven education policies and reforms in many developed countries such as Germany, Denmark, and the USA (Carnoy and Rothstein, 2013; Dolin and Krogh, 2010; Fuchs and Wößmann, 2007; Neumann et al., 2010; Yore et al., 2010). At the same time, this international assessment has been criticized for marginalizing or even excluding students with disabilities from participation (Schuelka, 2013; Smith and Douglas, 2014). For example, the overall exclusion rate of several countries exceeded 5% in PISA 2015, including Canada (7.49%; weighted number of excluded students [WEx] = 25,340), New Zealand (6.54%; WEx = 3,112), Sweden (5.71%; WEx = 4,324) and the UK (8.22%; WEx = 34,747) (OECD, 2017b). It was reported that Canada had relatively high exclusion rates for both PISA 2012 and 2015 due to the exclusion of students with special needs (OECD, 2017b). Furthermore, Smith and Douglas (2014) also indicate that Canada, among some other countries, had a high within-school exclusion rate and a low inclusion rate in PISA 2009 (5.47% and 0.66%, respectively).
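The comparison between reported exclusion rates and the 5% overall limit can be sketched in a few lines. The rates are those quoted above for PISA 2015 (OECD, 2017b); the variable and function names are hypothetical and serve only to illustrate the arithmetic.

```python
# Overall exclusion rates (%) for PISA 2015 as quoted in the text (OECD, 2017b).
OVERALL_EXCLUSION_2015 = {
    "Canada": 7.49,
    "New Zealand": 6.54,
    "Sweden": 5.71,
    "UK": 8.22,
}

# The overall exclusion rate should not exceed 5% of the target population.
OVERALL_LIMIT = 5.0


def excess_over_limit(rates, limit=OVERALL_LIMIT):
    """Return {country: percentage points above the limit} for countries over it."""
    return {
        country: round(rate - limit, 2)
        for country, rate in rates.items()
        if rate > limit
    }


excess = excess_over_limit(OVERALL_EXCLUSION_2015)

# List countries from the largest to the smallest excess.
for country, over in sorted(excess.items(), key=lambda item: -item[1]):
    print(f"{country}: {over:.2f} percentage points above the {OVERALL_LIMIT:.0f}% limit")
```

The same check could be repeated for the school-level (2%) and within-school (2.5%) limits if those rates were available separately for each country.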
Simply put, a high exclusion rate may affect how psychometricians analyze the data and how the test users interpret the data in a way that is valid and meaningful for special education student populations. These students have often been absent from the planning, development and programming for countrywide education policies and reforms. Consequently, these students may be doubly disadvantaged because they have been routinely ignored in the process of national and international program evaluation as well as in educational reforms.
This prior experience raises important issues for the IELS. How will young children with special needs, such as those with developmental delay, be included or excluded from the IELS? Eligible populations of the IELS are young children around five years old who are formally enrolled in early childhood centers and schools (OECD, 2017c). While acknowledging that prevalence rates of developmental delay in children may vary by country, ethnicity, and age, international census data suggests that a large number of young learners may experience significant developmental delays. For instance, according to the statistics of the National Health Interview Survey 2011 to 2014, a nationally representative household survey conducted by the National Center for Health Statistics of the Centers for Disease Control and Prevention in the USA, the estimated prevalence of autism spectrum disorders, intellectual disabilities, and other developmental delays ranges from 1.10% to 5.76% for children aged 3 to 17 years old (Zablotsky et al., 2015). In Taiwan, based on the National Health Insurance Research Database, the prevalence of developmental delay among approximately 2.3 million children younger than six years old rose from 0.16% to 3.25% between 1997 and 2008 (Kuo et al., 2015). Based on the Australian Bureau of Statistics, the prevalence of developmental delay is estimated to be between 1% and 3% (N = 3,000 to 9,000) (Silove et al., 2013). Another population-based cross-sectional survey, the English Spring 2008 School Census study, revealed that 4.8% of children in England aged 7 to 15 years old were identified as having intellectual or developmental delay (Emerson, 2010).
International comparative assessment of exceptional learners
Based on the World Report on Disability of 2011, an estimated 978 million individuals experience moderate or severe disabilities (15.3%), with 185 million people having severe disabilities (2.9%) among a world population of 6.4 billion. Among these disability populations, an estimated 93 million children under 14 years old have moderate and severe disabilities (5.1%) and 13 million children experience severe disabilities (0.7%) (WHO and World Bank, 2011). Survey data suggests that large numbers of children with disabilities have been excluded from education and lack opportunities to learn, especially children at pre-school or secondary levels as well as those in remote or rural areas. For example, in developing countries, an estimated 90% of children with disabilities do not attend any school (UNICEF, 2014a). To take a regional example, UNICEF has reported that an estimated 3.6 million children with disabilities are not officially identified and do not receive special education and services in Central and Eastern Europe and the Commonwealth of Independent States; for those who are officially identified as having disabilities, 1.5 million children are not provided with adequate special education and services and are more likely to be placed in segregated special education schools (UNICEF, 2014c).
The United Nations Convention on the Rights of Persons with Disabilities (CRPD) has substantial influence over a series of global and regional initiatives and actions in many countries, especially for developing and low-income countries. It also mandates that “States Parties undertake to collect appropriate information, including statistical and research data, to enable them to formulate and implement policies to give effect to the present Convention” (Article 31) (United Nations, 2006). Scientific evidence, such as that provided by such data, helps global and local stakeholders make well-informed decisions about varied policies, programs, and services offered to individuals with exceptionalities. Moreover, without large-scale assessments, it is likely that the voices of many children with disabilities will continue to be overlooked or hidden, leaving their stories untold.
Population-based data, if employed properly and interpreted validly and fairly, has great potential to inform local, regional and national policy makers and stakeholders about what and where additional support is most needed (UNESCO, 2014). It has also enabled international agencies to identify gaps in service delivery and equitable learning opportunities between and within different countries (Graham, 2014; UNESCO, 2014; UNICEF, 2014a, 2014b). UNICEF, for example, using such data has continued working with local stakeholders in a number of countries to increase access to quality education and promote inclusive education for students with disabilities.
In 2012, UNICEF collaborated with the Ministry of Education and Science in Georgia to promote inclusive physical education and sports for children with special needs in 65 schools (UNICEF, 2013). In the same year, UNICEF also trained 100 teachers and counselors in braille and sign language in Niger to ensure students with visual and hearing impairments gain equal access to quality education (UNICEF, 2013). In a 2014 report on gap analysis of special education, UNICEF engaged in extensive collaborative efforts with local governments and regional offices to conduct a needs assessment in Bhutan and the Maldives to identify successful practices, understand the challenges, and make recommendations for further improvements in policy, teacher education, and early diagnosis and early childhood development (UNICEF, 2014b). In addition, more than 1300 Egyptian children with disabilities were enrolled in 120 public schools equipped with inclusive learning resource rooms under the support of UNICEF (UNICEF, 2017).
There is great potential to use the IELS to trigger new global initiatives and actions to support and advocate for young children with disabilities and their families, enabling it to fulfill its goal of “improving children’s early learning and overall well-being” (OECD, 2017c: 14). But to realize this potential, rather than focus on country rankings (which, as previously discussed, require extreme caution), this international assessment should focus on what and where early childhood special education programs and services are most needed, as well as help identify the gaps in policies and practices at national and international levels. To achieve these ultimate goals, the demographics, the use of accommodations, and performance of students with special needs must be recorded to allow rigorous analyses.
International large-scale assessments (ILSAs), such as the IELS, face major challenges in addressing such goals, not least how the needs of special education populations can be accurately captured and described given that different approaches and diagnostic tools have been employed to measure disabilities across countries and geographic areas and often systematically under-report the prevalence of disabilities. It is evident that reported prevalence rates of disability may vary depending on several factors, such as research methodologies, research objectives, and definitions and instruments that are used to measure and define disabilities within a region or country (WHO and World Bank, 2011). For instance, 35.9% of respondents in Swaziland reported having disabilities in a face-to-face household survey, the World Health Survey, which was used to collect disability and health data worldwide during 2002 to 2004 (WHO and World Bank, 2011); in contrast, only 2.2% of the Swaziland population was reported to be disabled in the 1991 census data (WHO and World Bank, 2011).
As this example makes clear, a common language and frame of reference is essential when countries assess and record the status of disabilities of young learners when participating in ILSAs. The International Classification of Functioning, Disability and Health, known as ICF, provides one such conceptual framework, characterizing a disability by (a) functionality and disabilities (body functions and body structures; activities and participation), and (b) contextual factors (environmental and personal factors) (WHO, 2002, 2013). Most importantly, ICF also offers a version for children and youth (ICF-CY) adopting the same conceptual framework (WHO, 2007), but developing classifications that respond to developmental changes and the roles or activities undertaken by children and adolescents as opposed to those for adults. For example, the code “handling stress and other psychological demands” (d240) in the ICF’s Chapter 2 in the component of Activity and Participation is defined in the ICF as “[c]arrying out simple or complex and coordinated actions to manage and control the psychological demands required to carry out tasks demanding significant responsibilities and involving stress, distraction or crises, such as driving a vehicle during heavy traffic or taking care of many children.” The ICF-CY definition of the same code adds examples drawn from children’s lives: “[c]arrying out simple or complex and coordinated actions to manage and control the psychological demands required to carry out tasks demanding significant responsibilities and involving stress, distraction, or crises, such as taking exams, driving a vehicle during heavy traffic, putting on clothes when hurried by parents, finishing a task within a time-limit or taking care of a large group of children.”
The comprehensive guideline of Developmentally Appropriate Practice (DAP) has been introduced to early childhood education over the past three decades in the USA (Bredekamp, 1987; National Association for the Education of Young Children, 2009). This widely adopted framework encourages educators and families to revisit their beliefs about a child’s development while taking individual differences into consideration. Caregivers and professionals need to situate a young child’s development and learning within social-cultural contexts, in addition to determining if s/he meets what is expected based on the child’s chronological age. Simply put, early childhood educators should always keep in mind what learning activities are developmentally appropriate for a child: age appropriate, individually appropriate, and socially and culturally appropriate. For special education student populations, a young child may experience significant developmental delay if s/he does not achieve milestones in one or more areas of development (physical, cognitive, communication, social-emotional, and adaptive development) indicated in his/her individualized family service plan (IFSP) or individual education plan (IEP) (IDEA, 2004).
Accommodation policy for young children with special needs
Appropriate test accommodations (e.g. the provision of extra time, the use of supportive technology and computers, reading instructions aloud) have proven to be important for children with special needs across different developmental stages to demonstrate what they have learned and can do in the inclusive classroom. In the past, even though some students with special needs have participated in PISA, assessment policies regarding the use of accommodations have not been put in place, despite the OECD being aware that the numbers of students identified as needing accommodations are increasing exponentially in different countries (OECD, 2013). This has resulted in higher exclusion rates of students with special needs. So even though accommodations may not guarantee an increase in a child’s assessment performance, they would help increase access to test content, and consequently improve participation rates of exceptional learners in the assessment.
Two well-known hypotheses have been developed and used to test the appropriateness and effectiveness of a specific type or multiple combinations of accommodations (Bolt and Ysseldyke, 2006; Fuchs and Fuchs, 1999; Phillips, 1994; Pitoniak and Royer, 2001; Sireci et al., 2005). The interaction hypothesis explains the valid use of accommodations by assuming that only individuals identified as needing accommodations would benefit from them, and that those who do not need accommodations should not make significant gains when tested under accommodated rather than standardized conditions. The differential boost hypothesis suggests that both groups would benefit from the accommodations, although children needing accommodations should make significantly larger gains between the two testing conditions than those who do not need them. It is worth noting that different accommodations may have differential effects on the academic performance of students with varied special needs. For example, a recent study examined the effects of multiple combinations of accommodations on the performance of secondary students with learning disabilities, emotional or behavioral disorders, or multiple exceptionalities on a provincial literacy test in Ontario, Canada, including extended time, setting, preferential seating, supervised periodic breaks, the use of computer or word processor for responses, prompts for students with severe attention problems, verbatim reading of writing prompts and tasks, scribe, or assistive technology (Lin and Lin, 2016). The result was clear: the probability of completing the mandatory literacy assessment varied among these students with special needs depending on whether or not they received specific combinations of accommodations.
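The difference between the two hypotheses can be made concrete with a minimal sketch. It assumes that, for each group, a mean gain has been computed as the score under accommodated conditions minus the score under standardized conditions. The function name, the gain values, and the fixed tolerance (a crude stand-in for a proper significance test) are all hypothetical and are not taken from the studies cited above.

```python
def classify_accommodation_effect(gain_with_need, gain_without_need, tolerance=0.0):
    """Label a pattern of mean gains (accommodated minus standardized scores).

    gain_with_need: mean gain for children identified as needing accommodations.
    gain_without_need: mean gain for children not needing accommodations.
    tolerance: smallest gain treated as meaningful; a simplification that
        stands in for a significance test in a real analysis.
    """
    need_benefits = gain_with_need > tolerance
    others_benefit = gain_without_need > tolerance

    if need_benefits and not others_benefit:
        # Only the group needing accommodations gains.
        return "interaction"
    if need_benefits and others_benefit and gain_with_need > gain_without_need:
        # Both groups gain, but the group needing accommodations gains more.
        return "differential boost"
    return "neither"


# Hypothetical mean gains in score points, with gains of 2 or less treated as negligible.
print(classify_accommodation_effect(8.0, 0.5, tolerance=2.0))
print(classify_accommodation_effect(8.0, 3.0, tolerance=2.0))
print(classify_accommodation_effect(3.0, 3.0, tolerance=2.0))
```

A pattern labeled "neither", such as equal gains for both groups, would suggest that the change benefits everyone alike and may therefore be altering what the test measures.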
It should be emphasized that accommodations, if appropriate, should maintain the integrity of a test even though certain changes have been made to test administration, format, or content. In other words, these changes should not pose a threat to test validity; a test should measure what it is supposed to measure in both accommodated and standardized testing conditions (American Educational Research Association et al., 2014; National Research Council, 2004; Sireci et al., 2005). While different terms may be used interchangeably by educators, changes that do alter the underlying construct(s) being measured are sometimes referred to as modifications. For instance, giving extra time to a child to complete a timed task, where speed is part of what is being measured, is considered to be a modification rather than an accommodation.
Although the vast majority of standardized developmental assessments do not allow for any changes to standardized testing, the BDI-II (Newborg, 2005), a commonly used test of child development, has provisions for test adaptations for young children with visual or hearing impairments and/or physical disabilities (Johnson and Marlow, 2006). It assesses the varied skills acquired by a child from birth to eight years old, including adaptive, communication, cognitive, personal-social and motor skills. Such provisions were developed to ensure that the use of this norm-referenced assessment is not compromised by the changes made to the test.
Conclusions
This paper has discussed the potential benefits, caveats, and challenges of ILSAs such as PISA and the IELS, by reviewing interdisciplinary theories, including assessment theories (e.g. assessment for, as, of learning), and educators’ beliefs and attitudes towards different purposes of assessment, in particular the tension between assessment for and of learning (formative vs. summative assessment). It has also reviewed PISA’s existing participation policy for test taking by students with special needs. It is evident that some participating countries have higher exclusion rates of students with disabilities than their international counterparts; many exceptional learners have been excluded from assessments, especially large-scale ones, and consequently this important group of children has been historically under-represented in educational policy making. However, even if included, the abilities of many children with special needs are often masked by disabilities and they cannot participate meaningfully in assessments without the use of accommodations. Current literature supports the view that appropriate accommodation policies and practices would increase the participation rate of students with special needs (Christensen et al., 2011; Christensen et al., 2008).
Given that the mandates of international organizations such as OECD are to promote equality and equity for all people, it is recommended that future development of the IELS addresses the following issues:
Create assessment and accommodation policies that are equitable for young special education student populations, and review these policies periodically.
Ensure that assessment measures align with developmentally appropriate practices in early childhood education, so that the assessment itself is age, individually, culturally and socially appropriate, and consider what advantages the IELS offers compared to existing developmental assessments such as Ages and Stages Questionnaires (ASQ; Squires and Bricker, 2009; Squires et al., 2015), Battelle Developmental Inventory II (BDI-II; Newborg, 2005), and Behavior Assessment System for Children Third Edition (BASC-3; Reynolds and Kamphaus, 2015).
Collect student-level data (e.g. disability and accommodation status, specific types of disabilities, and the accommodations a student receives for the assessment) to allow rigorous analyses and reports on performances of students with special needs within and across countries. Without such specific student indicators in the datasets, it is not possible to perform the analyses that can describe these young exceptional learners’ development and inform important national and international stakeholders about their special needs. Most importantly, the IELS should use these data to help identify the gaps in special education policies and practices at local, national, and international levels.
Develop accommodation policies that consider (a) young learners’ special needs, (b) the purposes and characteristics of the IELS, (c) test validity, (d) the permitted types of accommodations, and (e) exceptional learners’ developmental milestones and learning outcomes (Lin et al., 2018). Note that the policies should be ones that maintain the IELS test validity. To address the concerns regarding the use of test accommodations for young learners with special needs, the IELS should consider the following questions about its test development, implementation, and data analysis: What activities and tasks does each participating child with special needs have to perform? What accommodation(s) can be provided to a child without compromising the integrity of the assessment? Will a global accommodation policy be developed and implemented given that there is a great variation in assessment capacities across countries? How will the complex accommodation data be analyzed, interpreted, and reported?
This article has discussed the core hypotheses and principles of test accommodations in relation to educational measurement and special education theories (Lin et al., 2018). Each theory is supported by a strong body of theoretical and empirical work and lends itself well to the design and use of the IELS; in future developments of the IELS, the OECD should pay close attention to these theories and their implications to make sure that the IELS enhances equity in early childhood special education.
Rather than focusing on country rankings, the IELS needs to pay close attention to how these data can be used to determine what and where early childhood special education programs and services are most needed.
These are all important caveats and challenges. But if successfully addressed, and if designed, used, and interpreted validly, then ILSAs, such as the IELS, may have the power to improve the quality of educational policies, instruction, and learning progress of exceptional learners educated in inclusive settings (Bennett, 2011; Popham, 2011; Stiggins, 2006; Wiliam, 2011).
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
