Abstract
Purpose:
To make policy makers, researchers, education leaders, and assessment developers aware that deciding what matters in education assessment is a wicked problem that cannot be easily solved by following traditional approaches.
Design/Approach/Methods:
Starting from the question of what matters in education assessment, this article presents that question as a wicked problem because there is no consensus, there are no right or wrong answers, and certain solutions may have side effects on students and society. An ecological approach is therefore proposed, in which different education outcomes, or intended qualities of learners, are presented as standing in complex relationships with one another.
Findings:
Deciding what matters in education assessment is a wicked problem. It is not a tame or technical problem and cannot be resolved by conventional approaches. What is pivotal now is to decide what matters in education, then what should be measured, and ultimately how to measure it. The ecological and collaborative approach discussed in this article could expedite that process.
Originality/Value:
This article advocates a paradigm change in understanding and resolving one of the most urgent problems in education. It provides an ecological explanation of the relationships that exist among the different education outcomes and students’ qualities. By dissecting the problem step by step, this article offers a distinct angle for understanding the wicked problem.
What is measured in education represents a society’s view of what is important for schools to teach and what matters for children to learn. What gets measured reflects the outcomes a society expects of its education system and what its future citizens should be equipped to do. Paradoxically, once an educational outcome is measured, it becomes what matters, even if it turns out to be an unimportant, or irrelevant, outcome. What we measure is what schools and students pursue. As a result, what is measured has a significant impact on the curriculum, the educational experiences of children, and the qualities of future citizens. For example, the
We are on the brink of a significant shift in education measurement that can have a long-lasting impact. Recent years saw increasing discontent with traditional measures of content or cognitive skills across a narrow spectrum of subjects (e.g., math and literacy) because these measures seemingly reflect only a minimal part of what is needed for success in the new world (Brunello & Schlotter, 2010; Duckworth & Yeager, 2015; Levin, 2012; Wagner, 2012; Zhao, 2016a). Accompanying this discontent has been a growing awareness of the role that nonacademic (in the traditional sense) capabilities have in student success, resulting in concomitant proposals outlining what ought to be measured and taught in schools (Brunello & Schlotter, 2010; Duckworth & Yeager, 2015; European Communities, 2006; Levin, 2012; Partnership for 21st Century Skills, 2007; Wagner, 2008, 2012; Zhao, 2016a). It also appears that policy makers and education leaders are increasingly willing to consider expanding the definition of educational outcomes and, consequently, what should be measured (Perkins-Gough, 2013). Thus, there are signs that the education measurement enterprise will undergo major systemic change and, by extension, so will students’ education experiences.
The list of what ought to be measured is long and wide-ranging. In addition to traditionally measured subjects, such as math, language arts, social studies, and science, there have been calls to add assessments of other subjects, such as arts, music, foreign languages, and financial, digital, or global literacy (Zhao, 2016a). Furthermore, it has been suggested that we also measure other capabilities, variously labeled as noncognitive skills, dispositions, human qualities, or 21st-century skills (Duckworth & Yeager, 2015). These capabilities include a set of sometimes overlapping and potentially competing concepts, including innovation skills (Wagner, 2008, 2012), creativity (Beghetto, 2017; Kaufman & Beghetto, 2009), entrepreneurial skills (Aspen Youth Entrepreneurship Strategy Group, 2008; Zhao, 2012), happiness (Seligman, 2006, 2011), physical well-being, self-determination (Ryan & Deci, 2000), social–emotional well-being (Wentzel, 1991), mind-set (Dweck, 2006), grit (Duckworth, Peterson, Matthews, & Kelly, 2007), resilience, communication skills, and collaboration skills (Partnership for 21st Century Skills, 2007; Trilling & Fadel, 2009).
Given the potential gravity of the consequences of expanding educational outcomes, we must carefully define the
What matters as a wicked problem
The term
The first step in resolving or managing a wicked problem is to recognize it as such. Successfully tackling wicked problems requires new ways of thinking about problems and solutions. In this article, we discuss some of the wickedness of the problem of defining what matters in education assessment. Our purpose is to bring awareness to policy makers, researchers, education leaders, and assessment developers, so that we recognize this as a wicked problem that cannot be easily resolved following traditional approaches, in large part because measurement is often treated as a tame technical problem that can be solved following traditional linear, analytic approaches. We also suggest a set of questions that can guide collaborative efforts to support better solutions to this wicked problem.
No consensus, no right or wrong answers
One of the defining characteristics of wicked problems is that their solutions are not completely or verifiably right or wrong, but rather better or worse depending on the perspective of the particular stakeholder (Conklin, 2006; Rittel & Webber, 1973). Different stakeholders will have differing views of and solutions to wicked problems, and often there is no clear consensus.
What matters in education is a classic wicked problem. There seems to be consensus that we need to measure what matters, but there is no consensus on what exactly does matter. Disputes over educational outcomes are nothing new in education. The academic traditionalists have battled the progressive educationalists over academic skills versus child development for over a century (Hirsch, 2010; Norris, 2004; Ravitch, 2001; Wraga, 2001). Different stakeholders have made various proposals: some want to hold onto the foundational skills in core subjects (e.g., math, language arts), some want to add noncognitive skills but still keep the traditional, some want to replace the traditional with the new, and still others want to expand the list of what has been traditionally measured. Each group of stakeholders has its own evidence and rationale, and no one proposal can be shown scientifically to be entirely wrong or right.
Even among newly proposed capabilities and qualities (e.g., noncognitive), there exists a wide range of competing ideas, with each proposed list vying to be most important. There is the definition of College and Career Readiness (CCR), a generally accepted goal for today’s schools in the United States, which includes five categories of outcomes: (a) academic knowledge; (b) critical thinking/problem-solving; (c) social and emotional learning, collaboration, and/or communication; (d) grit/resilience/perseverance; and (e) citizenship and/or community involvement (Mishkind, 2014). There are the popular 4Cs (creativity, communication, critical thinking, and collaboration) that are deemed the most important skills for the 21st century by the Partnership for 21st Century Skills (2007), as well as the 6Cs proposed by Canadian education thinker Michael Fullan: character, citizenship, communication, critical thinking and problem-solving, collaboration and teamwork, and creativity and imagination (Fullan, 2013). University of Pennsylvania psychologist Angela Duckworth champions the importance of grit (e.g., Duckworth et al., 2007), while Yale’s Zorana Ivcevic and Marc Brackett suggest personality traits matter more (Ivcevic & Brackett, 2014). Mind-set matters a lot, according to Stanford psychologist Carol Dweck (Dweck, 2008). Yong Zhao of the University of Kansas suggests that entrepreneurial mindedness is necessary (Zhao, 2012). And there are more: dispositions (Costa & Kallick, 2013), emotional intelligence (Goleman, 1995), self-determination (Ryan & Deci, 2000; Wehmeyer, Shogren, Little, & Lopez, 2017), and the seven survival skills for the future identified by Harvard’s Tony Wagner (Wagner, 2008, 2012).
No single proposal is the
Wicked problems: Negative side effects
Solutions to wicked problems always carry unforeseen or unforeseeable consequences because so many factors work at the same time and they are highly interactive. When a solution is applied, it may solve part of the problem, but it may also create new problems. Even worse, it could leave a host of negative side effects on students and society (Zhao, 2017, 2018b).
For example, the almost exclusive focus on testing math and literacy from
Wicked problems: No immediate or ultimate test of a solution
Another characteristic of wicked problems is that there is no immediate or ultimate test of the impact of the solution. In the case of deciding what to measure in education, we cannot immediately know if what is measured really matters, nor can we ultimately verify if what is measured actually matters in the future. Although we can use past evidence to infer what will matter in the future, there is no assurance that what matters now or mattered in the past will matter equally in the future. Moreover, the concept of
The predictive power of the commonly used measures in education has been very limited. IQ scores, grade point average, and standardized tests such as the SAT and ACT have not been able to predict life success or even a very narrow definition of success: college academic success (Zhao, 2016c). Early reading has been found to be associated with early education success (The Annie E. Casey Foundation, 2013), but it has also been found to be “associated with worse long-term outcomes including less overall educational attainment, worse teenage and adult adjustment, and increased alcohol use” (Kern & Friedman, 2009, p. 428). This is why there is increasing interest in finding more constructs to measure, in order to better capture the range of factors that influence success.
In essence, solutions to wicked problems are a bet, which cannot be proven right or wrong. Thus, when faced with a wicked problem, it is all the more important to ensure that the solution is as good as possible to begin with. Coming up with as-good-as-possible solutions for wicked problems requires unconventional strategies.
A collaborative approach to tackling wicked problems
There are different approaches for tackling wicked problems (Brown, Harris, & Russell, 2010; Roberts, 2000). According to Roberts (2000), there are three primary approaches to solving wicked problems. The authoritative approach is a top-down process through which decisions can be made quickly by a designated group of experts. By definition, the authoritative approach is misaligned with wicked problems, not least because recognizable and widely accepted experts are hard to identify. Thus, while this approach has the potential to bring an immediate, ideally workable solution, the solution tends to cause disagreement and has difficulty gaining wide acceptance within the field of practice.
On the other hand, the competitive approach for tackling a wicked problem has various groups competing for the winning solution. As compared to the authoritative approach, the competitive approach has the potential to support more idea generation, yet it also has the potential to reinforce warring factions around the solution. This can lead to extraneous problems, fuel greater unrest, and consume needed resources. In the end, the competitive approach may or may not produce workable and accepted solutions.
Finally, the collaborative approach to a wicked problem focuses on bringing various stakeholder teams and ideas together to work toward agreed-upon solutions. In the short term, the collaborative approach may seem slower and less ordered than the authoritative and competitive approaches. In the end, however, the collaborative approach has the potential to provide a host of workable solutions that are more widely accepted across the field. The key to supporting a collaborative approach is to put in place purposeful support structures and agreed-upon processes that encourage open problem-solving.
Too often, deciding what matters in education has followed the authoritative approach. The decision about
A collaborative approach is considered better in the field. This approach requires all who are impacted by the problem to actively participate in formulating solutions. Stakeholders of education outcomes include students, parents, teachers, school leaders, employers, the public, and policy makers. In deciding what matters in education, all these stakeholders should be given the opportunity to contribute to developing workable solutions. Notably, students, parents, and teachers—the three groups of stakeholders with the most at stake in the solution—have traditionally played only marginal roles in what have been top-down approaches to school reform (Sarason, 1990) and must be supported to be involved as meaningful, active contributors to any solution to the
Focal questions
In the collaborative process, stakeholders can have different opinions, but it would be more productive to focus the exchange of opinions on a set of meaningful questions. The tradition of deciding educational outcomes follows a winner-takes-all approach. That is, the prevailing opinion is applied to all students. In other words, whichever side wins the argument gets to decide and consequently impose the decided set as expected outcomes for all children. The winning set of outcomes becomes codified as curriculum standards, accountability measures for schools and teachers, and bases for high-stakes decisions about the lives of students, for example, college admissions, grade retention, or designation for special or gifted education status.
As a result, the debates about what to count as outcomes that matter in education have been fierce, with different sides working hard to convince the others, policy makers, and the public that their proposed set has more merit than those of others. However, given the wickedness of the problem as discussed before, the disputes cannot be settled this way. Instead, we should challenge the winner-takes-all mind-set. Rather than asking which proposed set is superior, we can frame the debates in a more productive way with different questions: Is it necessary for all individuals to be equipped with all the proposed qualities to be successful, however defined? Is it possible for all children to acquire all the qualities? Are all qualities of equal importance all the time? Can all proposed qualities be measured? If not, are the unmeasured qualities less important?
Do all children need the same set of qualities?
Traditionally, once what matters is decided, it is applied to all children. There is often the assumption that all children need the same set of capabilities to succeed in life. But, upon close examination, this assumption may be mistaken. It is without question that everyone needs a set of basic
The foundation of the modern economy, the division of labor, demands specialization. As a result of the Fourth Industrial Revolution (Schwab, 2015; World Economic Forum, 2016), with technology performing even more complex but repetitive, identical tasks, the need for more specialization has grown even more acute. What makes an individual successful is often a unique combination of qualities that may not be replicated in another person or smart machine. The combination includes cognitive skills, noncognitive skills, and domain-specific knowledge and abilities. There is no one profile of qualities that is universally applicable to all tasks, jobs, and professions (Rose, 2016; Zhao, 2012, 2016b, 2018a), nor is it technically possible for all students to develop the same qualities (Barrett, 2017). For example, what makes a musician successful is certainly quite different from the qualities that help an engineer perform well. Even within a profession, there are different tasks; for example, routine engineering jobs require different qualities from creative engineering jobs. Research has found that creative individuals in fine and performing arts have different profiles of personalities and skills from creative individuals in technical/engineering fields (Kerr & McKay, 2013).
Basic primary and secondary education have been tasked with equipping children with both sets of qualities—floor and
Hence, discussions about what matters in education need to focus on what the minimal set of qualities is that everyone should acquire and how much effort should be devoted to ensuring that all children acquire the same set of basic qualities. Once these questions are answered, discussions can focus on other qualities. For example, when should children be given the opportunities to specialize to develop their unique set of qualities? What kind of exceptions should be given to students in special circumstances when demanding the basic qualities may hinder their long-term development?
Is it possible to pursue all proposed outcomes?
Another important question to consider when attempting to resolve the wicked problem of what matters in education is whether it is desirable to pursue all proposed outcomes, even if it were possible for all children to be equipped with all the qualities. The answer to this question lies with an understanding of the relationships among the different qualities.
Ecology provides a useful framework for understanding the relationships among the different outcomes. Ecologists have identified five important types of interactions between two organisms: (a) competition: both organisms have some kind of negative effect on each other; (b) predation: positive for one (the predator) and negative for the other (the prey); (c) parasitism: negative for one (the host) and positive for the other (the parasite); (d) commensalism: positive for one (the commensal) and no effect on the other; and (e) mutualism: positive for both (Odum, 1997). We can imagine an individual student as an ecosystem in which the different qualities interact with each other as organisms. The different qualities (what matters) can be imagined to have similar types of relationships as living organisms in an ecosystem. There are perhaps four types of relationships that exist among the different education outcomes or intended qualities.
Competition
Two qualities compete with each other for resources. In an ecosystem, lizards and frogs, for example, are in competition because they both eat small insects. This is a win–lose relationship. This relationship exists among educational outcomes all the time. For example, different subjects are in constant competition for time and other instructional resources, as well as students’ attention. A student cannot devote time to music and math simultaneously because time is finite. For the same reason, a school cannot possibly increase time for math or reading without taking time away from other activities. Increasing time for one subject necessarily reduces time for other subjects. This win–lose relationship was evidenced in the effect of
Predation
Desired qualities can also have a predatory relationship. In ecosystems, a predatory relationship is one in which the growth of the predator relies on the disappearance of the prey. This is also a win–lose relationship. For instance, birds gain energy by eating earthworms, but earthworms gain no benefit. Predatory relationships exist among educational outcomes because the growth of some outcomes depends on the decrease of others. For example, increased levels of obedience and compliance rely on reduction in the willingness to question the status quo and authority or to express one’s own thoughts and opinions. An individual cannot be compliant and creative at the same time. Academic performance typically reflects one’s willingness to follow instructions and provide predetermined answers, while creativity reflects more on one’s confidence and courage to question the status quo and express one’s own views. There is evidence (e.g., Pretz & Kaufman, 2017) suggesting that high levels of academic performance in the form of high school class rank can come at the cost of creative confidence. Thus, educational strategies that focus on increasing academic performance can prey on creative expression.
Commensalism
Some qualities can benefit from other outcomes without benefiting or harming these other outcomes, which is a commensal relationship. The transparent shrimp, for example, lives in a reef that provides benefits to the shrimp, such as camouflage, but the reef does not receive any benefit or damage from this relationship. This is a win–neutral relationship. In education, such relationships exist as well because the improvement in some outcomes is dependent on the increase of others, but the relationship is unidirectional. For example, evidence suggests that grit and growth mind-set can improve academic outcomes (Claro, Paunesku, & Dweck, 2016; Duckworth et al., 2007), but there is little evidence that academic outcomes also increase or decrease grit or growth mind-set.
Mutualism
It is also possible for some qualities to have a relationship in which they benefit from interacting with one another. An example of this type of relationship in ecology is that of bees and flowers, in which bees get nectar from flowers and in return spread pollen so the plant can reproduce. This is a win–win relationship. In education, mutually beneficial relationships exist among outcomes as well. For instance, it is possible that self-determination and emotional well-being are mutually enhancing. When one is able to experience more autonomy, he or she has an increased sense of psychological well-being (Ryan & Deci, 2017).
The types of relationships between qualities are presented in Table 1. Of the four types of possible relationships, one is mutually enhancing, which means improving one quality can improve another. One is commensal, meaning that one quality benefits from another without helping or hurting it. Two types of relationships indicate that an intervention or change intended to improve one quality can hurt the development of another, resulting in a negative relationship.
Types of relationships between two qualities.
The negative relationship is what we need to pay attention to, and there is evidence that confirms the negative relationship between some outcomes. For example, international assessment programs such as PISA and the Trends in International Mathematics and Science Study (TIMSS) show a negative correlation between test scores and students’ confidence in, enjoyment of, and value placed on learning (Loveless, 2006; Zhao, 2017). Talent and grit may have a negative relationship as well (Perkins-Gough, 2013). Studies have also shown that short-term instructional outcomes may come at the cost of long-term outcomes. For example, explicit instruction may result in immediate success in learning target content, but it can cause a loss of curiosity, creativity, and deep understanding (Bonawitz et al., 2011; Buchsbaum, Gopnik, Griffiths, & Shafto, 2011; Kapur, 2014, 2016; Kapur & Bielaczyc, 2012; Peterson, 1979).
The existence of negative relationships between qualities suggests that an individual cannot possibly pursue all qualities equally successfully. It is thus unreasonable to hold schools and teachers accountable for ensuring that all students be equipped with all the proposed qualities. It is necessary, then, for stakeholders to consider which qualities are important. Do we want creative and curious children or high scores on standardized tests? Do we want children to be good at math and literacy at the cost of music and arts?
Moreover, stakeholders may also reimagine an education system that would support individuals in pursuing their own interests and strengths as a way to develop their unique profiles of qualities. In other words, if it is unnecessary and impossible for individuals to acquire the same set of qualities, schools can create an environment that supports each individual to develop their strengths and passions (Zhao, 2018a). As a result, measuring what matters becomes measuring what matters for individuals instead of the average of groups of individuals (Basham, Hall, Carter, & Stahl, 2016).
Are outcomes outcomes?
There is much confusion about outcomes. In the long list of proposed outcomes, some are treated as input variables sometimes; at other times, they are treated as output variables. For example, growth mind-set is often treated as an input variable that can enhance academic achievement, as are grit, confidence, and mental health. But should these be educational outcomes in their own right? At the same time, academic outcomes, such as grades, test scores, and education attainment, have been treated as output variables, but they can negatively or positively affect mind-set, confidence, grit, and mental health. A more productive approach is to treat all outcomes as both input and output variables at the same time, that is, to treat causality as bidirectional.
This approach is especially important considering that some of the outcomes are more short term than others. For example, academic achievement is often a measure of short-term effects of schooling. Although there are measures of the long-term cumulative effects of schooling in some academic domains, such as the ACT, PISA, and TIMSS, most of the time academic achievement is a measure of the degree to which students have mastered the intended knowledge and skills within a relatively short period of time. Grades, one of the most commonly used measures of academic achievement, are typically given at the end of a course. Moreover, grades are based on even smaller units of measurement, such as weekly quizzes, end-of-unit exams, daily homework completion, and midterm and end-of-course exams. Standardized tests are often given annually in a limited number of subjects.
In research, longitudinal studies of academic achievement are not common, and short-term academic achievement has had more influence on practice and policy. The effectiveness of interventions is more often than not judged by their effect on short-term academic achievement. Schools and teachers are held accountable for improving academic achievement in the short term, as exemplified in the annual academic measures discussed in
Efforts to boost short-term academic outcomes can negatively affect important long-term outcomes. There is emerging evidence that short-term positive academic outcomes do not necessarily translate into long-term life quality outcomes. For example, the longitudinal study by Howard Friedman found that early literacy was negatively associated with important indicators of life quality, such as social–emotional well-being and adjustment, in the long term (Kern & Friedman, 2009).
There is also evidence that instructional interventions that resulted in more effective information acquisition or imitation can cause a suppression of curiosity and creativity (Bonawitz et al., 2011; Buchsbaum et al., 2011). Researchers have also found that extracurricular activities tended to be a stronger predictor of creative expression in college applicants than traditional admissions factors, such as SAT scores and high school rank (Cotter, Pretz, & Kaufman, 2016).
A focus on short-term academic outcomes, then, can negatively affect long-term academic outcomes. For example, research has found that teaching decoding skills does not improve reading comprehension; however, decoding skills can be viewed by some as a necessary part of reading achievement. Thus, time and effort devoted to learning decoding skills are time and effort taken away from actually learning to read (Zhao, 2018b). Some strategies that teach children to memorize facts can show an immediate positive effect, but they may damage children’s interest in the subject or constrain them from developing deeper conceptual understanding of the subject (Kapur, 2014, 2016).
It is thus important for stakeholders to consider the short-term outcomes in light of possible long-term outcomes. Is it worth it, for example, to make sure students memorize the multiplication table at the cost of their losing interest in math? Is it worth it if they pass the tests in a course but develop a negative attitude toward the subject? Furthermore, is any knowledge acquisition as important as maintaining a creative, inquisitive mind-set?
Conclusions
In summary, we believe that deciding what matters is a wicked problem and should be treated as such. Wicked problems are drastically different from tame or technical problems and thus require unconventional approaches. Historically, education has used both authoritative and competitive approaches to solve problems. Unfortunately, these approaches have produced little to no meaningful, lasting impact on the education system. It is time to reflect on what matters in education, then in turn what should be measured, and eventually how to measure it. System-level reflection on how massive collaboration is supported in other areas has the potential to provide a structure for this type of collaboration in education. Grounding this collaboration in a basic set of underlying guiding questions can start this process. We hope the collaborative approach suggested in this article can result in a better solution to this wicked problem.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
