Abstract

In 2020–2021, during the COVID pandemic, the federal government offered states the opportunity to request waivers from federally mandated standardized testing accountability protocols. For this study, we examined the contents of requests submitted by educational leaders from 12 states and the subsequent responses provided by federal government leaders. We applied concepts of framing and frame articulation, political spectacle, and policy paradox to examine how state and federal leaders positioned their views of the role and purpose of statewide accountability systems. Using a mixed-methods design, we found that state and federal leaders stridently disagreed about the role of standardized tests during COVID, with each leveraging frames of civil rights and moral imperatives to justify keeping the test (federal leaders) versus abandoning it (state leaders). Of the 12 waivers submitted, eight were denied, three were considered not needed, and one was granted.

Keywords

accountability, assessment, educational policy, high stakes testing, mixed methods, policy analysis, politics

Introduction

As the spring of 2020 unfolded, the world experienced an unprecedented (Karaian & June, 2020) set of disruptions due to the novel coronavirus (COVID) pandemic. All forms of daily life were affected as this contagious virus spread, shutting down businesses, keeping us home, and preventing us from physically interacting with one another. Among the institutions hit especially hard, globally and, for the purposes of this study, in the United States (U.S.), were education systems, whose leaders were forced to make immediate and sharp adjustments in how they functioned. Teachers moved schooling online, causing remarkable shifts in instruction, learning, and subsequent levels of student achievement (e.g., learning loss; see Fahle et al., 2023; Kane & Reardon, 2023; Tan, 2021).

As the pandemic progressed, state education leaders (e.g., state superintendents of education and governors) varied in how they approached the crisis. In typically Democratic-leaning states like Washington and Minnesota, state leaders mandated masks and extended school shutdowns throughout the first year of the pandemic. Other state leaders, in typically Republican-leaning states like Arkansas and Texas, took different approaches, resisting mask mandates and allowing schools to reopen early in the pandemic, or remain open throughout (Jack & Oster, 2023). Concurrently, the federal government (led by a Republican Presidential administration at the start of the crisis and a Democratic Presidential administration as the pandemic persisted and ended) had its own response, conveying to states how federal mandates should be carried out. It was the intersection of these forces (federal and state responses to COVID) that served as the backdrop of this study.

As the pandemic continued, state-level leaders wrestled over how they would handle the federal mandates of the Every Student Succeeds Act (ESSA, 2015) that required summative achievement testing every year in mathematics and English/language arts (ELA) in grades 3–8 and once in high school. In March 2020, the federal government waived ESSA’s testing and accountability provisions for all states, recognizing the challenges that schools across the U.S. faced in the wake of sending students home. However, during the next academic year (2020–2021), the federal government reinstated the full weight of ESSA, requiring states to return to pre-pandemic routines and administer standardized tests to all students every year.

Acknowledging the ongoing challenges posed by the pandemic, however, the federal government offered states the opportunity to apply for a waiver from ESSA’s testing mandates for the 2020–2021 academic year. Most states sought relief from some of the requirements, with a much smaller number of states (n = 12) seeking relief from virtually all of the accountability requirements of ESSA (including federally mandated testing). In these waiver requests, however, states offered alternative strategies for still meeting the goals of measuring student achievement and making results public.

Purpose of the Study

In the years immediately following the height of COVID, a few key publications weighed in on the pandemic-era waivers and testing mandates. De Voto et al. (2023) explored how school leaders in two districts with different resource levels generally responded to the COVID–19 pandemic. They found that stringent testing policies exacerbated inequities and that districts with more limited resources confronted relatively more challenges when adapting to testing policies. Elsewhere, in a legal analysis of the waiver opportunity presented by the U.S. Government, Lam (2021) argued that the U.S. Secretary of Education, Betsy DeVos, had ample discretion to grant testing waivers under ESSA. Yet, as the waiver opportunity was rolled out, there was a lack of clarity and direction for states on how to address “whole child equity,” a key requirement for waiver requests. Lastly, in an editorial, Bruno and Goldhaber (2021) discussed how states’ testing waivers revealed states’ related priorities and concerns. Collectively, these discussions helped shape the goals of this study, in which we more fully and empirically examined how state-level education leaders and federal policymakers expressed their perspectives about their educational priorities and concerns as they grappled with federal testing requirements.

Partly informed by these extant discussions, the purpose of this study was to further examine the landscape of federal and state interactions during the COVID pandemic. We did this in two ways. First, we employed quantitative analyses of state demographics and political leanings to determine if patterns existed related to the types of waiver requests that state leaders submitted. Second, and after identifying the 12 states that submitted detailed waiver requests to be relieved from all or most of federally required accountability testing, we engaged in qualitative methods to describe the policy stories (e.g., Stone, 2002) that unfolded from state and federal leaders’ written interactions. Considering decades of federal support for mandatory testing, these exchanges provided a unique opportunity to examine how state and federal policy stakeholders framed the language of accountability, testing, equity, care, and student success.

Federal and State Testing in the U.S.

Situating the Study

Since the 1980s, after the 1983 National Commission on Excellence in Education report A Nation at Risk was published (U.S. Department of Education [USDOE], 1983), the use of standardized achievement tests has grown and increasingly informed collective beliefs about U.S. public schools’ performance. While increases in test use have also occurred over the same time across the globe (Levy et al., 2019; Sahlberg, 2011; Sørensen, 2016), in the U.S., A Nation at Risk spurred a move toward more rigorous standards and increased and better tests, both of which were to help bring the U.S. out of its purported at-risk status (D. C. Berliner & Biddle, 1995; see also Koretz, 1996, 2017). A Nation at Risk subsequently inspired a series of test-based educational policy initiatives that eventually had every state develop its own standards across most subject areas and grade levels, and its own standardized achievement tests to help gauge students’ progress in their learning of said standards. The expectation, or theory of change, was that adopting improved standards, along with test-based measures to assess the extent to which students met those higher standards, would inspire more effective educational practices and, in turn, augment student learning and achievement (Center on Organization and Restructuring of Schools, 1995; Meyer, 1997).

Since the turn of the millennium, this use of standardized tests for holding schools, teachers, and students accountable has been manifested through three major policy episodes, two of which were propagated by U.S. federal policies. First, the U.S. federal government passed No Child Left Behind (NCLB, 2002), which reauthorized the Elementary and Secondary Education Act (ESEA). Through NCLB, the federal government supported standards-based education reform based on setting higher standards, establishing measurable goals (e.g., using Adequate Yearly Progress [AYP] measures), and using large-scale standardized tests (i.e., mandated for mathematics and ELA in grades 3–8 and once in high school) to measure whether students were meeting the higher standards set. In addition, federal- and state-level governments suggested that states should attach consequences (e.g., student grade promotions, high school diploma conferrals, teacher and principal bonuses and dismissals, school reconstitutions and closures) to help guarantee better outcomes. During NCLB (i.e., 2002–2015), the goal was to reach 100% student proficiency across states by 2014.

NCLB marked the first time such test-based accountability policies were federally mandated, with consequences attached to test-based outcomes if states were to continue to receive federal education funding. States varied widely in the consequences they imposed (Collins & Amrein-Beardsley, 2014), and this variation was related to states’ Democratic or Republican political leanings. Republican-leaning states attached relatively stronger consequences to students’ test scores than Democratic-leaning states, which was also related to states’ general alignments with NCLB’s broader emphases on accountability-centered, performance-based, and market-driven reforms (e.g., school choice, charter schools). For more on this, please see Chubb and Moe (2007); Hamilton et al. (2008); and Klein (2015).

As states’ NCLB-inspired policies and practices unfolded, the U.S. began to realize a range of positive (i.e., intended) consequences, which were used to praise NCLB, but also negative (i.e., unintended) consequences, which brought NCLB’s effectiveness into doubt. On the positive side, many argued that high-stakes testing accountability improved achievement scores, at least moderately, promoted greater accountability, improved data collection and transparency, increased attention toward closing the still-persistent achievement gap, increased focus on curricula in core subject areas (also seen as a weakness, by some; see more next), increased per-pupil spending, and enhanced students’ behavioral engagement (e.g., attendance, on-task behaviors, classroom involvement). See, for example, Ballou and Springer (2017); Bonilla and Dee (2018); Dee and Jacob (2011); Gershenson (2016); Grissom et al. (2014); Harris et al. (2023); Holbein and Ladd (2017); and Whitney and Candelaria (2017).

In contrast, NCLB’s negative consequences included overemphasis on testing and memorization instead of critical thinking, narrowing of curricula by reducing instructional time for non-tested subjects, and teaching to the test (some of which are also seen as a strength by some; as noted prior), as well as increased tendencies toward cheating, greater signs of educational triage (e.g., prioritizing attention toward students nearest to proficiency levels, also known as bubble kids), added stress, anxiety, and distress for students and teachers, and manipulations of tests (e.g., student exclusions) to meet higher accountability-based expectations (e.g., often leading to artificial inflation; see Haladyna et al., 1991). For references about NCLB’s negative consequences, see, for example, Amrein-Beardsley et al. (2010); Ballou and Springer (2017); D. Berliner (2011); Hursh (2008); Koretz (2017); Nichols and Berliner (2007); Sunderman (2008); and Whitney and Candelaria (2017).

The second major policy push that relied on testing as a centerpiece of federal legislation came in the form of the Obama administration’s American Recovery and Reinvestment Act (2009) and the federal government’s subsequent Race to the Top (RttT) initiative (2011; see also Duncan, 2009). For this policy round, states were to adopt and implement growth or value-added measures (VAMs) to hold teachers more accountable for their purportedly causal effects on their students’ achievement from one year to the next. In line with the same theory of change, states receiving RttT awards were incentivized to use students’ test scores for more consequential purposes, but primarily at the teacher-level (i.e., teacher evaluation, termination, and compensation) (Dillon, 2010; Layton, 2012; Obama, 2011). RttT then met a series of setbacks including, but not limited to, tight timelines, state-level allegations regarding the federal government’s overreach, and heightened pressures to produce measurable gains over time, again, using students’ test scores but now aggregated at the teacher-level (Shah, 2013; USDOE, 2017; Weiss, 2013). A series of lawsuits also ensued soon after states implemented RttT (Education Week, 2015; see also Amrein-Beardsley, 2019; Amrein-Beardsley & Close, 2019; Geiger et al., 2020; Paige & Amrein-Beardsley, 2020; Paige et al., 2019). RttT was essentially disbanded and replaced by ESSA (2015), although ESSA was officially a reauthorization of, or replacement for, NCLB. Notably, the testing requirements written into NCLB, noted prior (i.e., large-scale standardized tests for mathematics and ELA in grades 3–8 and once in high school), still stand.

ESSA constituted yet another revision of ESEA; however, the large-scale standardized tests mandated by NCLB remained, although ESSA has allowed states more latitude for determining how states might comply with ESSA’s test-based policies and provisions, with some states increasingly using tests as flashlights into what works in education (e.g., for formative purposes) instead of as hammers to force certain (and uncertain) outcomes (e.g., for consequential, summative purposes; see, e.g., Stanford, 2023). Under ESSA (in place at the time of this study), the same theory of change held, with similar debates still surrounding the efficacy of this theory (e.g., Carter & Welner, 2013; DeSilver, 2017; Goldstein, 2019; Hanushek et al., 2019; Koretz, 2017; Reardon, 2018; Timar & Maxwell-Jolly, 2012). A handful of political proponents have continued to argue that this theory of change works (Jeb Bush as featured in Rotherham, 2011; David Coleman as featured in Lewin, 2012 and Rotherham, 2011; Betsy DeVos, 2020; Betsy DeVos as featured in Turner, 2020; Sandy Kress, 2011; Michelle Rhee, 2011; Margaret Spellings, 2012). Considerably fewer political leaders have more generally argued the opposite (e.g., John King Jr., U.S. Secretary of Education after Arne Duncan; the late Senator Lamar Alexander, former chair of the U.S. Senate Education Committee; and Senator Hillary Clinton, former U.S. Secretary of State, First Lady, and Democratic Nominee for President).

As noted, standardized test scores have been the centerpiece of several iterations of legislation aimed at reforming public education. The pandemic offered a unique and rare opportunity to examine more recently how federal and state educational policy leaders viewed (and debated) the role, purposes, and value of such testing. These debates during COVID are the center of this study.

Theoretical Frameworks

Deborah Stone (2002) described how policies become problems (or onstage dramas) only when groups demand specific actions be taken (e.g., due to changing times or significant events). As such, public education policies rarely emerge from straightforward, technical concerns; instead, they often stem from private experiences or localized crises recast as urgent social problems (Itkonen, 2007). In the wake of the COVID–19 pandemic, U.S. schooling switched into crisis mode, prompting the federal government to issue blanket testing waivers during the 2019–2020 school year. The magnitude of the pandemic, resulting in the national testing pause, opened a window for stakeholders to challenge established accountability norms and propose alternative approaches (Kingdon, 2011). This public exchange of ideas provided an opportunity to examine federal-state discussions related to the virtues and pitfalls of assessment-based accountability in the U.S.

For our analysis, we drew upon theories of framing to help us examine how symbols, language, and narratives were used to define the purpose(s) of testing, articulate stakes in the current context, and mobilize action toward their goals (Edelman, 1988; Nelson & Kinder, 1996; Stone, 2002). By invoking terms like equity, integrity, and unprecedented times, state leaders argued that existing federal testing mandates were ill-suited to the current crisis, while the USDOE argued for upholding its mandates. It is through these differences in frames that a reexamination of U.S. accountability norms emerged via our data.

Stories in Policy Debates

An initial look at our data revealed that a core element of this symbolic framing involved the construction of policy stories, which we saw as powerful constructions that shaped how issues were defined and communicated (Stone, 1997, 2002). Through a narrative arc (i.e., a beginning, middle, and end), actors frame a policy problem as a story of decline, which depicts how conditions are rapidly worsening, or a story of control, which implies that targeted policy interventions can restore stability or empower local actors (Stone, 1997, 2002). Stories are spun by toggling between instrumental discourse that proposes concrete requests or technical solutions and expressive discourse that relies on emotional or moral appeals to stir public sentiment or urgency (Hall Jamieson & Waldman, 2003; Itkonen, 2007).

In the waiver communications, textual and rhetorical cues revealed how federal and state leaders defined the problem, assigned responsibility, and advanced crisis narratives that eclipsed empirical evidence and earlier accountability commitments. Because policy stories intensify during an unfolding crisis, such as the COVID–19 pandemic, these stories can become even more potent (Hall Jamieson & Waldman, 2003). As such, we use the term story purposefully. It encompasses the logic of stories in political communications, where compelling stories use symbols, frames, and imagery to overshadow contradictory facts by offering coherence and emotional resonance (Edelman, 1988, 2001; Gusfield, 1986; Hall Jamieson & Waldman, 2003). In analyzing the written waiver exchanges, we situated the story as a key unit of rhetorical construction, revealing how state and federal actors shaped the meaning of testing and how they exercised power during the pandemic.

Framing Contests in Policy Stories

Policy stories utilize different frames to highlight specific dimensions, such as civil rights, local autonomy, or system integrity (e.g., Edelman, 1988, 2001; Itkonen, 2007; Nelson & Kinder, 1996). These frames are dynamic sites where actors strategically adapt or amplify cultural values or beliefs to elevate versions of a problem. In what Dodge and Metze (2024) call framing contests, competing coalitions vie to define how an issue is framed and understood (p. 236). In these contests, the same frame can carry diverse and sometimes conflicting meanings (e.g., Dodge & Metze, 2024; Nelson, 1999; Stone, 2002), where a single value, such as equity, can support opposing policy arguments. Frames, then, serve as the mechanisms through which policy narratives continuously shift, both shaping and being shaped by policy actors and broader societal discourse. New frames are rarely created. Instead, actors adapt existing frames to ensure that their policy stories resonate and appear credible to their intended audiences (Dodge & Metze, 2024; Hawkins & Holden, 2013; Sinha & Gasper, 2010). For example, using a civil rights frame, equity can be framed to describe how uniform testing helps identify inequities, or it can be framed to suggest that testing under crisis conditions creates inequities because it disproportionately harms students who are already most at risk. In both cases, policy actors invoke the same frame (i.e., civil rights) and cultural value (i.e., equity) to justify positions that are, in fact, conflicting and contradictory (Stone, 2002).

In line with social movement scholarship, policy actors engage in these framing contests through frame amplification, defined as the “idealization, embellishment, clarification, or invigoration of existing values or beliefs” to galvanize public support or reposition a policy debate (Benford & Snow, 2000, p. 624). When examining state waiver requests and federal responses, we explicitly considered how both state and federal actors employed frame amplification by invoking resonant values, such as equity, safety, and integrity, to bolster their positions. These frame amplifications served as rhetorical leverage, attempting to form alliances and shape the public perception of how testing during the pandemic should be interpreted. In this way, frames and their amplifications not only helped define and emphasize aspects of ESSA mandates as a pandemic policy problem, but they also helped construct stories through which power was negotiated. As Stone (1997, 2002) explains, through stories, policy actors compete to define problems and responses, creating a policy paradox where the same policy is interpreted in contradictory ways. While a policy paradox explores the complex values behind policy decisions, political spectacles shift focus to how politics is performed and perceived by the public.

Political Spectacles

Stories and frames illuminate how problems are defined; political spectacles (Edelman, 1988, 2001) clarify how these definitions interact with the public arena. Political spectacles are constructed performances that rely on symbolic language, such as transparency or accountability, to evoke strong emotional responses that can overshadow deeper power asymmetries (Edelman, 1988). These spectacles function like plays where onstage dramas distract and entertain the populace while backstage negotiations determine policy decisions.

Specific to this study, onstage, the public experienced the illusion of democratic participation, while backstage, federal prerogatives shaped which policies gained traction (Smith et al., 2003). During the illusion of democratic participation, policy conflicts often feature paradoxes in which actors hold apparently contradictory goals. This malleable logic allows policy actors to conceal factual inconsistencies by foregrounding dramatic appeals for civic duty or student welfare. In doing so, political spectacles both intensify and disguise power struggles, masking them beneath emotive appeals to shared values.

Integrating framing contests (Dodge & Metze, 2024) and frame amplifications (Benford & Snow, 2000) with the illusion of democratic participation (Edelman, 1988) prompted our analysis of waiver exchanges that went beyond surface-level rhetoric. By tracing frames, we describe how particular stories gained salience during the COVID–19 pandemic, thus shaping debates over the role of federal authority and the boundaries of state autonomy, and illustrating how the interplay of competing frames, strategic story construction, and staged political performances collectively shaped accountability debates during this unprecedented moment in U.S. education.

Study Objectives

Informed by these lenses, our study objectives were twofold. First, we wanted to understand the demographics and political leanings of all 50 states and Washington, DC, and how those related to their waiver request contents and goals. Second, we wanted to analyze and better understand the stories and frames leaders used in the waiver request exchanges between 11 states and Washington, DC (hereafter referred to as a state for simplicity) and the U.S. federal government (i.e., the USDOE). Drawing on our theoretical frameworks, we examined not only the content of the waiver requests submitted by these 12 state leaders seeking release from ESSA’s test-based protocols, but also how state leaders navigated stories of decline (emphasizing a crisis that justifies deviating from normal testing) alongside stories of control (affirming standardized tests and proposing solutions to the current context).

Accordingly, we organized this study using the following two research questions (RQs): (1) What were the state-level demographics of states for which state leaders submitted testing waiver requests as compared to the state-level demographics of states for which state leaders did not submit testing waivers? (2) What stories and frames emerged from an analysis of the exchanges between USDOE and state-level leaders?

Methods

Design and Sample

We used a sequential mixed-methods exploratory design in which our quantitative data collection and analysis approaches helped to inform our subsequent and more heavily weighted qualitative data collection and analysis approaches (Creswell & Plano Clark, 2010). Quantitative data included all 50 states’ and Washington, DC’s publicly available demographic information. For our qualitative data, we collected all waiver requests submitted for our analytical sample of the 12 states that submitted waivers from all accountability mandates. Consequently, the bulk of our qualitative analyses focused on the exchanges between the leaders of these 12 states and federal policymakers.

We conceptualized the study as a single case study with multiple embedded units of analysis (Yin, 2009). While we bounded this case through 2021, we limited subunits to July 2020 through April 2021 (i.e., USDOE’s initial letter to states about waivers through states’ waiver requests and USDOE’s final responses). We examined each state’s exchanges with the USDOE independently and then collectively.

Data Collection

We collected data in two iterative phases organized by our two research questions. For the quantitative component (RQ1), we collected publicly available demographic data per state and indicators capturing states’ political typologies. Data included the number of students enrolled in states’ K–12 schools; the percentages of children under age 18 living in poverty; the percentages of white, Black, Hispanic, Asian, Pacific Islander, and American Indian students and of students reporting two or more races; the percentages of students eligible for free/reduced lunch (FRL); and how citizens of each state voted in the 1992, 1996, 2000, 2004, 2008, 2012, 2016, and 2020 presidential elections (i.e., Democratic or Republican).¹

For our qualitative component (RQ2), we focused on gathering publicly available documents that illustrated what states submitted via their written ESSA assessment waivers to the USDOE during the 2020–2021 school year. We identified and retrieved all written communications from Washington, DC, and 50 states to the USDOE in response to the USDOE’s waiver call during the 2020–2021 school year. The USDOE’s Office of Elementary and Secondary Education maintained a robust archive of communications between its office and states. This database enabled us to identify the 12 states of primary interest in this study, their written communications, and the ultimate outcomes of their exchanges. By examining these high-leverage challenges (i.e., cases in which states explicitly questioned or resisted the federal testing norm), we could observe the most distinctive narratives and rhetorical appeals. The written exchanges included an initial assessment waiver request made in writing from each of the 12 states, the USDOE’s written response to those requests, state leaders’ written responses to the USDOE, and subsequent communications. Some states had a single exchange, where state leaders wrote to the USDOE, and the USDOE responded, while other states had up to three back-and-forth exchanges. Data included, on average, four documents per state, with each document ranging from 2 to 204 pages, depending on the document or state. We analyzed a total of 438 pages across the dataset.

Data Analyses

We organized our analytic strategy around our research questions. For our quantitative work (RQ1), we used the publicly available data that we collected to examine potential differences between states that did (n = 12) and did not (n = 39) submit waivers (totaling 51, comprising 50 states plus Washington, DC). We calculated a series of Pearson chi-square tests and Pearson point-biserial correlations to examine associations among these dichotomous and categorical variables. We also performed a regression analysis using the 18 independent variables noted previously to examine these variables’ relationships with our dependent variable (i.e., whether leaders submitted testing waivers on behalf of their states).
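As an illustration of the two RQ1 statistics named above, the following minimal sketch computes a Pearson chi-square statistic for a 2 × 2 table and a point-biserial correlation using only the Python standard library. It is not the analysis code used in the study: the function names are illustrative, the cell counts and percentages are hypothetical (only the row totals of 12 and 39 mirror the group sizes reported here), and the regression analysis is not reproduced.

```python
from math import sqrt


def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def point_biserial(binary, values):
    """Point-biserial r: Pearson r between a 0/1 variable and a continuous one."""
    n = len(binary)
    mean_b = sum(binary) / n
    mean_v = sum(values) / n
    cov = sum((b - mean_b) * (v - mean_v) for b, v in zip(binary, values))
    var_b = sum((b - mean_b) ** 2 for b in binary)
    var_v = sum((v - mean_v) ** 2 for v in values)
    return cov / sqrt(var_b * var_v)


# HYPOTHETICAL counts, not the study's data.
# Rows: submitted a full testing waiver (yes, no); row totals are 12 and 39.
# Columns: state voted Democratic / Republican in one presidential election.
table = [[8, 4], [17, 22]]
chi2 = chi_square_2x2(table)

# HYPOTHETICAL per-state values: 1 = submitted a full waiver, 0 = did not,
# paired with an invented percentage of students eligible for FRL.
submitted = [1, 1, 0, 0, 0, 1]
pct_frl = [52.0, 48.0, 40.0, 45.0, 38.0, 55.0]
r_pb = point_biserial(submitted, pct_frl)

print(f"chi-square = {chi2:.3f}, point-biserial r = {r_pb:.3f}")
```

The chi-square test suits pairs of dichotomous variables (e.g., waiver submission by party voted for in a given election), whereas the point-biserial correlation pairs the dichotomous waiver indicator with a continuous measure such as a percentage. With a group as small as 12 states, some expected cell counts can be low, so any p-values derived from statistics like these would warrant cautious interpretation.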

For our qualitative work (RQ2), we employed a three-phased content analysis to iteratively examine themes and patterns within and across the 438 pages of text retrieved from the waivers and communications between state leaders and the USDOE (Miles & Huberman, 1994; Saldaña, 2013). During these analyses, we engaged in a comprehensive and organized reading of the text to identify patterns, intent, and themes. We detail each of these phases, the coding schemes we employed, and how our approach helped us capture the policy narratives, frames, paradoxes, and instances of political spectacle in Appendix A below.

Our iterative coding process revealed several key framing devices and themes that we used to organize our findings. First, we identified four key frames: (1) moral imperative, (2) civil rights, (3) system integrity, and (4) local autonomy (Table 1). We then examined how these frames were deployed in ways that contested (or contradicted) each other, or overlapped (Dodge & Metze, 2024). At times, frames coexisted within a single waiver request. For instance, a letter might have begun by lamenting unprecedented burdens on students (moral), yet concluded that system-level academic information remains essential (system integrity). Table 1 summarizes these four frames and (when present) their corresponding contests (or contradictions), providing a definition and a brief example from the data.

Table 1

Issue Frame (and Contests) Coding Scheme

Frame	Frame Contests	Definition	Example
Moral Imperative	Safety Focus	Prioritizes safety and references moral duty to extend common sense grace in the name of protection.	“Our success marker should be that our children are safe, healthy, and nurtured. This is a time for extending grace.” (Georgia)
Moral Imperative	Care Focus	Prioritizes time for children’s socio-emotional health over collecting data about their learning.	“We need time for important socio-emotional learning components, allowing educators to assess psychological and behavioral impacts of the pandemic.” (South Carolina)
Civil Rights	Equity as an Imperative	Argues that testing upholds equity by revealing gaps and distributing appropriate resources.	“Even amid crisis, we must continue measuring achievement to protect at-risk students.” (USDOE)
Civil Rights	Equity as a Barrier	Argues that testing is inequitable in a crisis because it exacerbates barriers to historically marginalized students’ educational access.	“Schools districts that offered some form of in-person instruction in mid-October serve a disproportionate share of the State’s White students (70.4%).” (New Jersey)
System Integrity	Questioning the Tests	Questions the validity of collecting data during a pandemic.	“(We) have significant concerns regarding the validity and possible misuse of data collected from these assessments.” (Oregon)
System Integrity	Tests as Anchor	Emphasizes summative assessments as vital for accountability and transparency.	“Without standardized data, we risk losing transparency and accountability.” (USDOE)
Local Autonomy	N/A	Highlights state/district prerogative to make context-specific decisions, questioning uniform federal mandates.	“We are experiencing firsthand the legal definition of the ‘act of God,’ and because these conditions are outside human control, all schools should be held harmless and granted the maximum flexibility to determine locally how best to serve their students.” (Montana)

Our in-depth iterative coding process led us to consider additional themes that would help us describe exchanges between states and the federal government (Maxwell, 2013). For example, we identified elements of political spectacle, what Edelman (1988, 2001) referred to as the illusion of democratic participation. Additionally, we identified instances where state or federal officials praised the necessity of standardized testing yet simultaneously sought or granted exemptions and labeled these instances as a policy paradox (Stone, 1997). For example, an excerpt describing tests as indispensable for equity, only to request a full waiver, exemplified this paradoxical stance. The USDOE, for its part, championed accountability norms yet sometimes granted flexibility and one full waiver. This duality underscored the rhetorical power of testing in U.S. education and the pragmatic acceptance that unprecedented times warranted exceptions. Once these elements emerged in our findings, we aligned the chronological timeline of communications to triangulate findings.

Across all phases of coding (see Appendix A below), we independently coded the data and then met to identify levels of convergence and divergence to generate higher-level axial codes (Saldaña, 2013). Throughout these axial coding steps, we also authored a codebook and set of analytic memos we used to summarize the key features of the waiver exchanges, highlighting the symbolic language, framing contests, and power and authority dynamics at play. These categories provided an understanding of the strategic amplification of frames and the use of stories to describe who should retain power and authority between and among federal and state actors during this unprecedented time.

Results

States and the Waivers State Leaders Sought

As illustrated in Table 2, virtually all states submitted waivers to be released from general accountability structures (i.e., school report card and improvement requirements, n = 46). These state leaders planned to conduct statewide assessments; however, they requested that they not be required to enforce accountability reporting. A smaller number of state leaders sought waivers from having to test and report on the achievement of students with significant disabilities (n = 19; see also Strassfeld & Voulgarides, 2022), and a still smaller group (the focus of this study) sought waivers from administering broadscale statewide tests to all students that year (n = 12, comprising 11 states and Washington, DC).

Table 2

Overview of State Waiver Requests

	Type of Waiver
State	Accountability*	1% Cap**	Assessment***
Alabama	X	X
Alaska	X
Arizona	X
Arkansas
California	X	X	X
Colorado			X
Connecticut	X
Delaware	X		X
Florida	X	X
Georgia	X	X	X
Hawaii	X
Idaho	X
Illinois	X	X
Indiana	X	X
Iowa	X
Kansas	X
Kentucky	X	X
Louisiana	X	X
Maine	X
Maryland	X	X
Massachusetts	X
Michigan	X	X	X
Minnesota	X
Mississippi	X
Missouri	X
Montana	X	X	X
Nebraska	X	X
Nevada	X
New Hampshire	X
New Jersey	X		X
New Mexico	X
New York	X		X
North Carolina	X	X
North Dakota	X
Ohio	X	X
Oklahoma	X	X
Oregon	X		X
Pennsylvania	X
Rhode Island	X
South Carolina	X		X
South Dakota	X
Tennessee
Texas	X	X
Utah	X
Vermont	X	X
Virginia	X	X
Washington	X		X
Washington, DC			X
West Virginia	X	X
Wisconsin	X
Wyoming

* Waive accountability, school improvement and school report card requirements.

** Waive requirements of participation of students with the most significant cognitive disabilities.

*** Waive requirement of state-wide summative assessments to all public elementary and secondary students.

State Demographics

To better understand the contexts of the states whose leaders submitted assessment waiver requests (RQ1), we analyzed state-level characteristics quantitatively. We found that Democratic-leaning states were significantly more likely than Republican-leaning states to have submitted assessment waivers. See a list of states by political leaning in Table 3.

Table 3

States’ Political Leanings by States with (and Without) Waiver Requests

		Political Leaning
State	12 States with Waiver Requests	Republican	Democratic
Alabama		X
Alaska		X
Arizona		X
Arkansas		X
California	X		X
Colorado	X		X
Connecticut			X
Delaware	X		X
Florida		X
Georgia	X	X
Hawaii			X
Idaho		X
Illinois			X
Indiana		X
Iowa		X
Kansas			X
Kentucky			X
Louisiana			X
Maine			X
Maryland		X
Massachusetts		X
Michigan	X		X
Minnesota			X
Mississippi		X
Missouri		X
Montana	X	X	X
Nebraska		X
Nevada			X
New Hampshire		X
New Jersey	X		X
New Mexico			X
New York	X		X
North Carolina			X
North Dakota		X
Ohio		X
Oklahoma		X
Oregon	X		X
Pennsylvania			X
Rhode Island			X
South Carolina	X	X
South Dakota		X
Tennessee		X
Texas		X
Utah		X
Vermont		X
Virginia		X
Washington	X		X
Washington, DC	X		X
West Virginia		X
Wisconsin			X
Wyoming		X

Note 1. Political leaning categories were determined using nine indicators: (1) Political party of the current governor; (2) How each state’s populace voted in the 1992 presidential election; (3) How each state’s populace voted in the 1996 presidential election; (4) How each state’s populace voted in the 2000 presidential election; (5) How each state’s populace voted in the 2004 presidential election; (6) How each state’s populace voted in the 2008 presidential election; (7) How each state’s populace voted in the 2012 presidential election; (8) How each state’s populace voted in the 2016 presidential election; and (9) How each state’s populace voted in the 2020 presidential election. No states were inconsistent in their political leanings on 1–9 above; hence, researchers’ categorizations of states’ political leanings above are likely highly reliable (and valid).

Note 2. Important, descriptively (and of statistical significance), is that of the 12 states for which state leaders submitted waivers, 10 were (or are still) Democratic leaning (i.e., 83.3%).

Of note is that of the 12 states for which state leaders submitted assessment waivers, 10 were (and are likely still) Democratic leaning (i.e., 83.3%). This difference was statistically significant, χ²(1, N = 51) = 5.67, p < 0.05. The same trend held over time in how states’ residents voted in presidential elections from 1992–2020, with the largest test statistic observed for the 2020 presidential election, χ²(1, N = 51) = 6.57, p < 0.05.
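
As a rough check on the reported significance levels, the sketch below converts the two chi-square statistics quoted above into p-values. It assumes only what the text states (one degree of freedom and the two test statistics); the helper name `chi2_sf_df1` is hypothetical, and the sketch does not reproduce the authors' underlying contingency tables.

```python
import math


def chi2_sf_df1(stat: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    With df = 1 the chi-square variable is the square of a standard normal,
    so P(X >= stat) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# Values quoted in the text (not recomputed from raw data).
share_democratic = 10 / 12      # 10 of the 12 waiver-seeking states
p_leaning = chi2_sf_df1(5.67)   # overall political leaning
p_2020 = chi2_sf_df1(6.57)      # 2020 presidential vote

print(f"Democratic-leaning share: {share_democratic:.1%}")
print(f"p (leaning) = {p_leaning:.3f}; p (2020 vote) = {p_2020:.3f}")
```

Both p-values fall below the 0.05 threshold reported in the text, and the 2020 statistic yields the smaller of the two.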

Relatedly, our regression analysis revealed that the strongest predictor of whether state leaders submitted federal testing waivers (i.e., our dependent variable) was how citizens in each state voted (i.e., Republican or Democratic) in the 2020 presidential election (β = 0.305, 95% CI, R = 0.359). This variable accounted for 12.9% of the variance in our regression model.
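
The variance figure follows from squaring the reported correlation coefficient. A quick check of that arithmetic, using only the values quoted above:

```latex
\[
R^{2} = (0.359)^{2} \approx 0.129
\quad\Longrightarrow\quad
\text{about } 12.9\% \text{ of the variance explained.}
\]
```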

Otherwise, whether states applied for waivers was not significantly correlated with any of the other state-level demographic variables that we collected and analyzed, with one partial exception. One variable reached significance only at a more lenient threshold (p < 0.10), which warrants less confidence but is sometimes acceptable in the social sciences (McLeod, 2019). This was a small negative correlation between whether states applied for assessment waivers and the percentages of white students educated in their states (r = −0.26), suggesting that states with higher percentages of white students were less likely to apply for federal assessment waivers. This pattern may also have been interrelated with our finding regarding states’ political leanings.
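
To see why this correlation clears the 0.10 threshold but not the 0.05 threshold, the worked check below converts r into a t statistic. It rests on two assumptions not stated in the text: that N = 51 (the 50 states plus Washington, DC, as in the chi-square tests) and that a two-tailed test of a Pearson correlation was used.

```latex
\[
t = \frac{r\sqrt{N-2}}{\sqrt{1-r^{2}}}
  = \frac{-0.26\,\sqrt{49}}{\sqrt{1-0.0676}}
  \approx -1.88,
\qquad
t_{0.10}(49) \approx 1.68 \;<\; |t| \;<\; t_{0.05}(49) \approx 2.01 .
\]
```

Under those assumptions, the statistic falls between the two-tailed critical values for α = 0.10 and α = 0.05, consistent with the threshold the text reports.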

Assessment Waiver Requests

We compiled all the communications and documentation associated with our 12 states for which leaders submitted requests for full waivers from statewide testing (see Table 4) to answer RQ2. Our analysis of these documents revealed that virtually every state’s letter author(s) followed a narrative arc (e.g., Stone, 1997), where they began with a statement acknowledging the importance of summative assessments, typically followed by a statement regarding the challenges or contexts of education during COVID. Once the conflict in their narrative arc was revealed, they argued that the unique circumstances surrounding COVID required that they prioritize the values of equity, care, or the socio-emotional well-being of students in their states through their waiver requests. Although state leaders varied in their approaches, all state leaders seemed to follow this basic narrative arc in their requests.

Table 4

Descriptive Statistics Describing Total Pages of States’ Waiver Requests and Communications with the U.S. Department of Education

States	Total Pages	Pages to U.S. Department of Education	Pages from U.S. Department of Education
California	10	8	2
Colorado	54	52	2
Delaware	2	–	2
Georgia	13	11	2
Michigan	10	8	2
Montana	36	30	6
New Jersey	9	8	1
New York	206	204	2
Oregon	36	32	4
South Carolina	16	14	2
Washington	30	27	3
Washington, DC	16	14	2

Stories of Decline and Control

A defining feature of states’ waiver requests was their dual reliance on stories of decline and stories of control, woven through appeals to the USDOE. In every waiver, state leaders asserted a belief that summative assessments provided states with important learning indicators and valuable data for making statewide decisions, such as resource allocations. For example, Oregon leaders opened their waiver request by describing state testing as providing “systems-level academic information that helps inform the equitable distribution of education resources.” New York leaders described a commitment to state testing because it provided “essential data for parents, teachers, school leaders, and the public about how public schools [were] performing.” Indeed, virtually all waiver authors incorporated an assertion about the immense value of summative assessments. Likewise, in most cases, state leaders’ statements seemed to gratify, indulge, and pander to the federal entities who were upholding ESSA law, and clearly held the power over whether states’ waivers would be approved.

In these analyses, we read many of these explicit statements of commitment to summative assessments as efforts to grease the wheels or gain favor: state leaders appeared to write what they thought USDOE leaders wanted to hear before asking for an assessment waiver. To us, this pandering seemed to be a way for state leaders to engender approval prior to outlining their requests. An example that captures this theme was seen in the opening of a waiver request submitted by policy actors in Washington, DC, with waiver authors describing DC districts’

deep and long-standing commitment to using statewide assessment data to drive student academic achievement, especially for those students most at risk. As [they illustrated], statewide assessment data are used to meet critical needs: to direct resources to students who need them most, to create research on academic outcomes that drive programmatic change to contribute to decision making for families.

While not all states were as direct in their respect for (and in some cases admiration of) their summative assessment data, every state’s letter included such explicit statements illustrating states’ appreciable commitments to testing. Only Michigan authors described their pledge less directly, stating that “in a normal environment, trustworthy summative assessment data can be used to help increase student learning over time.”

Beyond this, all waiver requests included decline narratives that depicted the pandemic as creating unprecedented disruptions, rendering summative testing logistically and ethically untenable. Washington, DC, for instance, reported that “96% of students were participating in full-time distance learning. The remaining 4% of students were participating in some variation of hybrid instruction,” capturing the logistical and ethical impossibility of administering standardized tests. Montana leaders went a step further, calling the pandemic “the legal definition of an ‘act of God’” and suggesting that these conditions were “outside human control,” descriptively laying out why adhering to federal testing mandates would exacerbate equity gaps. Washington officials similarly drew attention to the “major mental health challenges” faced by students, linking a crisis of well-being to a rationale for suspending conventional accountability mechanisms, at least temporarily.

These same waiver requests also vividly conveyed stories of control, in which state actors outlined how exemptions could restore some stability in such unprecedented times. However, states varied in the kinds of waivers and considerations they were explicitly seeking. Some state leaders wanted to retain their summative assessments, but they requested adjustments to the sampling, timing, or scope of assessment. Other state leaders requested that summative assessments be stopped completely. This combination of dire crisis appeals and a steadfast belief in the underlying merit of testing reveals the intricate policy stories at play.

The Paradox of Upholding Accountability While Seeking Exemptions

Throughout these exchanges, states repeatedly affirmed their commitment to testing as crucial for equity and transparency, even as they pursued waivers from summative testing. This apparent contradiction, again, reflects a policy paradox (Stone, 1997, 2002). States publicly endorsed standardized tests in principle but deemed them unworkable or not the appropriate response under pandemic conditions. South Carolina, for example, still acknowledged that summative assessment data “provide vital information to educators, parents, state leaders, and the public on how well students are progressing” but then insisted that assessments “however, are not strong diagnostic measures of students’ learning needs, and because they are administered at the end of the school year, do not yield timely results.” The USDOE similarly framed ESSA mandates as indispensable to uphold “accountability and transparency,” yet ultimately allowed one full exemption. By invoking the word “unprecedented” (common across most communications), both the federal government and state leaders could champion accountability in theory while forging arrangements that temporarily bypassed their normal protocols. This tension pervaded every category of waiver requests, underscoring how standardized tests remain symbolically and politically central.

Framing Contests: Civil Rights, Moral Imperatives, and System Integrity

Across waiver requests and federal responses, most actors presented the problem of the pandemic through three frames: civil rights, moral imperatives, and system integrity. When constructing stories of decline, our analysis revealed how instrumental and expressive discourse (Itkonen, 2007) was employed to frame the values of equity, safety, care, and integrity. These values were used to justify policy stances; however, the framing of these values often led to the justification of opposing policy stances across waiver request communications, even within a single request (Dodge & Metze, 2024; Hawkins & Holden, 2013). State and federal actors strategically amplified specific frames (Benford & Snow, 2000) to construct salient and compelling stories that described how these values would be upheld or threatened if ESSA mandates were reinstated in the 2020–2021 school year (and beyond).

Civil Rights Frame Contests

Many states used a civil rights frame (see Table 1) to portray summative assessments as, in the words from Washington, DC, a “trusted source of information. . .used for important decision-making by stakeholders.” Authors who used this frame argued that test-based data help identify students who are most at risk, and as such, testing was considered imperative for addressing equity. Washington, DC, and many other states employed such instrumental language (focusing on practical constraints and solutions) to describe tests as tools used to rectify systemic injustices by diagnosing and providing the information needed to address inequities.

The first communication from the USDOE to states, sent in September of 2020 to initiate the waiver process, may have encouraged this “data as an imperative for equity” framing. In her letter, Betsy DeVos employed expressive language (highlighting moral duty and building emotional resonance) and the same civil rights framing to garner support for reinstating ESSA mandates during a pandemic, writing:

school closures this past spring disproportionately affected the most vulnerable students, widening disparities in achievement for low-income students, minority students, and students with disabilities. Almost every student experienced some level of disruption. Moving forward, meeting the needs of all students will require tremendous effort. To be successful, we must use data to guide our decision-making.

In this way, the pandering we identified in states’ stories may have been a response to the USDOE’s own frame amplification, which equated equity with the need for data in order to justify reinstating ESSA mandates.

Civil rights framing and the value of equity were also used to justify the need for a waiver of ESSA mandates. State leaders stressed that the pandemic introduced overwhelming logistical burdens and barriers to learning that fell disproportionately on marginalized students. For example, Washington (state) leaders said that their limited resources should be used to “focus on equity, prioritizing the learning needs of students furthest from educational justice.” Michigan’s leaders agreed, writing, “We must adjust how we operationalize our commitment to equity by acknowledging the differences in student access to the resources (technology and otherwise) that are needed to provide an adequate opportunity to learn.” Through this framing of civil rights, state actors argued that equity-focused aims would be better served if resources were allocated to support students in real time, rather than assessing where to distribute resources later.

Taken together, the civil rights framing contests hinged on whether summative assessments were seen as essential for shedding light on inequities or as impediments that divert scarce resources away from students in need during the COVID crisis. While the USDOE and some states depicted test-based accountability as a cornerstone of civil rights, others argued that equitable outcomes demanded suspension of mandates so that efforts could be channeled into real-time interventions. This tension reflected the broader pattern across waiver communications, where a single value, in this case, equity, anchored opposing policy stances. Here, both federal and state policy actors invoked civil rights language to position their approach as the more just path forward.

Moral Imperative Frame Contests

While civil rights discourses largely invoked testing as an equity measure, states and federal authorities also turned to moral imperatives to justify positions on ESSA mandates. In these moral framing contests, two interrelated themes emerged: safety as a paramount concern and care as a guiding principle for socio-emotional well-being. This dual emphasis often employed both instrumental and expressive discourse.

In several state waiver requests, leaders emphasized the moral responsibility to protect students and staff amid pandemic-related health risks. For example, Oregon highlighted how “rigorous health and safety requirements. . .reduce the number of students who can participate in state assessments at the same time,” using instrumental discourse to demonstrate the ethical tension between safely accommodating students and meeting federal mandates. New York’s waiver highlighted their Board of Regents Chancellor’s comment in support of the request: “throughout the pandemic, the Board’s priority has been the physical and mental health, safety, and well-being of the children and adults in our schools.” This argument framed testing as secondary to the moral obligation of safeguarding students’ physical well-being. Georgia warned that schools would be “implementing intensive protocols to ensure the safety of their students and staff,” indicating the practical strain testing could introduce.

Meanwhile, the USDOE also deployed a safety-oriented moral frame, though their framing used expressive discourse with a different emphasis. In the first communication to states in the waiver episode, then-Secretary Betsy DeVos urged that:

Just as doctors, nurses, police officers, grocery clerks, and other essential workers have demonstrated their resolve, now is our opportunity to show that the same spirit is present in America’s education leaders as we work to safely reopen schools and to successfully educate our nation’s children.

By likening the work of educators to frontline responders, the USDOE framed the reopening of schools as both a civic and moral duty. This narrative implied that, despite the risks and challenges of safely and faithfully adhering to federal guidance, doing so would demonstrate the heroic resolve needed to safeguard students’ futures. In this way, the USDOE’s reference to safety, while echoing the concerns of state actors, ultimately served to reinforce the frame that moral imperatives warranted proceeding with assessments despite the complexities introduced by the pandemic.

In a parallel framing of the moral imperative, care was highlighted as a higher priority than collecting student data. Multiple states urged that mental health services and wrap-around supports be prioritized over high-stakes testing. In California, authors emphasized the need for “mental health services, access to school meal programs, and programs to address pupil trauma.” Meanwhile, Georgia leaders wrote, “Our marker for success should be that our children got through this time healthy, safe, and nurtured.” This same frame was echoed in Michigan’s waiver, which insisted, “this is not the time for high-stakes assessment or accountability. This is the time for care, connection, and support.” Such framing blended instrumental discourse with an expressive call for compassion. As such, these state leaders amplified the claim that funneling resources toward socio-emotional care, rather than administering assessments, was the morally sound course of action, coupling practical with ethical appeals for students’ holistic welfare.

The absence of explicit federal language around care or emotional well-being was notable and contrasted with states’ calls to redirect resources to more immediate and compassionate interventions. This divide suggested a moral imperative framing contest between the USDOE, which framed high-stakes testing as an essential moral duty for post-pandemic recovery, and state actors, who framed students’ safety and socio-emotional well-being as paramount.

System Integrity Framing Contests

Beyond civil rights and moral imperatives, a third frame of contestation centered around system integrity. States typically used instrumental discourse to frame logistical and technical disruptions as compromising the data’s integrity. Meanwhile, the USDOE, while also using instrumental discourse, framed annual tests as indispensable to preserving the integrity of the accountability system’s infrastructure.

Many state leaders contended that irregular testing conditions in 2020–2021 would render it impossible to collect accurate and fair results, questioning the logic and validity of conducting summative assessments. For instance, Washington, DC, explained, “statewide assessment results would not be valid for their intended uses, reliable, or comparable, and would misrepresent the academic performance of students.” Michigan agreed: “conditions for summative assessment cannot be met, which means summative test results will not be reliable, comparable, generalizable, or valid.” Montana struck a similar tone, insisting that “testing. . .[could not] reasonably provide technically sound, relevant, and accurate information to the public and parents to support the education processes at the local and state level.” This framing underscored the technical constraints of low participation rates, lack of uniform administration, and insufficient resources, arguing that unreliable and misleading data would actually endanger the integrity of the accountability system.

In contrast, the USDOE framed canceling or reducing annual assessments as undermining the very foundation of accountability. Secretary Betsy DeVos (2020) stated that if testing were halted, “transparency and accountability will soon follow out the door,” framing summative data as crucial for protecting system integrity. A blanket form letter that the USDOE sent much later in the process, under the Biden administration, to deny most waiver requests echoed this framing:

While the Department acknowledges the challenges facing all States. . . the assessment, accountability, and reporting elements are central to the purpose of the ESEA in general and to Title I of the ESEA in particular.

In an apparent effort to align with this federal framing, many state leaders publicly endorsed the value of testing even as they requested waivers, a strategy that we, again, labeled pandering. For instance, Georgia conceded that “assessment and accountability have a place in our educational system,” yet critiqued “high-stakes roles” as disconnected from student-centered goals. Oregon described statewide summative assessments as “highly effective. . .as long as foundational conditions are met,” and New Jersey emphasized that they are “critical to the success of accelerating learning.” Despite these nods to federal principles, each state ultimately sought waivers or modifications, reflecting the tension between expressing a commitment to accountability and securing flexibility in the face of pandemic realities.

Overall, these system integrity framing contests centered around whether testing in a disrupted year would bolster accountability or, paradoxically, compromise it by producing flawed data.

Local Autonomy Frames and the Waiver Requests

Alongside civil rights, moral imperatives, and system integrity, all state leaders framed local autonomy as important in their waiver requests to justify context-specific assessment choices. This framing amplified district and classroom prerogatives over uniform federal mandates, positing that educators on the ground could better determine how, when, and whether to test students during the pandemic. Most of the discourse framing local autonomy was instrumental. State actors provided logistical and technical reasons why, as Montana stressed, “schools should be held harmless and granted the maximum flexibility to determine locally how best to serve their students.” These local autonomy frames fell into three general categories by request type: (1) give a version of state summative tests to some students (i.e., partially comply with ESSA), (2) use local benchmark-aligned formative assessments for in-the-moment data (i.e., not comply with ESSA), or (3) use portfolios or screeners (e.g., pre-ACTs) because of distance learning (also not in compliance with ESSA).

Partially Complying with ESSA (Shortened or Modified Summative Assessments)

Leaders in California, Colorado, New Jersey, and Washington requested partial waivers that would retain summative tests but adapt them to local realities by shortening exams, sampling fewer students, or adjusting the timing and administration modalities. In doing so, these state leaders framed local autonomy as a needed compromise between federal accountability and state-level constraints, largely through instrumental discourse. Colorado leaders reported that “two out of three [of our] elementary students and three out of four [of our] middle and high school students [were] receiving remote learning,” arguing that uniform federal mandates could not accommodate such widespread virtual instruction.

All state leaders in this category framed their requests by suggesting a compromise of control, noting that they were willing to retain some of the federal government’s ESSA mandates while recognizing federal oversight, but they requested in return that they maintain local control to meet their students’ needs. Washington leaders described a unique testing sampling plan to create a

state-level snapshot of student performance across a continuum of grades for all three content areas as well as student groups from across Washington. This snapshot would provide the state-level system the necessary information to support system decisions. This approach would not adversely affect or burden schools or individual students with concerns of administering a large-scale summative assessment in multiple content areas but rather allow [schools and teachers] to focus on the important work of instruction.

Additionally, these state leaders suggested that their proposed ESSA adaptations would provide more resources and flexibility for both schools and districts, if granted. New Jersey leaders operationalized this flexibility by detailing that

this waiver [would] enable LEAs [i.e., Local Education Agencies] to concentrate their staffing and scheduling resources on instruction and high-quality, formative assessments expected to provide more immediate and actionable student feedback.

That California, Colorado, New Jersey, and Washington leaders were requesting to comply with parts of ESSA’s testing requirements, but not all of them, suggests that they were framing local autonomy to bridge the gap between federal and local control over their educational systems.

Replacing ESSA Tests with Local Benchmark-Aligned Formative Assessments

Georgia, Michigan, Montana, Delaware, and Oregon leaders argued that district-chosen and benchmark-aligned formative assessments offered more timely and relevant data than full-scale ESSA testing. State actors framed local autonomy as vital for in-the-moment insights, claiming that federal summative data would be neither valid nor actionable amid ongoing disruptions. These state leaders generally used instrumental discourse, such as the idea that local benchmarks provide faster and more actionable feedback. However, they also employed expressive discourse that referenced moral imperatives, stressing the importance of caring for students and communities in real time.

Oregon leaders, for example, emphasized the importance of “adjusting how [they] operationalize[d] [their] commitment to equity by focusing on differences in student access to resources,” indicating a local lens on the pandemic’s uneven impact. Michigan, as quoted earlier, agreed with Oregon and critiqued the springtime utility of ESSA tests, further insisting that “summative assessments could not meet their intended aims” in addressing the ongoing crisis, and thus, local formative measures would better serve instructional decisions.

Collectively, these state leaders went beyond simply requesting not to administer their state assessments; they framed local autonomy as a means to collect more sensitive, helpful, and timely data. They proposed tools that included dashboards with benchmark-aligned formative assessments and other survey instruments to assess students’ mental health issues, access to resources, and family concerns. While these states acknowledged the theoretical value of ESSA testing, thereby pandering to federal norms, they ultimately amplified the frame of local autonomy as the most practical and compassionate solution for assessing student needs during this crisis.

Opting for Portfolios or Screeners Instead of ESSA Tests

A third framing of local autonomy emerged among New York, South Carolina, and Washington, DC, whose leaders proposed using portfolios or pre-/post-screeners instead of ESSA summative assessments. For these states, actors framed local autonomy through instrumental discourse, explaining why more flexible and contextualized methods could yield reliable data, despite limited numbers of students receiving in-person instruction. For example, South Carolina laid out the merits of local autonomy, writing:

These familiar solutions (e.g., pre-screeners) provided districts with both a [sic] historical data perspective as well as a continuum of data to reference and compare the status of student learning pre-pandemic to what is currently taking place in every classroom to date. These assessments were also chosen because each one had the capability to offer a virtual platform for administration, yielded valid and reliable data, and accurately tracked progress and identified challenges for all students. Moreover, districts and schools were able to maximize the time needed to comply with the law in an efficient manner because of their familiarity with these assessments and of the results yielded, and no additional training was needed for the teachers.

State actors collectively argued they wanted to provide teachers with actionable student data and more instructional time to meet student needs.

New York highlighted the “uneven effects of the pandemic” on student growth, arguing that portfolios and screeners offered “immediate and actionable” information for guiding instruction. Washington, DC, embraced the same premise but stood out in this group by praising ESSA testing data as critical for “ambitious goals” and “advancing outcomes.” DC’s request also differed in another pivotal aspect from the waiver requests of New York, South Carolina, and every other waiver-seeking state. Due to a congressionally driven transparency statute enacted when DC won mayoral control of its schools in 2007 (Public Education Reform Amendment Act, 2007), DC was already required to upload interim-assessment results to a citywide dashboard (named LearnDC) so Congress could monitor student progress. With that infrastructure already in place, DC’s waiver pledged to provide district-by-district reports of its proposed pre-/post-screeners and automatic communication of how ESSA funds would be distributed based on the results. By contrast, New York and South Carolina lacked any statewide platform or legal mandate. DC could operationalize local autonomy overnight, while its peers would first have needed to build the scaffolding.

Beyond differences in their ability to report results, each of these states framed local autonomy as indispensable for accommodating students’ academic and socio-emotional realities. By endorsing locally developed tools and measures, these state leaders suggested that districts and classrooms (rather than federal mandates) could best decide how and when to assess progress. As with other frames, these waiver requests also acknowledged ESSA’s principles (i.e., pandering to federal expectations) while ultimately concluding that not complying with ESSA mandates was the only viable way to move forward during a crisis of this scale.

Chronology of Waiver Submissions and Responses

Examining framing contests and amplifications in combination with how the waiver episode unfolded chronologically provides a compelling story of these exchanges (Table 5) and shows that the federal review window had two distinct phases. Secretary DeVos’s September 3, 2020, letter invited states to “submit assessment waivers should circumstances warrant,” but no review rubric was issued until February 22, 2021, when the Biden administration released a streamlined template. In the first phase, seven states (i.e., Georgia, Michigan, Montana, New York, Oregon, South Carolina, and Washington) filed full state testing waivers before the template was introduced. All of these states were (eventually) denied, even Washington, which resubmitted a modified version after the new template was released (thus presumably submitting a more robust application that met federal requirements). Phase two began after February 22, 2021, when six states submitted template-aligned requests (i.e., California, Colorado, Delaware, New Jersey, Washington, and Washington, DC), two of which were flat-out denied (i.e., Delaware and Washington). Table 5 summarizes the filing and response dates for all 12 states; most notably, Washington, DC, filed and was granted its waiver request on the same day.

Table 5

Chronology of State Waiver Submissions and Federal Responses, 2020–2021

State	First Waiver Request	Final Waiver Request (Date and Request)	Waiver Template Used?	USDOE Response Letter	Outcome
Colorado	–	17 Mar 2021 (Grade-band Sampling)	Yes	26 Mar 2021	No waiver needed
California	–	2 Apr 2021 (Shortened Test)	Yes	6 Apr 2021	No waiver needed
Delaware	–	5 Mar 2021 (Suspend State Testing)	Yes	1 Apr 2021	Denied
Georgia	–	18 Feb 2021 (Suspend State Testing)	No	26 Mar 2021	Denied
Michigan	–	25 Jan 2021 (Suspend State Testing)	No	6 Apr 2021	Denied
Montana	8 Jan 2021	5 Feb 2021 (Suspend State Testing)	No	6 Apr 2021	Denied
New Jersey	–	18 Mar 2021 (Shortened Test)	Yes	6 Apr 2021	No waiver needed
New York	–	12 Feb 2021 (Suspend State Testing)	No	6 Apr 2021	Denied
Oregon	22 Jan 2021	1 Apr 2021 (Grade-band Sampling)	Yes	6 Apr 2021	No waiver needed
South Carolina	–	11 Jan 2021 (Suspend State Testing)	No	26 Mar 2021	Denied
Washington (state)	21 Aug 2020	25 Mar 2021 (Suspend State Testing)	Yes	6 Apr 2021	Denied
Washington, DC	–	6 Apr 2021 (Suspend State Testing)	Yes	6 Apr 2021	Full assessment waiver granted
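
As an illustrative aside, the response lags implied by Table 5 can be computed directly from the listed dates. The sketch below is a minimal example and not part of the study's analysis; the names `TABLE_5` and `response_lag_days` are hypothetical, the dates are transcribed from the table as printed, and the lag is measured from each state's final waiver request (not from any earlier first request) to the USDOE response letter.

```python
from datetime import date

# (state, final waiver request date, USDOE response date), transcribed from Table 5.
# Assumption: lag is counted from the FINAL request, ignoring earlier first requests.
TABLE_5 = [
    ("Colorado", date(2021, 3, 17), date(2021, 3, 26)),
    ("California", date(2021, 4, 2), date(2021, 4, 6)),
    ("Delaware", date(2021, 3, 5), date(2021, 4, 1)),
    ("Georgia", date(2021, 2, 18), date(2021, 3, 26)),
    ("Michigan", date(2021, 1, 25), date(2021, 4, 6)),
    ("Montana", date(2021, 2, 5), date(2021, 4, 6)),
    ("New Jersey", date(2021, 3, 18), date(2021, 4, 6)),
    ("New York", date(2021, 2, 12), date(2021, 4, 6)),
    ("Oregon", date(2021, 4, 1), date(2021, 4, 6)),
    ("South Carolina", date(2021, 1, 11), date(2021, 3, 26)),
    ("Washington (state)", date(2021, 3, 25), date(2021, 4, 6)),
    ("Washington, DC", date(2021, 4, 6), date(2021, 4, 6)),
]


def response_lag_days(filed: date, answered: date) -> int:
    """Calendar days between a final waiver request and the USDOE response."""
    return (answered - filed).days


# Map each state to its response lag in days.
lags = {state: response_lag_days(filed, answered) for state, filed, answered in TABLE_5}

# Print states from shortest to longest wait.
for state, days in sorted(lags.items(), key=lambda item: item[1]):
    print(f"{state}: {days} days")
```

Under these assumptions, the lag ranges from zero days (Washington, DC) to more than two months for states that filed in January.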

Federal Government Response to States

In the end, the federal government responded to all 12 states’ assessment waiver requests in one of three ways: (1) they uniformly denied the request (n = 8), (2) they granted the request (n = 1), or (3) they considered the waiver request unnecessary because the proposed alternative plans were considered in compliance with ESSA mandates (n = 3; see Table 5). Put differently, one full waiver request was granted outright, and that was the request submitted from Washington, DC. Eight were denied. Three others (i.e., California, Colorado, New Jersey) were considered in compliance enough not to need a waiver. These states’ leaders aligned their requests with the February 22, 2021, waiver template and requested to postpone the assessments or conduct them only for certain grade levels for feasibility purposes.

USDOE letters to these states were short and formulaic. In fact, these three states received the same letter with their state information filled into default sections. As such, these letters provided very little insight into how USDOE leaders made their decisions about the acceptance or denial of these waivers. Also, while the USDOE’s responses to states were formulaic and ambiguous, the initial letter sent to all states by Secretary of Education Betsy DeVos employed expressive discourse to garner support and urgency for reinstating ESSA test accountability. What DeVos initially wrote to all states provides context for our findings:

Make no mistake. If we fail to assess students, it will have a lasting effect for years to come. Not only will vulnerable students fall behind, but we will be abandoning the important, bipartisan reforms of the past two decades at a critical moment. Opponents of reform, like labor unions, have already begun to call for the permanent elimination of testing. If they succeed in eliminating assessments, transparency and accountability will soon follow.

Here we see DeVos leaning into expressive and symbolic language while amplifying the frame of ESSA testing as the right thing to do, even in the midst of a global pandemic.

The Political Spectacle and Backstage Decision-Making

Secretary DeVos’s September 3, 2020, letter functioned as opening night of what we came to view as pandemic political theater. Using expressive discourse, she called for state leaders to show the same “resolve” as “doctors, nurses, police officers, grocery clerks, and other essential workers” and insisted that “now is our opportunity to show that same spirit is present in America’s education leaders as we work to safely reopen schools and to successfully educate our nation’s children” (DeVos, 2020). By casting educators as front-line heroes, the Secretary amplified the frame that annual testing was not merely a compliance exercise; it was a moral imperative woven into the nation’s pandemic response.

States answered on the same public stage with lengthy waiver submissions (one over 200 pages) that blended instrumental details with expressive appeals. Several tried to bolster legitimacy by highlighting the results of their public comment periods or surveys. For example, South Carolina pointed to “over 33,000 public comments” opposing spring testing, while New York reported that “88.5 percent of respondents did not support administering any state assessments.” Although these numbers represented only small fractions of their total school-aged populations (and no sampling frames or response rate data were supplied), they were amplified as evidence that ordinary citizens stood behind the waivers.

Yet the USDOE’s replies were uniform one- or two-page form letters that ignored the tallied voices and arrived with little explanation beyond “does not meet requirements.” The striking asymmetry between expansive state requests and perfunctory federal responses staged what Edelman (1988) calls the illusion of democratic participation: elaborate on-stage rituals that invite citizens to speak, even as the decisive criteria are set elsewhere and shielded from view. These patterns foreshadow the discussion that follows, where we interpret the divergence as a classic political spectacle (Edelman, 1988).

Discussion

The 2020–2021 federal assessment waiver episode offered a rare window into the symbolic underpinnings of U.S. accountability policy. Overall, our data revealed a compelling story of how education leaders leveraged instrumental (concrete requests or technical solutions) and expressive (emotional or moral appeal) discourse tactics (Hall Jamieson & Waldman, 2003; Itkonen, 2007) to justify and advocate for their position to keep (mostly USDOE actors) or reject (mostly state actors) summative testing during COVID. Three overarching findings frame our discussion.

First, partisan alignment shaped waiver behavior. Second, the waiver correspondence revealed framing contests in which state and federal actors deployed the same civil-rights, moral, and system-integrity rhetoric to justify opposite courses of action. Third, stakeholder voice and federal response diverged sharply, a pattern that becomes even more apparent when the review timeline is overlaid on the correspondence. The sections that follow examine each of our three overarching findings and consider what they imply for future federal-state negotiations over accountability.

Whose Story Matters? Partisanship Breakdown

Our analysis revealed that how states voted in the most recent Presidential election (relative to all elections dating back to 1992) was the strongest predictor of who submitted a full assessment-waiver request, with Democratic-leaning states more likely to submit than Republican-leaning ones. These data indicate that party affiliation, more than the other demographic characteristics of the states analyzed in this study (albeit correlated), determined which state leaders stepped onto the waiver stage and spoke into the microphone. This confirms Bruno and Goldhaber’s (2021) descriptive observation that state decisions around waivers were politically structured, with Democratic states more willing to press for broad relief. We also build on this observation in two ways.

First, regression analyses indicated that states with a more extended history of voting patterns associated with electing Democrats were more likely to seek waivers from testing mandates; ten of the twelve waiver-seeking states had supported the Democratic nominee in every presidential election since 1992 (i.e., the last nine presidential elections). Second, regardless of partisanship, the waivers themselves followed a similar narrative arc. Each began with a nod to the value of statewide testing (i.e., what we called pandering), then pivoted to a story of decline, in which the pandemic rendered high-stakes tests statistically unreliable and ethically fraught. The narrative then closed with a story of control, laying out an alternative plan that waivers could support. The partisan divide, then, was not about how the crisis was narrated; both parties used the same dramatized rhetoric. It was about who stepped onto the stage and whose voices were heard in the waiver debate. Apart from Montana and South Carolina, Republican-leaning states sought only narrow flexibilities, or requested no waiver at all, leaving their political leaders’ views of federal accountability during COVID off the public stage. In this way, partisanship determined the cast list, while the actors, regardless of their party affiliations, delivered similar scripts.

Symmetric Stories, Divergent Remedies: Framing Contests and Amplifications

Once on stage, state and federal actors strategically leveraged frame amplification by calling for values like equity, care, and integrity and framing these values in ways that strengthened their positions (Benford & Snow, 2000). These amplifications strategically wove compelling and connected stories of decline and control (e.g., Stone, 2002).

Broadly, states told stories of decline that amplified empty classrooms, remote testing security dilemmas, unstable internet connections, and escalating mental health needs to show why pre-pandemic statewide testing accountability practices were impractical. Expressive discourse was drawn upon to amplify civil rights and moral imperative frames, where state authors argued that students and communities needed relief from federal mandates for reasons of equity and care (e.g., Hall Jamieson & Waldman, 2003; Itkonen, 2007). Then, states pivoted to stories of control that showcased how local autonomy was the most practical remedy to address these values. Using instrumental discourse, waiver-seeking states amplified a system integrity frame that questioned the validity of annual exams under these conditions and how local autonomy could uphold the spirit of accountability in ways that would provide valid and actionable data for the challenges they were facing.

The USDOE took a different approach by inverting the narrative arc employed by states. Acknowledging disruption, the USDOE positioned federal testing mandates as the needed stabilizing device for such unprecedented times. They did this by amplifying a system integrity frame, using a connective thread that pulled through the idea Betsy DeVos introduced to the public sphere in her initial letter to states, that without uniform data, “transparency and accountability will soon follow out the door.” In the USDOE narrative, the story of decline utilized expressive language to center the loss of integrity (comparability) if exams were not given and called upon educators to uphold their moral duty to accountability. This was followed by instrumental discourse used to describe what federal accountability actions should be taken to protect system integrity; the amplification of this frame told a story of control where care and equity were best served by staying the course and upholding pre-pandemic federal mandates.

In a telling policy paradox (Stone, 2002), three overarching frames (civil rights, moral imperatives, and system integrity) were employed by both the USDOE and waiver-seeking states, but amplified toward opposing ends. Using a civil rights frame, states cast equity as a shield against the pandemic’s unequal burdens, while the USDOE wielded equity as a sword to keep vulnerable students visible. Through a moral imperative frame, states elevated care as an immediate moral duty to nurture students’ well-being during the pandemic. In contrast, the USDOE described care as educators’ duty to uphold accountability in service of the collective good. States drew on a system integrity frame to identify threats the pandemic created for validity in longitudinal data and to argue for local flexibility, while the USDOE warned that abandoning mandates would undermine the integrity of these same data. These framing contests (Dodge & Metze, 2024) revealed how the very values that legitimize high-stakes assessments in normal times can justify their suspension during crises. They also exposed what Bruno and Goldhaber (2021) described as a persistent absence of a coherent theory of action for U.S. accountability policies (i.e., no shared understanding of who would use test data, how, and to what ends). In the resulting vacuum, stakeholders can repurpose values like equity, care, and integrity into rhetorical currency, mobilized for political gain rather than genuine guidance for policy.

Participation, Power, and the Pandemic Spectacle

Betsy DeVos’s essential worker call to duty (i.e., calling U.S. teachers essential workers to push for school reopenings nationwide) exemplified the use of symbolic language to mobilize action. It also raised the curtain on “a carefully staged drama that evokes approval while leaving the distribution of real power untouched” (Edelman, 1988, p. 10). Once onstage, state leaders delivered monologues in which instrumental plans were woven with expressive calls that amplified the civil-rights, moral imperative, and system integrity frames traced earlier. The USDOE answered each speech with a curt, uniform letter that ignored the public tallies and expressive framing, denying the majority of requests. Here, the substantive deliberation remained backstage, shielded from scrutiny, while the surface exchanges projected openness. Such a sequence fostered an illusion of democratic participation (Edelman, 1988) because the waiver process appeared negotiable but offered little observable room to deviate from the standing testing mandate.

Timing as Stagecraft

The theatrics of the waiver episode become clearer when the correspondence is read against its timeline, or in this case, as a two-act play. In the first act, stretching from DeVos’s invitation on September 3, 2020, to the release of the Biden administration’s waiver template on February 22, 2021, seven states filed full suspension waivers. All were denied. In the second act, after the February 22, 2021, template release, six states filed (or refiled) waiver requests. Of these six, Colorado, New Jersey, and California were told no waiver was necessary. Delaware was denied. Washington, the only state that chose to refile its waiver when the template appeared on stage, was denied a second time. Finally, Washington, DC, secured a full exemption (see more next).

These two acts confirm Lam’s (2021) observation that ESSA’s discretionary language offered too little guidance, forcing states into a guessing game of shifting standards. The record then suggests that while template alignment became the entry ticket, annual statewide testing remained the default norm, and only proposals that matched the template and met additional unstated evidentiary expectations cleared the federal bar.

The Washington, DC, Finale

On April 6, 2021, seven waiver-seeking states received their USDOE decisions, and Washington, DC’s application was both filed and granted. While DC’s waiver story did not differ markedly from that of other states, what distinguished its waiver was the mechanics and timing. An existing statute required DC districts to report interim-assessment data on a public dashboard for the U.S. Congress, providing DC the unique ability to slot pandemic screeners into this existing infrastructure. This operational advantage allowed DC leaders to explain how their proposal of screeners would immediately communicate data and resource allocation decisions to stakeholders. Additionally, DC was the one waiver-seeking state that filed and was granted its waiver on the same day; some states waited months for a response. Although the written archive does not show how reviewers weighted each factor, Washington, DC’s lone and lightning-fast approval stole the final spotlight in this pandemic theater, prompting speculation about a potential backstage pass. Simultaneously, it underscored the need for more precise and transparent waiver criteria, if future crises are to yield much more than another round of spectacle.

Limitations and Implications for Policy and Practice

We note at least two limitations of this study. First, our evidence and findings are limited to a single, unprecedented policy window. The 2020–2021 COVID testing waiver cycle was a specific and unique moment in time when pandemic conditions, partisan politics, and a change of federal administration converged. Second, the qualitative evidence relied on letters, templates, and attachments released by states and the USDOE. This reliance on public correspondence does not include informal negotiations that may have shaped outcomes. As such, our reading of federal decision-making is necessarily inferential.

Even with these limitations, the findings we present herein offer a few key implications for policy and practice that can help actors mitigate future framing contests and political spectacle surrounding accountability debates.

Publish the evidentiary bar before invitations are sent out. When the USDOE exercises discretionary authority, publishing criteria in advance that describe the balance of instrumental criteria (e.g., participation thresholds, psychometric standards) and other justifications (e.g., how civil rights safeguards must be demonstrated) would reduce the need for expressive frame amplifications and minimize the perception of moving targets.

Weave alternative metrics into a statute-anchored story. Technical critiques of annual tests gained traction only when cast inside a civil-rights frame that the USDOE already embraced. Reformers who hope to loosen the grip of high-stakes testing must craft a new story of control that merges local evidence with a re-imagined—but legally durable—vision of system integrity. Without that expressive reframing, additional datasets are unlikely to shift policy.

Set minimum federal guidelines for interim and benchmark tests. Nine states proposed locally validated assessments, but the lack of federal parameters left those plans vulnerable. A concise set of validity, reliability, and reporting expectations or blueprints (e.g., as informed by the Standards for Educational and Psychological Testing, co-developed and endorsed by the American Educational Research Association [AERA], American Psychological Association [APA], and National Council on Measurement in Education [NCME], 2014) would provide an off-the-shelf option in future emergencies and lessen reliance on discretionary waivers.

Conclusion

In many ways, the 2020–2021 waiver episode functioned less as a routine bureaucratic procedure and more as pandemic political theater. Partisan alignment determined who claimed the microphone; competing frames supplied the dialogue; and asymmetrical replies revealed where authority ultimately resided. However, when viewed through the long lens, across four decades of federal policy (i.e., A Nation at Risk, NCLB and its AYP measurement tenets, RttT, the flexibility bargains of ESSA), a single story of control reveals itself as remarkably resilient: Annual, comparable test scores are the surest guarantor of upholding civil rights in education.

Our data reveal how deeply that story is woven into the fabric of U.S. education policy. The pandemic did crack open a fleeting policy window (Kingdon, 2011), but that window closed as soon as the USDOE judged that threats to comparability outweighed the risks of incomplete data. In that moment, the system integrity frame eclipsed the competing appeals to care, equity, and local autonomy.

As Stone (2002) reminds us, policy stories possess a symbolic resilience that data alone seldom dislodge, an insight echoed by D. C. Berliner and Biddle (1995) and Koretz (1996, 2017), whose critiques of test-based accountability show how grand narratives about standards and equity can outlast a mountain of technical counterevidence. Until reformers can craft resonant stories of decline and control, stories that anchor moral and civil-rights imperatives with a reimagined (but legally durable) vision of system integrity, the same script of annual comparable exams will continue to dominate the stage, no matter how many crises, stakeholders, or datasets question its logic.

The stakes are rising. At this writing, we are experiencing unprecedented changes at the federal level aimed at education. Among them is the systematic dismantling of the USDOE itself, as well as many federally supported data systems, many of which are used to help trace the quality of (and equality and equity throughout) the U.S.’s public schools. Additionally, we see many federal leaders arguing that states (not the federal government) should have more power over their education systems. If states are given more power to determine their educational destinies, we may end up with a patchwork of accountability systems that are dramatically different from state to state. As results from our study suggest (or predict), we may see Republican-leaning/voting states continue to take a hard-line approach in embracing test-based high-stakes accountability systems, and Democratic-leaning/voting states continue to take a more humanistic and holistic approach that may lessen the importance of tests.

While we do not know the future, it does seem clear that we do not agree on the best approach to ensure equity across educational opportunity inputs and outcomes. A durable, nation-spanning commitment to educational equity, then, demands more than technical tweaks; it requires a compelling counter-narrative capable of transcending partisan lines. That story must protect the transparency and comparability that parents and civil-rights advocates rightly demand while acknowledging the profound human costs of reductive, one-size-fits-all testing regimes. Crafting such a story is the unfinished work, but if we fail to seize this moment, the next crisis will cue the same play, performed on an ever more fractured stage. Our students deserve a better script.

Appendix

Appendix A

Detailed Description of Three-Phase Qualitative Data Analysis

Phase One	We first employed structural and open coding to rearrange each state’s waiver request communications into smaller pieces (Saldaña, 2013). We constructed structural categories (i.e., setup/background/context, the ask, the plan, the justification, and how teachers and students were described) to help us organize the large amounts of content across waiver requests. We individually coded the data and then met to discuss discrepancies to bring us to agreement across coding decisions. This structural pass helped us “break the text down” (Saldaña, 2013, p. 90) into manageable segments while preserving when and how each request, rationale, or appeal was laid out. During this stage, we also wrote analytic memos noting recurrent themes such as “equity,” “well-being,” or “unprecedented.” These memos narrated our decision-making processes, and decisions informed our subsequent theoretical coding (Saldaña, 2013).
Phase Two	In Phase Two, we revisited each excerpt and applied theoretical coding (Saldaña, 2013). While state waiver requests were geared toward obtaining some aspect of relief from federal testing mandates and broadly had a solution-oriented focus, within that context, we differentiated each excerpt (e.g., set-up, ask, justification) as using expressive or instrumental discourse (i.e., Itkonen, 2007). We coded an excerpt as instrumental when its authors’ main argument rested on a concrete policy or logistical objective, such as partial waiver requests or data validity considerations. We coded an excerpt as expressive when a substantial portion of the text centered on emotional appeals or moral considerations, such as teachers’ anxiety or pandemic hardships. During this coding exercise, we also captured a new category in which we collected instances evidencing how states were what we called pandering, defined when we collectively believed that state leaders were attempting to engender approval prior to, and in favor of the subsequent waiver request by avowing allegiance or reliance on ESSA mandates during normal times. We then looked across these discourses for policy stories using Stone’s (1997) conceptualization of decline versus control narratives. Excerpts were coded as part of a story of decline if they portrayed conditions as worsening, for example, by emphasizing “unprecedented times” or “remote learning equity concerns” that rendered testing impossible or unjust. By contrast, a story of control emerged when policy actors’ discourse centered on either the continued legitimacy of standardized tests under normal circumstances or they proposed solutions to the challenge of testing during this crisis (e.g., sampling students, partial waivers). If an excerpt began with a description of worsening conditions (decline) but excerpt authors ultimately argued that upholding testing was still paramount, we considered it a control story because the excerpt concluded on a note of how “things could be better” with a waiver. Like phase one, the three of us independently coded excerpts and then met to discuss areas of disagreement. We refined our definitions of stories of decline and control and what constituted expressive discourse versus instrumental discourse. We wrote additional analytic memos documenting overlaps, for example, when letters began with emotional pleas about student well-being but also cited test score data to reinforce an argument (Saldaña, 2013).
Phase Three	In Phase 3, we employed axial coding (Saldaña, 2013) to examine how issues were presented across state excerpts and pinpoint how the policy problem was framed, as well as evidence indicating public engagement in the waiver process (i.e., evidence of political spectacle). To do this, we systematically examined ways in which frames overlapped or clashed within, across, and in response to waiver requests. We began by revisiting the analytic memos and excerpts we coded in Phase Two, searching for patterns in how state and federal letter authors framed their arguments around testing waivers. We identified four dominant frames: (1) moral imperative, (2) civil rights, (3) system integrity, and (4) local autonomy.

Acknowledgements

No Acknowledgements

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Open Practices

Open Practices Statement Forthcoming

ORCID iDs

Imogen R. Herrick

Audrey Amrein-Beardsley

Sharon Nichols

Authors

IMOGEN R. HERRICK is an assistant professor of STEM Education at the University of Kansas; email: iherrick@ku.edu. For her research, she examines the intersection of emotion, place, and environmental justice. She also examines youth agency, teacher motivation, and deeper learning through STREAMS (Science, Technology, Reading, Engineering, Arts, Mathematics, and Social Studies).

AUDREY AMREIN-BEARDSLEY is a professor in the Mary Lou Fulton Teachers College at Arizona State University; email: audrey.beardsley@asu.edu. For her research, she focuses on the use of tests and value-added models (VAMs) in and across states before and since the passage of the Every Student Succeeds Act (ESSA). More specifically, she is conducting validation studies on multiple test-based accountability systems and policies, as well as serving as an expert witness in many legal cases surrounding the (mis)use of test-based output.

SHARON NICHOLS is department chair and professor in Educational Psychology at the University of Texas at San Antonio; email: sharon.nichols@utsa.edu. Her current work focuses on the impact of test-based accountability on teachers, their instructional practices, and adolescent motivation and development.

References

American Educational Research Association (AERA), American Psychological Association (APA), & National Council on Measurement in Education (NCME). (2014). Standards for educational and psychological testing. Author. https://www.testingstandards.net/open-access-files.html

American Recovery and Reinvestment Act of 2009, Public Law 5, U.S. Statutes at Large 123 (2009): 115–521.

Amrein-Beardsley (2019). The Education Value-Added Assessment System (EVAAS) on trial: A precedent-setting lawsuit with implications for policy and practice. eJournal of Education Policy. https://files.eric.ed.gov/fulltext/EJ1234497.pdf

Amrein-Beardsley, Berliner, D. C., & Rideau (2010). Cheating in the first, second, and third degree: Educators’ responses to high-stakes testing. Education Policy Analysis Archives, 18, 14. https://doi.org/10.14507/epaa.v18n14.2010

Amrein-Beardsley (2019). Teacher-level value-added models (VAMs) on trial: Empirical and pragmatic issues of concern across five court cases. Educational Policy, 35(6), 1–42. https://doi.org/10.1177/0895904819843593

Ballou, & Springer, M. G. (2017). Has NCLB encouraged educational triage? Accountability and the distribution of achievement gains. Education Finance and Policy, 12(1), 77–106. https://doi.org/10.1162/edfp_a_00189

Benford, R. D., & Snow, D. A. (2000). Framing processes and social movements: An overview and assessment. Annual Review of Sociology, 26(2000), 611–639. https://doi.org/10.1146/annurev.soc.26.1.611

Berliner (2011). Rational responses to high-stakes testing: The case of curriculum narrowing and the harm that follows. Cambridge Journal of Education, 41(3), 287–302. https://doi.org/10.1080/0305764X.2011.607151

Berliner, D. C., & Biddle, B. J. (1995). The manufactured crisis: Myths, fraud, and the attack on America’s public schools. Addison-Wesley.

Bonilla, & Dee, T. S. (2018). The effects of school reform under NCLB waivers: Evidence from focus schools in Kentucky. Education Finance and Policy, 15(1), 75–103. https://doi.org/10.3386/w23462

Bruno, & Goldhaber (2021). What pandemic-related test waiver requests suggest about states’ testing priorities. Phi Delta Kappan, 103(3), 48–53. https://doi.org/10.1177/00317217211058525

Carter, P. L., & Welner, K. G. (Eds.). (2013). Closing the opportunity gap: What America must do to give every child an even chance. Oxford University Press.

Center on Organization and Restructuring of Schools. (1995). Successful school restructuring. Newmann & Wehlage.

Chubb, J. E., & Moe, T. M. (2007). Politics, markets, and America’s schools. Brookings Institution Press.

Collins, & Amrein-Beardsley (2014). Putting growth and value-added models on the map: A national overview. Teachers College Record, 16(1), 1–32. http://www.tcrecord.org/Content.asp?ContentId=17291

Creswell, J. W., & Plano Clark, V. L. (2010). Designing and conducting mixed methods research (2nd ed.). Sage.

De Voto, Superfine, B. M., & DeWit (2023). Navigating policy and local context in times of crisis: District and school leader responses to the COVID-19 pandemic. Educational Administration Quarterly, 59(2), 339–383. https://doi.org/10.1177/0013161x231163870

Dee, T. S., & Jacob, B. A. (2011). The impact of No Child Left Behind on student achievement. Journal of Policy Analysis and Management, 30(3), 418–446. https://doi.org/10.3386/w15531

DeSilver (2017). U.S. students’ academic achievement still lags that of their peers in many other countries. Pew Research Center. https://www.pewresearch.org/short-reads/2017/02/15/u-s-students-internationally-math-science/

DeVos (2020). Secretary’s letter to Chief State School Officers regarding administering summative assessments during the 2020–21 school year. Department of Education. https://www.ed.gov/secretarys-letter-chief-state-school-officers-regarding-administering-summative-assessments-during

Dillon (2010). Obama to seek sweeping change in ‘No Child’ law. New York Times. http://www.nytimes.com/2010/02/01/education/01child.html?pagewanted=all

Dumas, & Anyon (2006). Toward a critical approach to education policy implementation. In Honig (Ed.), New directions in education policy implementation: Confronting complexity (pp. 149–168). State University of New York Press.

Duncan (2009). The race to the top begins: Remarks by secretary Arne Duncan. https://www2.ed.gov/news/pressreleases/2009/07/07242009.html

Dodge, & Metze (2024). Approaches to policy framing: Deepening a conversation across perspectives. Policy Sciences, 57(2), 221–256. https://doi.org/10.1007/s11077-024-09534-9

Edelman (1988). Constructing the political spectacle. University of Chicago Press.

Edelman (2001). The politics of misdirection. Cambridge University Press.

Education Week. (2015). Teacher evaluation heads to the courts. http://www.edweek.org/ew/section/multimedia/teacher-evaluation-heads-to-the-courts.html

Every Student Succeeds Act (ESSA) of 2015, Pub. L. No. 114–95, § 114 Stat. 1177. (2015). https://www.congress.gov/bill/114th-congress/senate-bill/1177

Fahle, E. M., Kane, T. J., Patterson, Reardon, S. F., Staiger, D. O., & Stuart, E. A. (2023). School district and community factors associated with learning loss during the COVID-19 pandemic. Center for Education Policy Research at Harvard University. https://cepr.harvard.edu/sites/hwpi.harvard.edu/files/cepr/files/explaining_covid_losses_5.23.pdf

Geiger, T. J., Amrein-Beardsley, & Holloway (2020). Using test scores to evaluate and hold school teachers accountable in New Mexico. Educational Assessment, Evaluation and Accountability, 32(2), 187–235. https://doi.org/10.1007/s11092-020-09324-w

Gershenson (2016). Performance standards and employee effort: Evidence from teacher absences. Journal of Policy Analysis and Management, 35(3), 615–638. https://doi.org/10.17848/wp15-217

Goldstein (2019). ‘It just isn’t working:’ PISA test scores cast doubt on U.S. education efforts. New York Times. https://www.nytimes.com/2019/12/03/us/us-students-international-test-scores.html

Grissom, J. A., Nicholson-Crotty, & Harrington, J. R. (2014). Estimating the effects of No Child Left Behind on teachers’ work environments and job attitudes. Educational Evaluation and Policy Analysis, 36(4), 417–436. https://doi.org/10.3102/0162373714533817

Gusfield, J. R. (1986). Symbolic crusade: Status politics and the American temperance movement. University of Illinois Press.

Haladyna, T. M., Nolen, N. S., & Haas, S. B. (1991). Raising standardized achievement test scores and the origins of test score pollution. Educational Researcher, 20(5), 2–7. https://doi.org/10.2307/1176395

Hall Jamieson, & Waldman (2003). The press effect: Politicians, journalists, and the stories that shape the political world. Oxford University Press.

Hamilton, L. S., Stecher, B. M., & Yuan (2008). Standards-based reform in the United States: History, research, and future directions. RAND Corporation. https://www.rand.org/content/dam/rand/pubs/reprints/2009/RAND_RP1384.pdf

Hanushek, E. A., Peterson, P. E., Talpey, L. M., & Woessmann (2019). The achievement gap fails to close. Education Next, 19(3), 8–17. https://www.educationnext.org/achievement-gap-fails-close-half-century-testing-shows-persistent-divide/

Harris, D. N., Liu, & Barrett (2023). Is the rise in high school graduation rates real? High-stakes school accountability and strategic behavior. Labour Economics, 82, 102355. https://doi.org/10.1016/j.labeco.2023.102355

Hawkins, & Holden (2013). Framing the alcohol policy debate: Industry actors and the regulation of the UK beverage alcohol market. Critical Policy Studies, 7(1), 53–71. https://doi.org/10.1080/19460171.2013.766023

Holbein, J. B., & Ladd, H. F. (2017). Accountability pressure: Regression discontinuity estimates of how No Child Left Behind influenced student behavior. Economics of Education Review, 58, 55–67. https://doi.org/10.1016/j.econedurev.2017.03.005

Hursh, D. W. (2008). High stakes testing and the decline of teaching and learning: The real crisis in education. Rowman & Littlefield.

Itkonen (2007). Politics of passion: Collective action from pain and loss. American Journal of Education, 113(4), 577–604. https://doi.org/10.1086/518489

Jack, & Oster (2023). COVID-19, school closures, and outcomes. Journal of Economic Perspectives, 37(4), 51–70. https://pubs.aeaweb.org/doi/pdfplus/10.1257/jep.37.4.51

Kane, & Reardon (2023). Parents don’t understand how far behind their kids are in school. New York Times. https://www.nytimes.com/interactive/2023/05/11/opinion/pandemic-learning-losses-steep-but-not-permanent.html

Karaian, & June (2020). The things our bosses said a lot this year. New York Times. https://www.nytimes.com/2020/12/29/business/dealbook/words-of-the-year-mute-unprecedented.html

Kingdon, J. W. (2011). Agendas, alternatives, and public policies (Updated 2nd ed.). Longman.

Klein (2015). No Child Left Behind: A summary of the law. Education Week. https://www.edweek.org/policy-politics/no-child-left-behind-an-overview/2015/04

Koretz (1996). Using student assessments for educational accountability. In Hanushek, E. A., & Jorgenson, D. W. (Eds.), Improving America’s schools: The role of incentives (pp. 171–195). National Academy Press.

Koretz (2017). The testing charade: Pretending to make schools better. University of Chicago Press.

Kress (2011). On ‘No Child,’ no going back: An architect of education reform says Obama’s wobbly on accountability. New York Daily News. http://www.nydailynews.com/opinion/child-back-architect-education-reform-obama-wobbly-accountability-article-1.113184#ixzz27Osjk8Jq

Lam (2021). Advancing student achievement through Elementary and Secondary Education Act waivers. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3793128

Layton (2012). Rethinking the classroom: Obama’s overhaul of public education. Washington Post. http://www.washingtonpost.com/local/education/rethinking-the-classroom-obamas-overhaul-of-public-education/2012/09/20/a5459346-e171-11e1-ae7f-d2a13e249eb2_print.html

Levy, Brunner, Keller, & Fischbach (2019). Methodological issues in value-added modeling: An international review from 26 countries. Educational Assessment, Evaluation and Accountability, 31, 257–287. https://link.springer.com/article/10.1007/s11092-019-09303-w

Lewin (2012). Backer of Common Core school curriculum is chosen to lead College Board. New York Times. http://www.nytimes.com/2012/05/16/education/david-coleman-to-lead-college-board.html

Maxwell, J. A. (2013). Qualitative research design: An interactive approach (3rd ed.). SAGE.

McLeod, S. A. (2019). What a p-value tells you about statistical significance. Simply Psychology. www.simplypsychology.org/p-value.html

Meyer, R. H. (1997). Value-added indicators of school performance: A primer. Economics of Education Review, 16(3), 283–301. https://doi.org/10.1016/s0272-7757(96)00081-7

Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: A sourcebook. Sage.

Nelson, T. E. (1999). Group affect and attribution in social policy opinion. Journal of Politics, 61(2), 331–362. https://doi.org/10.2307/2647507

Nelson, T. E., & Kinder, D. R. (1996). Issue frames and group-centrism in American public opinion. Journal of Politics, 58(4), 1055–1078. https://doi.org/10.2307/2960149

Nichols, S. L., & Berliner, D. C. (2007). Collateral damage: How high-stakes testing corrupts America’s schools. Harvard Education Press.

No Child Left Behind (NCLB) Act of 2001, Pub. L. No. 107–110, § 115 Stat. 1425. (2002). https://www2.ed.gov/policy/elsec/leg/esea02/index.html

Obama, B. H. (2011). The President in Miami: Winning the future through investments in education. The White House. https://obamawhitehouse.archives.gov/blog/2011/03/04/president-miami-winning-future-through-investments-education

Paige, M. A., & Amrein-Beardsley (2020). “Houston, we have a lawsuit”: A cautionary tale for the implementation of value-added models (VAMs) for high-stakes employment decisions. Educational Researcher, 49(5), 350–359. https://doi.org/10.3102/0013189X20923046

Paige, M. A., Amrein-Beardsley, & Close (2019). Tennessee’s national impact on teacher evaluation law & policy: An assessment of value-added model litigation [Law Review]. Tennessee Journal of Law & Policy, 13(2), 523–574. https://doi.org/10.70658/1940-4131.1007

Race to the Top Act of 2011, S. 844, 112th Congress. (2011). https://obamawhitehouse.archives.gov/issues/education/k-12/race-to-the-top

Reardon, S. F. (2018). The widening academic achievement gap between the rich and the poor. In Murnane, & Duncan (Eds.), Whither opportunity? Rising inequality and the uncertain life chances of low-income children (pp. 91–116). Russell Sage Foundation Press.

Rhee (2011). The evidence is clear: Test scores must accurately reflect students’ learning. The Huffington Post. http://www.huffingtonpost.com/michelle-rhee/michelle-rhee-dc-schools_b_845286.html

Rotherham, A. J. (2011). School of thought: 11 education activists for 2011. Time. https://content.time.com/time/specials/packages/completelist/0,29569,2040867,00.html

Sahlberg (2011). Finnish lessons: What can the world learn from educational change in Finland? Teachers College Press.

Saldaña (2013). The coding manual for qualitative research (2nd ed.). Sage.

Shah (2013). “Race to the Top” a flop. Politico. https://www.politico.com/story/2013/09/race-to-the-top-for-education-a-flop-report-finds-096709

Sinha, & Gasper (2010). How can power discourses be changed? Contrasting the “daughter deficit” policy of the Delhi government with Gandhi and King’s transformational reframing. Critical Policy Studies, 3(3–4), 290–308. https://doi.org/10.1080/19460171003619717

Smith, M. L., Miller-Kahn, Heinecke, & Jarvis, P. F. (2003). Political spectacle and the fate of American schools. Routledge.

Sørensen, T. B. (2016). Value-added measurement or modelling (VAM). Education International. http://download.ei-ie.org/Docs/WebDepot/2016_EI_VAM_EN_final_Web.pdf

Spellings (2012). What if? Huffington Post. http://www.huffingtonpost.com/margaret-spellings/what-if_5_b_1910679.html

Stanford (2023). Education Secretary: Standardized tests should no longer be a ‘hammer.’ Education Week. https://www.edweek.org/policy-politics/education-secretary-standardized-tests-should-no-longer-be-a-hammer/2023/01

Stone, D. A. (1997). The doctor as businessman: The changing politics of a cultural icon. Journal of Health Politics, Policy and Law, 22(2), 533–556. https://doi.org/10.1215/03616878-22-2-533

Stone, D. A. (2002). Policy paradox: The art of political decision making (Rev. ed.). Norton.

Strassfeld, N. M., & Voulgarides, C. K. (2022). Federal oversight, district compliance, and IDEA waivers during the COVID-19 pandemic. Journal of Disability Law and Policy in Education, 1(1), 1–9.

Sunderman, G. L. (Ed.). (2008). Holding NCLB accountable: Achieving accountability, equity, & school reform. Corwin Press.

Tan (2021). School closures were over-weighted against the mitigation of COVID-19 transmission: A literature review on the impact of school closures in the United States. Medicine, 100(30), e26709. https://doi.org/10.1097/MD.0000000000026709

Timar, T. B., & Maxwell-Jolly (Eds.). (2012). Narrowing the achievement gap: Perspectives and strategies for challenging times. Harvard Education Press.

Turner (2020). Betsy DeVos says students will need to take federal standardized tests this year. National Public Radio. https://www.npr.org/2020/09/17/911482773/betsy-devos-says-students-will-need-to-take-federal-standardized-tests-this-year

U.S. Department of Education (USDOE). (1983). A nation at risk: The imperative for educational reform. https://files.eric.ed.gov/fulltext/ED226006.pdf

U.S. Department of Education (USDOE). (2017). Race to the Top: Implementation and relationship to student outcomes. https://ies.ed.gov/ncee/pubs/20174001/pdf/20174001.pdf

Weiss (2013). Mismatches in Race to the Top limit educational improvement. Economic Policy Institute. https://www.epi.org/publication/race-to-the-top-goals/

Whitney, C. R., & Candelaria, C. A. (2017). The effects of No Child Left Behind on children’s socioemotional outcomes. AERA Open, 3(3), 2332858417726324. https://doi.org/10.1177/2332858417726324

Yin, R. K. (2009). Case study research: Design and methods (4th ed.). Sage.

A Political Spectacle: A Policy Analysis of States’ Responses to Federal Assessment Waivers During the Coronavirus (COVID) Pandemic
