This article offers researchers, practitioners, and policy makers a powerful framework for building high-quality assessment and accountability programs and for studying their effects. The framework is illustrated through a description and analysis of the assessment and accountability program in the School District of Philadelphia.