Abrams, L. M., Pedulla, J. J., and Madaus, G. F. (2003). Views from the classroom: Teachers’ opinions of statewide testing programs. Theory Into Practice, 42(1), 8–29.
Amrein, A. L., and Berliner, D. C. (2002a, March 28). High-stakes testing, uncertainty, and student learning. Education Policy Analysis Archives, 10(18). Retrieved September 12, 2006, from http://epaa.asu.edu/epaa/v10n18/.
Amrein, A. L., and Berliner, D. C. (2002b, December). An analysis of some unintended and negative consequences of high-stakes testing. Education Policy Research Unit, Arizona State University, Tempe. Retrieved September 6, 2006, from http://www.asu.edu/educ/epsl/EPRU/documents/EPSL-0211-125-EPRU.pdf.
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Anderson, J. R. (1990). The adaptive character of thought. Hillsdale, NJ: Erlbaum.
Bazerman, C. (1988). Shaping written knowledge: The genre and activity of the experimental article in science. Madison: University of Wisconsin Press.
Black, P., and Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5(1), 7–73.
Bransford, J., Brown, A., and Cocking, R. (Eds.). (1999). How people learn: Brain, mind, experience and school. Washington, DC: National Academy Press.
California Assessment Policy Committee (1991). A new student assessment system for California schools (Executive Summary Report). Sacramento, CA: Office of the Superintendent of Instruction.
Chase, W. G., and Simon, H. A. (1973). The mind's eye in chess. In Chase, W. G. (Ed.), Visual information processing (pp. 215–281). New York: Academic Press.
Chi, M. T. H., Feltovich, P. J., and Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5, 121–152.
Coburn, C. E., Honig, M. I., and Stein, M. K. (in press). What is the evidence on districts’ use of evidence? In Bransford, J., Gomez, L., Vye, N., and Lam, D. (Eds.), Research and practice: Towards a reconciliation. Cambridge, MA: Harvard Educational Press.
Cronbach, L. J. (1957). The two disciplines of scientific psychology. American Psychologist, 12, 671–684.
Duschl, R. (2003). Assessment of scientific inquiry. In Atkin, J. M., and Coffey, J. (Eds.), Everyday assessment in the science classroom (pp. 41–59). Arlington, VA: NSTA Press.
Duschl, R., and Gitomer, D. (1997). Strategies and challenges to changing the focus of assessment and instruction in science classrooms. Educational Assessment, 4(1), 37–73.
Duschl, R., and Grandy, R. (Eds.). (2007). Establishing a consensus agenda for K-12 science inquiry. The Netherlands: Sense Publishers.
Duschl, R., Schweingruber, H., and Shouse, A. (Eds.). (2006). Taking science to school: Learning and teaching science in grades K-8. Washington, DC: National Academy Press.
Erduran, S. (1999). Merging curriculum design with chemical epistemology: A case of teaching and learning chemistry through modeling. Unpublished doctoral dissertation, Vanderbilt University, Nashville, TN.
Foltz, P. W., Laham, D., and Landauer, T. K. (1999). The intelligent essay assessor: Applications to educational technology. Interactive Multimedia Electronic Journal of Computer-Enhanced Learning, 1(2). Retrieved January 8, 2006, from imej.wfu.edu/articles/1999/2/04/index.asp.
Frederiksen, J. R., and Collins, A. M. (1989). A systems approach to educational testing. Educational Researcher, 18(9), 27–32.
Gearhart, M., and Herman, J. L. (1998). Portfolio assessment: Whose work is it? Issues in the use of classroom assignments for accountability. Educational Assessment, 5(1), 41–55.
Gee, J. (1999). An introduction to discourse analysis: Theory and method. New York: Routledge.
Gitomer, D. H. (1991). The art of accountability. Teaching Thinking and Problem Solving, 13, 1–9.
Gitomer, D. H. (in press). Policy, practice and next steps for educational research. In Duschl, R., and Grandy, R. (Eds.), Establishing a consensus agenda for K-12 science inquiry. The Netherlands: Sense Publishers.
Gitomer, D. H., and Duschl, R. (1998). Emerging issues and practices in science assessment. In Fraser, B., and Tobin, K. (Eds.), International handbook of science education (pp. 791–810). Dordrecht, The Netherlands: Kluwer Academic Publishers.
Glaser, R. (1976). Components of a psychology of instruction: Toward a science of design. Review of Educational Research, 46, 1–24.
Glaser, R. (1991). The maturing of the relationship between the science of learning and cognition and educational practice. Learning and Instruction, 1(2), 129–144.
Glaser, R. (1992). Expert knowledge and processes of thinking. In Halpern, D. F. (Ed.), Enhancing thinking skills in the sciences and mathematics (pp. 63–75). Hillsdale, NJ: Lawrence Erlbaum Associates.
Glaser, R. (1997). Assessment and education: Access and achievement. CSE Technical Report 435. Los Angeles: National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
Glaser, R., and Silver, E. (1994). Assessment, testing, and instruction: Retrospect and prospect. In Darling-Hammond, L. (Ed.), Review of research in education (Vol. 20, pp. 393–419). Washington, DC: American Educational Research Association.
Greeno, J. G. (2002). Students with competence, authority, and accountability: Affording intellective identities in classrooms. New York: College Board.
Honig, M., and Hatch, T. (2004). Crafting coherence: How schools strategically manage multiple, external demands. Educational Researcher, 33(8), 16–30.
Kesidou, S., and Roseman, J. E. (2002). How well do middle school science programs measure up? Findings from Project 2061's curriculum review. Journal of Research in Science Teaching, 39(6), 522–549.
Koretz, D., Stecher, B., and Deibert, E. (1992). The reliability of scores from the 1992 Vermont portfolio assessment program. Los Angeles, CA: RAND Institute on Education and Training.
Koretz, D., Stecher, B., Klein, S., and McCaffrey, D. (1994). The Vermont portfolio assessment program: Findings and implications. Educational Measurement: Issues and Practice, 13(3), 5–16.
Lave, J., and Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge: Cambridge University Press.
Leacock, C., and Chodorow, M. (2003). C-rater: Automated scoring of short answer questions. Computers and the Humanities, 37(4), 389–405.
LeMahieu, P. G., Gitomer, D. H., and Eresh, J. T. (1995). Large-scale portfolio assessment: Difficult but not impossible. Educational Measurement: Issues and Practice, 14, 11–28.
Magone, M., Cai, J., Silver, E. A., and Wang, N. (1994). Validating the cognitive complexity and content quality of a mathematics performance assessment. International Journal of Educational Research, 12(3), 317–340.
McDonald, J. (1992). Teaching: Making sense of an uncertain craft. New York: Teachers College Press.
Messick, S. (1989). Validity. In Linn, R. L. (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: Macmillan.
Mislevy, R. J. (1995). What can we learn from international assessments? Educational Evaluation and Policy Analysis, 17(4), 419–437.
Mislevy, R. J. (2005). Issues of structure and issues of scale in assessment from a situative/socio-cultural perspective (CSE Report 668). Los Angeles: National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
Mislevy, R. J. (2006). Cognitive psychology and educational assessment. In Brennan, R. L. (Ed.), Educational measurement (4th ed., pp. 257–305). Westport, CT: American Council on Education/Praeger.
Mislevy, R. J., and Haertel, G. (2006). Implications of evidence-centered design for educational testing (Draft PADI Technical Report 17). Menlo Park, CA: SRI International.
Mislevy, R. J., Hamel, L., Fried, R., Gaffney, T., Haertel, G., and Hafter, A. (2003). Design patterns for assessing science inquiry. Menlo Park, CA: SRI International.
Mislevy, R. J., and Riconscente, M. M. (2005). Evidence-centered assessment design: Layers, structures, and terminology (PADI Technical Report 9). Menlo Park, CA: SRI International.
Mislevy, R. J., Steinberg, L. S., and Almond, R. G. (2002). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1, 3–67.
National Assessment Governing Board (NAGB) (1996). Science framework for the 1996 and 2000 National Assessment of Educational Progress. U.S. Department of Education. Washington, DC: The Department. Retrieved October 22, 2006, from http://www.nagb.org/pubs/96-2000science/toc.html.
National Research Council (1996). National science education standards. Washington, DC: National Academy Press.
National Research Council (2000). Inquiry and the national science education standards: A guide for teaching and learning. Washington, DC: National Academy Press.
National Research Council (2002). Learning and understanding: Improving advanced study of mathematics and science in U.S. high schools. Committee on Programs for Advanced Study of Mathematics and Science in American High Schools. Gollub, J. P., Bertenthal, M. W., Labov, J. B., and Curtis, P. C. (Eds.). Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.
New Standards Project (1997). New standards performance standards (Vol. 1, Elementary School; Vol. 2, Middle School; Vol. 3, High School). Washington, DC: National Center on Education and the Economy and the University of Pittsburgh.
Nuttall, D. L., and Stobart, G. (1994). National curriculum assessment in the U.K. Educational Measurement: Issues and Practice, 13(2), 24–27.
Office of Technology Assessment (1992). Testing in American schools: Asking the right questions. OTA-SET-519. Washington, DC: U.S. Government Printing Office.
Pellegrino, J. W., Baxter, G. P., and Glaser, R. (1999). Addressing the “two disciplines” problem: Linking theories of cognition and learning with assessment and instructional practice. In Iran-Nejad, A., and Pearson, P. D. (Eds.), Review of research in education (Vol. 24, pp. 307–353). Washington, DC: American Educational Research Association.
Pellegrino, J. W., Chudowsky, N., and Glaser, R. (Eds.). (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: National Academy Press.
Pine, J., Aschbacher, P., Roth, E., Jones, M., McPhee, C., and Martin, C. (2006). Fifth graders’ science inquiry abilities: A comparative study of students in hands-on and textbook curricula. Journal of Research in Science Teaching, 43(5), 467–484.
Popham, W. J., Keller, T., Moulding, B., Pellegrino, J., and Sandifer, P. (2005). Instructionally supportive accountability tests in science: A viable assessment option? Measurement: Interdisciplinary Research and Perspectives, 3(3), 121–179.
Queensland School Curriculum Council (2002). An outcomes approach to assessment and reporting. Queensland, Australia: Author.
Quintana, C., Reiser, B. J., Davis, E. A., Krajcik, J., Fretz, E., and Duncan, R. G. (2004). A scaffolding design framework for software to support science inquiry. Journal of the Learning Sciences, 13(3), 337–386.
Resnick, L. B., and Resnick, D. P. (1991). Assessing the thinking curriculum: New tools for educational reform. In Gifford, B. R., and O'Connor, M. C. (Eds.), Changing assessment: Alternative views of aptitude, achievement and instruction (pp. 37–75). Boston: Kluwer.
Rogoff, B. (1990). Apprenticeship in thinking: Cognitive development in social context. New York: Oxford University Press.
Rosebery, A., Warren, B., and Conant, F. (1992). Appropriating scientific discourse: Findings from language minority classrooms. The Journal of the Learning Sciences, 2, 61–94.
Shavelson, R., Baxter, G., and Pine, J. (1992). Performance assessment: Political rhetoric and measurement reality. Educational Researcher, 21, 22–27.
Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4–14.
Shermis, M. D., and Burstein, J. (2003). Automated essay scoring: A cross-disciplinary perspective. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Smith, C., Wiser, M., Anderson, C., and Krajcik, J. (2006). Implications of research on children's learning for standards and assessment: A proposed learning progression for matter and the atomic-molecular theory. Measurement: Interdisciplinary Research and Perspectives, 4(1&2), 1–98.
Spillane, J. (2004). Standards deviation: How local schools misunderstand policy. Cambridge, MA: Harvard University Press.
Stiggins, R. J. (2002). Assessment crisis: The absence of assessment for learning. Phi Delta Kappan, 83(10), 758–765.
Vygotsky, L. S. (1978). Mind in society. Cambridge, MA: Harvard University Press.
Wainer, H., and Thissen, D. (1993). Combining multiple-choice and constructed-response test scores: Toward a Marxist theory of test construction. Applied Measurement in Education, 6(2), 103–118.
Webb, N. L. (1997). Criteria for alignment of expectations and assessments in mathematics and science education. National Institute for Science Education and Council of Chief State School Officers Research Monograph No. 6. Washington, DC: Council of Chief State School Officers.
Webb, N. L. (1999). Alignment of science and mathematics standards and assessments in four states (Research monograph No. 18). Madison: University of Wisconsin-Madison, National Institute for Science Education.
Wheeler, P. H. (1992). Relative costs of various types of assessments. Livermore, CA: EREAPA Associates (ERIC Document No. ED 373074).
Williamson, D. M., Mislevy, R. J., and Bejar, I. (Eds.). (2006). Automated scoring of complex tasks in computer-based testing. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Wilson, M. (Ed.). (2004). Towards coherence between classroom assessment and accountability. The one hundred and third yearbook of the National Society for the Study of Education, Part II. Chicago: National Society for the Study of Education.
Wilson, M., and Bertenthal, M. (Eds.). (2005). Systems for state science assessment. Washington, DC: National Academies Press.
Wolf, D., Bixby, J., Glenn, J., and Gardner, H. (1991). To use their minds well: Investigating new forms of student assessment. In Grant, G. (Ed.), Review of research in education (Vol. 17, pp. 31–74). Washington, DC: American Educational Research Association.