Baker, E. L., Linn, R. L., Herman, J. L., & Koretz, D. (2002). Standards for educational accountability systems (policy brief No. 5). Los Angeles: CRESST.
2.
Black, P., & Wiliam, D. (1998a). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 80(2), 139–148.
3.
Black, P., & Wiliam, D. (1998b). Assessment and classroom learning. Assessment in Education, 5(1), 7–73.
4.
Bransford, J. D., & Schwartz, D. L. (2001). Rethinking transfer: A simple proposal with multiple implications. Review of Research in Education, 24, 61–100.
5.
Brown, A. (1987). Metacognition, executive control, self-regulation, and other more mysterious mechanisms. In F. E. Weinert & R. H. Kluwe (Eds.), Metacognition, motivation, and understanding (pp. 60–108). Hillsdale, NJ: Erlbaum.
6.
California State Board of Education. (2000). Science content standards for California public schools: Kindergarten through grade twelve. Sacramento, CA: CDE Press.
7.
Campbell, D. T., & Stanley, J. C. (1963). Experimental designs for research on teaching. In N. L. Gage (Ed.), Handbook of research on teaching (pp. 171–246). Chicago: Rand McNally.
8.
Campione, J. (1987). Metacognitive components of instructional research with problem learners. In F. E. Weinert & R. H. Kluwe (Eds.), Metacognition, motivation, and understanding (pp. 117–140). Hillsdale, NJ: Erlbaum.
9.
Campione, J. C., & Brown, A. L. (1990). Guided learning and transfer: Implications for approaches to assessment. In N. Frederiksen, R. Glaser, A. Lesgold, & M. Shafto (Eds.), Diagnostic monitoring of skill and knowledge acquisition (pp. 141–172). Hillsdale, NJ: Erlbaum.
10.
Cronbach, L. J. (1971). Test validation. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 443–507). Washington, DC: American Council on Education.
11.
Frederiksen, J. R., & Collins, A. (1989). A systems approach to educational testing. Educational Researcher, 18(9), 27–32.
12.
Frederiksen, J. R., & Collins, A. (1996). Designing an assessment system for the workplace of the future. In L. B. Resnick, J. Wirt, & D. Jenkins (Eds.), Linking school and work: Roles for standards and assessment (pp. 193–221). San Francisco: Jossey-Bass.
13.
Frederiksen, J. R., & White, B. Y. (1997). Cognitive facilitation: A method for promoting reflective collaboration. In Proceedings of the Second International Conference on Computer Support for Collaborative Learning (pp. 53–62). Toronto: University of Toronto.
14.
Kruskal, J. B. (1964). Multidimensional scaling by optimizing goodness of fit to a non-metric hypothesis. Psychometrika, 29, 1–27.
Linn, R. L. (2000). Assessment and accountability. Educational Researcher, 29(2), 4–14.
18.
Mislevy, R. J. (1994). The interplay of evidence and consequences in educational assessment. Psychometrika, 59, 439–483.
19.
Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1(1), 3–62.
20.
Moss, P. A. (1992). Shifting conceptions of validity in educational measurement: Implications for performance assessment. Review of Educational Research, 62(3), 229–258.
21.
Moss, P. (1994). Can there be validity without reliability? Educational Researcher, 23(2), 5–12.
22.
National Research Council. (1996). National Science Education Standards. Washington, DC: National Academy of Sciences.
23.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA: Morgan Kaufmann.
24.
Pellegrino, J., Chudowsky, N., & Glaser, R. (Eds.). (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: National Academy Press.
25.
Shavelson, R., & Webb, N. (1991). Generalizability theory: A primer. London: Sage.
26.
Shepard, L. A. (1993). Evaluating test validity. Review of Research in Education, 19, 405–450.
27.
Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4–14.
28.
Thagard, P. R. (1989). Explanatory coherence. Behavioral and Brain Sciences, 12, 435–502.
29.
White, B., & Frederiksen, J. (1998). Inquiry, modeling, and metacognition: Making science accessible to all students. Cognition and Instruction, 16(1), 3–118.
30.
Wiggins, G. (1993). Assessment worthy of the liberal arts; Authenticity, context, and validity. In Assessing student performance (pp. 34–71; 206–255). San Francisco: Jossey-Bass.