
Abstract

This study describes psychometric investigations of a developmental assessment in mathematics composed of 19 performance tasks. Techniques from classical and many-faceted Rasch approaches were combined to analyze field test data from a mixed-age sample (n = 110). Descriptive statistics on scores from the overall scale and subdomains indicated increased proficiency with age. Convergent validity coefficients of scores with scaled scores of the Stanford Achievement Test mathematics battery ranged from .22 to .57; internal consistency reliabilities of the total scores on proficiency and independence were .90 and .92, respectively; and median interrater reliability was .80. Classical item statistics and Rasch logit difficulties of tasks suggested an ordered scale structure, although eight tasks did not fit Rasch criteria. The original and calibrated task orderings were consistent at the extreme ends of the scale. Findings point to directions for future improvement of the scale and rater training programs.



Validation of Scores/Measures from a K-2 Developmental Assessment in Mathematics
