Abstract
This study examined the effectiveness of the three-parameter IRT model in vertically equating five overlapping levels of a mathematics computation test. One to four test levels were administered within intact classrooms to randomly equivalent groups of third through eighth grade students. Test characteristic curves were derived for each grade/test level combination. It was generally found that an examinee would receive a higher ability estimate if the test level administered had been calibrated on less able examinees. Practical implications for "out-of-level" and adaptive testing are discussed.
