Abedi, J. (2001, December). Language accommodation for large-scale assessment in science: Assessing English language learners (Final Deliverable, Project 2.4 Accommodation). Los Angeles: National Center for Research on Evaluation, Standards, and Student Testing, University of California, Los Angeles.
Abedi, J. (Ed.). (2007). English language proficiency assessment in the nation: Current status and future practice. Davis: University of California, Davis, School of Education.
Abedi, J., Lord, C., Hofstetter, C., & Baker, E. (2000). Impact of accommodation strategies on English language learners’ test performance. Educational Measurement: Issues and Practice, 19(3), 16–26.
Abedi, J., Hofstetter, C. H., & Lord, C. (2004). Assessment accommodations for English language learners: Implications for policy-based empirical research. Review of Educational Research, 74, 1–28.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
Anstrom, K., DiCerbo, P., Katz, A., Millet, J., & Rivera, C. (2010). A review of the literature on academic English: Implications for K-12 English language learners. Arlington, VA: The George Washington University Center for Equity and Excellence in Education.
Bailey, A. L., & Butler, F. (2007). A conceptual framework for academic English language for broad application to education. In A. L. Bailey (Ed.), The language demands of school: Putting academic English to the test (pp. 68–102). New Haven, CT: Yale University Press.
Bailey, A. L., & Carroll, P. E. (2015). Assessment of English language learners in the era of new academic content standards. Review of Research in Education, 39, 253–294.
Bailey, A. L., & Huang, B. H. (2011). Do current English language development/proficiency standards reflect the English needed for success in school? Language Testing, 28, 343–365.
Bennett, R. E. (2010). Cognitively based assessment of, for, and as learning (CBAL): A preliminary theory of action for summative and formative assessment. Measurement: Interdisciplinary Research and Perspectives, 8, 70–91.
Bhola, D. S., Impara, J. C., & Buckendahl, C. W. (2003). Aligning tests with states’ content standards: Methods and issues. Educational Measurement: Issues and Practice, 22(3), 21–29.
Bunch, M. B. (2011). Testing English language learners under No Child Left Behind. Language Testing, 28, 323–341. doi:10.1177/026553221140418
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.
Carroll, P., & Bailey, A. (2014, April). Classification models and English learner redesignation: High performing students left behind? Paper presented at the annual meeting of the National Council on Measurement in Education, Philadelphia, PA.
Cook, H. G., Wilmes, C., Boals, T., & Santos, M. (2008). Issues in the development of annual measurable achievement objectives for WIDA Consortium states (WCER Working Paper No. 2008-2). Madison: Wisconsin Center for Education Research.
Council of Chief State School Officers. (2013). English language proficiency (ELP) standards with correspondences to K-12 English language arts (ELA), mathematics, and science practices, K-12 ELA standards, and 6-12 literacy standards. Washington, DC: Author.
Crocker, L. M., Miller, D., & Franks, E. A. (1989). Quantitative methods for assessing the fit between test and curriculum. Applied Measurement in Education, 2, 179–194.
Crotts, K., & Sireci, S. G. (2014, April). Evaluating computer-based test accommodations for English learners. Paper presented at the annual meeting of the National Council on Measurement in Education, Philadelphia, PA.
Currie, M., & Chiramanee, T. (2010). The effect of the multiple-choice item format on the measurement of knowledge of language structure. Language Testing, 27, 471–491. doi:10.1177/0265532209356790
De la Torre, J., Song, H., & Hong, Y. (2011). A comparison of four methods of IRT subscoring. Applied Psychological Measurement, 35, 296–316. doi:10.1177/0146621610378653
Duncan, G. D., del Rio Parant, L., Chen, W.-H., Ferrara, S., Johnson, E., Oppler, S., & Shieh, Y.-Y. (2005). Study of a dual-language test booklet in eighth-grade mathematics. Applied Measurement in Education, 18, 129–161.
Educational Testing Service. (2009). Guidelines for the assessment of English learners. Princeton, NJ: Author.
Forte, E. (2010). Examining the assumptions underlying the NCLB federal accountability policy on School Improvement. Educational Psychologist, 45, 76–88.
Forte, E., Faulkner-Bond, M., Waring, S., Kuti, L., & Fenner, D. S. (2010). The administrator’s guide to federal programs for English learners. Washington, DC: Thompson.
Geisinger, K. F. (2000). Psychological testing at the end of the millennium: A brief historical review. Professional Psychology: Research and Practice, 31, 117–118.
Haladyna, T. M., & Rodriguez, M. C. (2014). Developing and validating test items. New York, NY: Routledge.
Hauger, J. B., & Sireci, S. G. (2008). Detecting differential item functioning across examinees tested in their dominant language and examinees tested in a second language. International Journal of Testing, 8, 237–250.
In’nami, Y., & Koizumi, R. (2009). A meta-analysis of test format effects on reading and listening test performance: Focus on multiple-choice and open-ended formats. Language Testing, 26, 219–244. doi:10.1177/0265532208101006
In’nami, Y., & Koizumi, R. (2012). Factor structure of the Revised TOEIC® Test: A multiple-sample analysis. Language Testing, 29, 131–152.
International Test Commission. (2010). Guidelines for translating and adapting tests. Retrieved from http://www.intestcom.org. Accessed October 12, 2014.
Kachchaf, R., & Solano-Flores, G. (2012). Rater language background as a source of measurement error in the testing of English language learners. Applied Measurement in Education, 25, 162–177.
Kane, M. (1994). Validating the performance standards associated with passing scores. Review of Educational Research, 64, 425–461.
Kane, M. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Washington, DC: American Council on Education/Praeger.
Kane, M. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50, 1–73.
Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527–535.
Kieffer, M. J., Lesaux, N. K., Rivera, M., & Francis, D. J. (2009). Accommodations for English language learners taking large-scale assessments: A meta-analysis on effectiveness and validity. Review of Educational Research, 79, 1168–1201.
Kim, J., & Herman, J. L. (2009). A three-state study of English learner progress. Educational Assessment, 14, 212–231. doi:10.1080/10627190903422831
Kim, J., & Herman, J. L. (2012). Understanding patterns and precursors of ELL success subsequent to reclassification (CRESST Report No. 818). Los Angeles, CA: National Center for Research on Evaluation, Standards, and Student Testing.
Koenig, J. A., & Bachman, L. F. (2004). Keeping score for all: The effects of inclusion and accommodation policies on large-scale educational assessments. Washington, DC: National Academies Press.
Kopriva, R. J., & Hedgspeth, C. (2005). Technical manual, selection taxonomy for English language learner accommodation (STELLA) decision-making systems. College Park: University of Maryland, C-SAVE.
Kopriva, R. J., Emick, J. E., Hipolito-Delgado, C. P., & Cameron, C. A. (2007). Do proper accommodation assignments make a difference? Examining the impact of improved decision making on scores of English language learners. Educational Measurement: Issues and Practice, 26(3), 11–20.
Kuriakose, A. (2011, January 1). The factor structure of the English language development assessment: A confirmatory factor analysis (Doctoral dissertation). Available from ProQuest Dissertations and Theses database. (ED535975)
Lakin, J. M., & Young, J. W. (2013). Evaluating growth for ELL students: Implications for accountability policies. Educational Measurement: Issues and Practice, 32(3), 11–26. doi:10.1111/emip.12012
Lane, S. (2014). Validity evidence based on testing consequences. Psicothema, 26, 127–135. doi:10.7334/psicothema2013.258
Lane, S., & Leventhal, B. (2015). Psychometric challenges in assessing English language learners and students with disabilities. Review of Research in Education, 39, 165–214.
Lee, W. (2008). Classification consistency and accuracy for complex assessments using item response theory (CASMA Research Report No. 27). Iowa City: University of Iowa.
Li, H., & Suen, H. K. (2012). The effects of test accommodations for English learners: A meta-analysis. Applied Measurement in Education, 25, 327–346.
Linquanti, R. (2001). The redesignation dilemma: Challenges and choices in fostering meaningful accountability for English learners (Policy Report No. 2001-1). Santa Barbara: University of California Linguistic Minority Research Institute.
Livingston, S. A., & Lewis, C. (1995). Estimating the consistency and accuracy of classifications based on test scores. Journal of Educational Measurement, 32, 179–197.
Luecht, R. M., & Ackerman, T. (2007). Oregon English Language Proficiency Examination (EPLA) dimensionality analysis for blended-domain locator blocks. Greensboro, NC: Center for Assessment Research and Technology.
Martiniello, M. (2008). Language and the performance of English-language learners in math word problems. Harvard Educational Review, 78, 333–368.
Martone, A., & Sireci, S. G. (2009). Evaluating alignment between curriculum, assessments, and instruction. Review of Educational Research, 79(4), 1332–1361.
McNamara, T., & Knoch, U. (2012). The Rasch wars: The emergence of Rasch measurement in language testing. Language Testing, 29, 555–576. doi:10.1177/0265532211430367
Messick, S. (1989). Validity. In R. Linn (Ed.), Educational measurement (3rd ed., pp. 13–100). Washington, DC: American Council on Education.
Mosher, F. A. (2011). The role of learning progressions in standards-based education reform (Policy Brief No. RB-52). Philadelphia, PA: Consortium for Policy Research in Education.
National Research Council. (2011). Allocating federal funds for state programs for English language learners. Washington, DC: National Academies Press. Retrieved from http://www.nap.edu/openbook.php?record_id=13090
O’Conner, R., Abedi, J., & Tung, S. (2012a). A descriptive analysis of enrollment and achievement among English language learner students in Pennsylvania: Summary (Issues & Answers, REL 2012-No. 127). Retrieved from http://files.eric.ed.gov/fulltext/ED531429.pdf
O’Conner, R., Abedi, J., & Tung, S. (2012b). A descriptive analysis of enrollment and achievement among limited English proficient students in New Jersey (Issues & Answers, REL 2012-No. 108). Retrieved from http://files.eric.ed.gov/fulltext/ED531432.pdf
Padilla, J.-L., & Benitez, I. (2014). Validity evidence based on response processes. Psicothema, 26, 110–117.
Parker, C., Louie, J., & O’Dwyer, L. (2009). New measures of English language proficiency and their relationship to performance on large-scale content assessments (Issues & Answers, REL 2009 No. 066). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Northeast and Islands. Retrieved from http://ies.ed.gov/ncee/edlabs
Partnership for the Assessment of Readiness for College and Career. (2013). Accessibility features and accommodations manual. Washington, DC: Author.
Pennock-Roman, M. (2002). Relative effects of English proficiency on general admissions tests versus subject tests. Research in Higher Education, 43, 601–623.
Pennock-Roman, M., & Rivera, C. (2011). Mean effects of test accommodations for ELLs and non-ELLs: A meta-analysis of experimental studies. Educational Measurement: Issues and Practice, 30(3), 10–28.
Ragan, A., & Lesaux, N. (2006). Federal, state, and district level English language learner program entry and exit requirements: Effects on the education of language minority learners. Education Policy Analysis Archives, 14(20), 1–32.
Ramsey, P. A. (1993). Sensitivity review: The ETS experience as a case study. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 367–388). Hillsdale, NJ: Erlbaum.
Rivera, C., Collum, E., Willner, L. S., & Sia, J. K., Jr. (2006). An analysis of state assessment policies addressing the accommodation of English language learners. In C. Rivera & E. Collum (Eds.), A national review of state assessment policy and practice for English language learners (pp. 1–173). Mahwah, NJ: Lawrence Erlbaum.
Robinson, J. P. (2011). Evaluating criteria for English learner reclassification: A causal-effects approach using a binding-score regression discontinuity design with instrumental variables. Educational Evaluation and Policy Analysis, 33, 267–292. doi:10.3102/0162373711407912
Römhild, A., Kenyon, D., & MacGregor, D. (2011). Exploring domain-general and domain-specific linguistic knowledge in the assessment of academic English language proficiency. Language Assessment Quarterly, 8, 213–228. doi:10.1080/15434303.2011.558146
Rudner, L. M. (2001). Computing the expected proportions of misclassified examinees. Practical Assessment, Research & Evaluation, 7(14).
Rudner, L. M. (2004, April). Expected classification accuracy. Paper presented at the annual meeting of the National Council on Measurement in Education, San Diego, CA.
Saunders, W. M., & Goldenberg, C. (2010). Research to guide English language development instruction. In Improving education for English learners: Research-based approaches (pp. 20–81). Sacramento: California Department of Education.
Sawaki, Y., Stricker, L. J., & Oranje, A. H. (2009). Factor structure of the TOEFL Internet-based test. Language Testing, 26, 5–30.
Schleppegrell, M. J., & O’Hallaron, C. L. (2011). Teaching academic language in L2 secondary settings. Annual Review of Applied Linguistics, 31, 3–18. doi:10.1017/S0267190511000067
Shepard, L., Taylor, G., & Betebenner, D. (1998). Inclusion of limited-English-proficient students in Rhode Island’s grade 4 mathematics performance assessment. Los Angeles: University of California, Center for the Study of Evaluation/National Center for Research on Evaluation, Standards, and Student Testing.
Sireci, S. G. (1998). Gathering and analyzing content validity data. Educational Assessment, 5, 299–321.
Sireci, S. G. (2013). Agreeing on validity arguments. Journal of Educational Measurement, 50, 99–104.
Sireci, S. G., Li, S., & Scarpati, S. (2003). The effects of test accommodation on test performance: A review of the literature. Commissioned paper by the National Academy of Sciences/National Research Council’s Board on Testing and Assessment. Washington, DC: National Research Council.
Sireci, S. G., & Mullane, L. A. (1994). Evaluating test fairness in licensure testing: The sensitivity review process. CLEAR Exam Review, 5(2), 22–28.
Sireci, S. G., Rios, J. A., & Powers, S. (in press). Comparing test scores from tests administered in different languages. In N. Dorans & L. Cook (Eds.), Fairness. New York, NY: Routledge.
Sireci, S. G., Wells, C., & Hu, H. (2014, April). Using internal structure validity evidence to evaluate test accommodations. Paper presented at the annual meeting of the National Council on Measurement in Education, Philadelphia, PA.
Smarter Balanced Assessment Consortium. (2013). Usability, accessibility, and accommodations guidelines. San Francisco, CA: WestEd.
Solano-Flores, G., & Li, M. (2009). Language variation and score variation in the testing of English language learners, native Spanish speakers. Educational Assessment, 14, 180–194.
Solano-Flores, G., Trumbull, E., & Nelson-Barber, S. (2002). Concurrent development of dual language assessments: An alternative to translating tests for linguistic minorities. International Journal of Testing, 2, 107–129.
Stansfield, C. W. (2003). Test translation and adaptation in public education in the USA. Language Testing, 20, 188–206.
Swanson, C. B. (2009). Perspectives on a population: English-language learners in American Schools. Bethesda, MD: Editorial Projects in Education Research Center.
Thompson, S., Blount, A., & Thurlow, M. (2002). A summary of research on the effects of test accommodations: 1999 through 2001 (Technical Report No. 34). Minneapolis: University of Minnesota, National Center on Educational Outcomes. Retrieved from http://education.umn.edu/NCEO/OnlinePubs/Technical34.htm
U.S. Department of Education. (2012). National evaluation of Title III implementation supplemental report: Exploring approaches to setting English language proficiency performance criteria and monitoring English learner progress. Washington, DC: Author.
U.S. Department of Education, Office of Elementary and Secondary Education. (2009). Standards and assessments peer review guidance: Information and examples for meeting the requirements of the No Child Left Behind Act of 2001 (Third revision). Washington, DC: Author.
WIDA Consortium. (2012). 2012 Amplification of the English language development standards, Kindergarten-Grade 12. Madison: Board of Regents of the University of Wisconsin System.
Willner, L., Rivera, C., & Acosta, B. (2008). Descriptive study of state assessment policies for accommodating English language learners. Arlington, VA: George Washington University Center for Equity and Excellence in Education.
Wilson, M., & Moore, S. (2011). Building out a measurement model to incorporate complexities of testing in the language domain. Language Testing, 28, 441–462. doi:10.1177/0265532210394142
Wolf, M. K., Kim, J., & Kao, J. (2012). The effects of glossary and read-aloud accommodations on English language learners’ performance on a mathematics assessment. Applied Measurement in Education, 25, 347–374.
Wolf, M. K., & Leon, S. (2009). An investigation of the language demands in content assessments for English language learners. Educational Assessment, 14, 139–159. doi:10.1080/10627190903425883
Wolf, M. K., Wang, Y., & Holtzman, S. (2011, April). Investigating the constructs of English language proficiency assessments and ELLs’ performance on the assessments. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.
Working Group on ELL Policy. (2010). Improving educational outcomes for English language learners: Recommendations for the reauthorization of the Elementary and Secondary Education Act. Retrieved from http://ellpolicy.org/wp-content/uploads/ESEAFinal.pdf
Zhang, B. (2010). Assessing the accuracy and consistency of language proficiency classification under competing measurement models. Language Testing, 27, 119–140.
Zwick, R., & Schlemer, L. (2004). SAT validity for linguistic minorities at the University of California, Santa Barbara. Educational Measurement: Issues and Practice, 23, 6–16.