Abstract
This study focuses on improving the predictive power of early warning systems (EWSs) to decrease chronic absenteeism in early childhood. Using a demographically diverse sample of students followed from PreK to third grade in Boston Public Schools (N = 6,698), we demonstrate how and why two modern machine learning (ML) techniques, the Synthetic Minority Oversampling Technique (SMOTE) and Extreme Gradient Boosting (XGBoost), can enhance EWS accuracy. The best-performing XGBoost model with SMOTE achieved a 54-percentage-point improvement in accuracy, measured as recall, over the logistic regression model closest to those used in current EWSs. It more accurately detected students who would become chronically absent in third grade and outperformed the other ML approaches evaluated. Notably, models excluding student demographic information maintained comparable predictive accuracy.
