Exploring a Generalizable Machine Learned Solution for Early Prediction of Student At-Risk Status

Determining which students are at risk of poorer outcomes -- such as dropping out, failing classes, or declining standardized examination scores -- has become an important area of both research and practice in K-12 education. The models produced by this type of predictive modeling research are increasingly used by high schools in Early Warning Systems (EWS) to identify which students are at risk and to intervene to support better outcomes. It has become common practice to rebuild and validate these detectors district by district, because data semantics and student risk factors differ across districts. As these detectors become more widely used, however, a new challenge emerges: applying them across a broad spectrum of school districts with varying availability of past student data. Some districts have insufficient high-quality past data for building an effective detector. Novel approaches that can address the complex data challenges a new district presents are critical for advancing the field.

Using an ensemble-based algorithm, I develop a modeling approach that can generate a useful model for a previously unseen district. During the ensembling process, my approach, District Similarity Ensemble Extrapolation (DSEE), weights districts that are more similar to the target district more heavily than less similar districts. Using this approach, I can predict student at-risk status effectively for unseen districts across a variety of grade ranges and achieve good predictive performance, but the approach ultimately fails to outperform the previously published Knowles (2015) and Bowers (2012) EWS models proposed for use across districts.
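The similarity-weighted ensembling idea can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not specify the similarity measure, the district descriptors, or the base models, so the inverse-distance weighting, the two-number district profiles, and the constant-output models below are all hypothetical stand-ins chosen only to show how more similar districts could contribute more to the final prediction.

```python
import math

def similarity_weights(target_profile, source_profiles):
    """Return one weight per source district, normalized to sum to 1.

    Assumption: similarity is 1 / (1 + Euclidean distance) between
    district profile vectors. The thesis may use a different measure.
    """
    raw = {
        name: 1.0 / (1.0 + math.dist(target_profile, profile))
        for name, profile in source_profiles.items()
    }
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}

def ensemble_risk(student_features, source_models, weights):
    """Weighted average of the per-district models' risk probabilities."""
    return sum(
        weights[name] * model(student_features)
        for name, model in source_models.items()
    )

# Hypothetical district profiles (e.g. two normalized district-level statistics).
target = (0.5, 0.5)
sources = {"district_a": (0.5, 0.5), "district_b": (0.5, 1.5)}

# Hypothetical per-district detectors; each returns a risk probability.
models = {
    "district_a": lambda features: 0.9,
    "district_b": lambda features: 0.3,
}

weights = similarity_weights(target, sources)
risk = ensemble_risk({"attendance_rate": 0.8}, models, weights)
print(weights, risk)
```

In this toy setup the district whose profile matches the target receives twice the weight of the more distant one, so its detector dominates the ensembled risk estimate.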

Identifier: oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/d8-5scb-n214
Date: January 2021
Creators: Coleman, Chad
Source Sets: Columbia University
Language: English
Detected Language: English
Type: Theses