Preparing teachers for the classrooms of the future through research and design.


April 2016

Forecasting Student Achievement in MOOCs with Natural Language Processing

Carly D. Robinson, Justin Reich, Michael Yeomans, Chris Hulleman, Hunter Gehlbach


Student intention and motivation are among the strongest predictors of persistence and completion in Massive Open Online Courses (MOOCs), but these factors are typically measured through fixed-response items that constrain student expression. We use natural language processing techniques to evaluate whether text analysis of open-response questions about motivation and utility value can offer additional capacity to predict persistence and completion over and above information obtained from fixed-response items. Compared to simple benchmarks based on demographics, we find that a machine learning prediction model can learn from unstructured text to predict which students will complete an online course. We show that the model performs well out-of-sample, compared to a standard array of demographics. These results demonstrate the potential for natural language processing to contribute to predicting student success in MOOCs and other forms of open online learning.


Robinson, C., Yeomans, M., Reich, J., Hulleman, C., & Gehlbach, H. (2016). Forecasting Student Achievement in MOOCs with Natural Language Processing. Proceedings of the 2016 Learning Analytics and Knowledge Conference, Edinburgh, Scotland.
