April 2016

Forecasting Student Achievement in MOOCs with Natural Language Processing

Carly D. Robinson, Justin Reich, Michael Yeomans, Chris Hulleman, Hunter Gehlbach


Student intention and motivation are among the strongest predictors of persistence and completion in Massive Open Online Courses (MOOCs), but these factors are typically measured through fixed-response items that constrain student expression. We use natural language processing techniques to evaluate whether text analysis of open-response questions about motivation and utility value can improve predictions of persistence and completion over and above information obtained from fixed-response items. We find that a machine learning prediction model can learn from unstructured text to predict which students will complete an online course, and that it performs well out-of-sample compared to simple benchmarks based on a standard array of demographics. These results demonstrate the potential for natural language processing to contribute to predicting student success in MOOCs and other forms of open online learning.


Robinson, C., Yeomans, M., Reich, J., Hulleman, C., & Gehlbach, H. (2016). Forecasting Student Achievement in MOOCs with Natural Language Processing. Proceedings of the 2016 Learning Analytics and Knowledge Conference, Edinburgh, Scotland.
