Abstract

This research aims to predict undergraduate student dropout at a public post-secondary education institution in the southeastern United States. The main data sources are the college's database storage and the National Student Clearinghouse, from which three datasets (DS-57, DS-11 and DS-101) are created. Suitable machine learning classification models are trained on each dataset, and the experiments follow Agile practices. The results indicate that the features most predictive of dropout relate to academic performance and financial aid. Models are evaluated on classification accuracy and F-measure. Random Forest achieved an F-measure of 0.86 and a classification accuracy of 87.04 percent. Further training with ensemble machine learning techniques improved the F-measure to 0.903 and the classification accuracy to 90.8 percent.
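
For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how classification accuracy and F-measure can be computed from confusion-matrix counts. It assumes a binary dropout/non-dropout setting and the balanced F1 variant (beta = 1); the abstract does not state which variant was used. The function names and the example counts are hypothetical illustrations, not values from the study.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f_measure(tp, fp, fn, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only (not taken from the study):
# tp = dropouts correctly flagged, fp = persisters wrongly flagged,
# fn = dropouts missed, tn = persisters correctly cleared.
example_accuracy = accuracy(tp=40, fp=10, fn=10, tn=40)
example_f1 = f_measure(tp=40, fp=10, fn=10)
```

Reporting both metrics is useful when classes are imbalanced, as dropout data often is: accuracy alone can look high for a model that rarely predicts the minority class, while F-measure penalizes missed or spurious dropout predictions.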
