Using Data Science and Predictive Analytics to Understand 4-Year University Student Churn

The purpose of this study was to discover factors about first-time freshmen who began at one of the six 4-year universities in the former Tennessee Board of Regents (TBR) system, transferred to any other institution after their first year, and graduated with a degree or certificate. These factors would be used in predictive models to identify such students before their initial departure. Thirty-four variables about the students and the institutions they attended and graduated from were analyzed with principal component analysis to examine the factors involved in their decisions. A subset of 18 variables describing these students in their first semester was then analyzed with principal component analysis to produce a set of 4 factors for use in 5 predictive models. The 4 factors of students who transferred and graduated elsewhere were “Institutional Characteristics,” “Institution’s Focus on Academics,” “Student Aptitude,” and “Student Community.” These 4 factors were combined with the additional demographic variables of gender, race, residency, and initial institution to form the final dataset used in predictive modeling. The predictive models were a logistic regression, a decision tree, a random forest, an artificial neural network, and a support vector machine. All models had predictive power beyond that of random chance. The logistic regression and support vector machine models had the most predictive power, followed in order by the artificial neural network, random forest, and decision tree models.
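The workflow described in the abstract (reduce 18 first-semester variables to 4 principal components, append four demographic variables, then compare five classifiers) can be illustrated with a minimal sketch. This is not the author's code or data: the dataset below is synthetic, scikit-learn is an assumed tool choice, and the column layout, hyperparameters, 5-fold cross-validation, and ROC AUC metric are all assumptions made for illustration.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 18 numeric variables driven by
# 4 latent factors, plus 4 integer-coded demographic columns.
rng = np.random.default_rng(0)
n = 400
latent = rng.normal(size=(n, 4))
loadings = rng.normal(size=(4, 18))
X_num = latent @ loadings + 0.5 * rng.normal(size=(n, 18))
X_cat = rng.integers(0, 3, size=(n, 4))
X = np.hstack([X_num, X_cat])
# Hypothetical binary outcome: 1 = transferred and graduated elsewhere.
y = (latent[:, 0] + rng.normal(size=n) > 0).astype(int)

numeric_cols = list(range(18))
categorical_cols = list(range(18, 22))


def build_pipeline(model):
    """Standardize and reduce the numeric block to 4 components,
    one-hot encode the demographics, then fit the given model."""
    preprocess = ColumnTransformer(
        [
            ("factors", make_pipeline(StandardScaler(), PCA(n_components=4)),
             numeric_cols),
            ("demographics", OneHotEncoder(handle_unknown="ignore"),
             categorical_cols),
        ],
        sparse_threshold=0.0,  # always return a dense array
    )
    return Pipeline([("preprocess", preprocess), ("model", model)])


# The five model families named in the abstract; settings are assumptions.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "neural_network": MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                                    random_state=0),
    "support_vector_machine": SVC(),
}

# Mean cross-validated ROC AUC per model; 0.5 corresponds to random chance.
scores = {
    name: cross_val_score(build_pipeline(model), X, y, cv=5,
                          scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, auc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Placing the scaler and PCA inside the pipeline means the components are re-estimated on each training fold, so the held-out fold does not influence the factors. Comparing each mean AUC against 0.5 mirrors the abstract's "beyond random chance" criterion, although the study's actual evaluation metric is not stated here.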

Identifier: oai:union.ndltd.org:ETSU/oai:dc.etsu.edu:etd-4800
Date: 01 May 2018
Creators: Whitlock, Joshua Lee
Publisher: Digital Commons @ East Tennessee State University
Source Sets: East Tennessee State University
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: Electronic Theses and Dissertations
Rights: Copyright by the authors.
