Churn prediction is critical for businesses seeking to retain valuable customers. This paper presents a comprehensive study of churn prediction in the telecom sector using 15 approaches, including popular algorithms such as Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, and AdaBoost.
The study is divided into three sets of experiments, each taking a different approach to building the churn prediction model. In the first set, the model is built on the original training set. In the second, the training set is oversampled to address class imbalance. In the third, oversampling is combined with recursive feature selection to further improve the model's performance.
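As an illustration of the oversampling step, the sketch below balances a training set by duplicating minority-class rows at random. It is a minimal example under stated assumptions: the abstract does not say which oversampling technique was used, so plain random oversampling with replacement stands in here, and the function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows belonging to this class, used as the sampling pool.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy training set with made-up values: 1 = churned, 0 = stayed.
X_train = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y_train = [0, 0, 0, 0, 1, 1]

X_bal, y_bal = random_oversample(X_train, y_train)
print(Counter(y_bal))
```

Only the training set is resampled; a held-out test set would keep its original class distribution so that the evaluation reflects real-world imbalance.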
The results demonstrate that the Adaptive Boosting (AdaBoost) classifier, trained with oversampling and recursive feature selection, outperforms the other 14 techniques. It ranks highest on all three evaluation metrics: recall (0.841), F1-score (0.655), and ROC AUC (0.793). These results indicate that the proposed approach predicts churn effectively and provides valuable insights into customer behavior.
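For readers unfamiliar with the three evaluation metrics, the sketch below computes recall, F1-score, and ROC AUC from their standard definitions. The labels, scores, and the 0.5 decision threshold are toy assumptions chosen for illustration, not data from the thesis, and the function names are hypothetical.

```python
def recall_f1(y_true, y_pred):
    """Return (recall, F1) for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random
    negative, counting ties as one half. Assumes both classes occur."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = churned) and predicted churn scores, made up for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

recall, f1 = recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(round(recall, 3), round(f1, 3), round(auc, 3))
```

Recall measures the share of actual churners the model catches, which is often the priority in churn settings where missing a departing customer is costlier than a false alarm.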
Identifier | oai:union.ndltd.org:CALPOLY/oai:digitalcommons.calpoly.edu:theses-4342 |
Date | 01 June 2023 |
Creators | Tran, Long Dinh |
Publisher | DigitalCommons@CalPoly |
Source Sets | California Polytechnic State University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Master's Theses |