ANALYSIS OF CUSTOMER CHURN PREDICTION USING LOGISTIC REGRESSION, $k$-NEAREST NEIGHBORS, DECISION TREE AND RANDOM FOREST ALGORITHMS
Keywords:
telecom industry, machine learning, customer churn prediction, feature selection, logistic regression, k-nearest neighbors (k-NN), decision tree, random forest.
DOI:
https://doi.org/10.17654/0972361725008
Abstract
Customer churn prediction (CCP) and its comprehensive analysis have become prevalent in the global telecom industry over the last five years, driven by advances in machine learning (ML). Artificial intelligence (AI) and ML-based predictive methods are now used in CCP applications to improve customer retention. Predictive CCP streamlines customer management and supports sustainable profit growth. ML models identify informative features in data that contain many types of information. This study analyzes CCP on a telecom company's customer dataset using four ML methods: logistic regression (LR), $k$-nearest neighbors ($k$-NN), decision tree (DT), and random forest (RF). The UCI Iranian telecom churn dataset was used, and the influence of potential factors leading to customer churn was also considered. The tuned RF method gave the best results, with an AUC of 0.9042 and an accuracy of 0.9562. The most important feature affecting customer churn was customer complaints, whereas the least important was tariff plan.
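The abstract reports two evaluation metrics, accuracy and AUC. As a minimal sketch of how these two quantities are computed, the following pure-Python example evaluates them on a small set of toy labels and scores. The data, the function names `accuracy` and `roc_auc`, and the 0.5 decision threshold are all assumptions made for illustration; they are not taken from the study, and the numbers produced are unrelated to the results reported above.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen churner receives a
    higher score than a randomly chosen non-churner, with ties counted
    as one half (the Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels (1 = churned) and hypothetical model scores; not the study's data.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.15, 0.30, 0.45, 0.40, 0.80, 0.25, 0.90]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.4f}  AUC={auc:.4f}")
```

Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank churners relative to non-churners. This is one reason the two metrics can differ for the same model, particularly when churners are a minority of the customers.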
Received: August 27, 2024
Accepted: October 16, 2024
Copyright (c) 2025 Pushpa Publishing House, Prayagraj, India

This work is licensed under a Creative Commons Attribution 4.0 International License.
____________________________
Attribution: Credit Pushpa Publishing House as the original publisher, including title and author(s) if applicable.
No Derivatives: Modifying or creating derivative works not allowed without written permission.
Contact Pushpa Publishing House for more info or permissions.