Academic Journal

Comparing Oversampling Techniques to Handle the Class Imbalance Problem: A Customer Churn Prediction Case Study

Bibliographic Details
Title: Comparing Oversampling Techniques to Handle the Class Imbalance Problem: A Customer Churn Prediction Case Study
Authors: Adnan Amin, Sajid Anwar, Awais Adnan, Muhammad Nawaz, Newton Howard, Junaid Qadir, Ahmad Hawalah, Amir Hussain
Source: IEEE Access, Vol 4, Pp 7940-7957 (2016)
Publisher Information: IEEE, 2016.
Publication Year: 2016
Collection: LCC:Electrical engineering. Electronics. Nuclear engineering
Subject Terms: SMOTE, ADASYN, mega trend diffusion function, class imbalance, rough set, customer churn, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Description: Customer retention is a major issue for service-based organizations, particularly in the telecom industry, where predictive models of customer behavior are an important instrument for retaining customers and inferring their future behavior. However, the performance of predictive models degrades considerably when the real-world data set is highly imbalanced. A data set is called imbalanced if the sample size of one class is much smaller or larger than that of the other classes. Over- and undersampling are the techniques most commonly used to handle the class-imbalance problem (CIP) in various domains. In this paper, we survey six well-known sampling techniques and compare their performance: mega-trend diffusion function (MTDF), synthetic minority oversampling technique (SMOTE), adaptive synthetic sampling approach (ADASYN), couples top-N reverse k-nearest neighbor, majority weighted minority oversampling technique, and immune centroids oversampling technique. The paper also evaluates four rule-generation algorithms (learning from example module, version 2 (LEM2), covering, exhaustive, and genetic algorithms) using publicly available data sets. The empirical results show that MTDF and rule generation based on genetic algorithms achieved the best overall predictive performance among the evaluated oversampling methods and rule-generation algorithms.
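To illustrate the general idea behind the interpolation-based oversampling named in the description, the following is a minimal sketch of a SMOTE-style procedure, not the implementation evaluated in the paper. The function name `smote_sketch`, its parameters, and the toy data are all hypothetical; the sketch assumes numeric features and Euclidean distance.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points (hypothetical helper).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbors.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Toy imbalanced data (assumed for illustration): 4 minority vs 40 majority.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority = [(float(i), float(i % 7)) for i in range(40)]

# Generate enough synthetic points to equalize the two class sizes.
new_points = smote_sketch(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the region already covered by the minority class; the other techniques listed in the description differ in how they choose where to place synthetic samples.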
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2169-3536
Relation: https://ieeexplore.ieee.org/document/7707454/; https://doaj.org/toc/2169-3536
DOI: 10.1109/ACCESS.2016.2619719
Access URL: https://doaj.org/article/b0201a5a2f8c4d6aa4f5edaa7d9f6d8b
Accession Number: edsdoj.b0201a5a2f8c4d6aa4f5edaa7d9f6d8b
Database: Directory of Open Access Journals