Academic Journal

TOWARDS A UNIFIED FRAMEWORK IN TREATING IMBALANCED DATASETS. APPLICATION TO THE PULSAR CANDIDATE SELECTION PROBLEM BASED ON THE HTRU2 DATASET. PART II: OVERSAMPLING VERSUS UNDERSAMPLING - A CRITICAL COMPARISON.

Bibliographic Details
Title: TOWARDS A UNIFIED FRAMEWORK IN TREATING IMBALANCED DATASETS. APPLICATION TO THE PULSAR CANDIDATE SELECTION PROBLEM BASED ON THE HTRU2 DATASET. PART II: OVERSAMPLING VERSUS UNDERSAMPLING - A CRITICAL COMPARISON. (Russian)
Authors: Diaconu, Bogdan, Anghelescu, Lucica, Cruceru, Mihai, Popa, Marius-Eremia Vlaicu
Source: Annals of 'Constantin Brancusi' University of Targu-Jiu. Engineering Series / Analele Universităţii Constantin Brâncuşi din Târgu-Jiu. Seria Inginerie; 2022, Issue 2, p54-61, 8p
Subject Terms: RANDOM forest algorithms, SAMPLING methods, LOGISTIC regression analysis, PULSARS
Abstract: This paper discusses several methods of dealing with unbalanced datasets, with application to the High Time Resolution Universe Survey dataset. HTRU2 is a labelled dataset with a ratio of positive to negative instances of approximately 0.1. Undersampling and oversampling methods are discussed and tested, and the results are compared. It was found that the Synthetic Minority Oversampling Technique (SMOTE) provides the best performance, as assessed by the ROC-AUC metric. Further improvement is expected from combining oversampling and undersampling. [ABSTRACT FROM AUTHOR]
Database: Supplemental Index