Academic Journal

Overlapping classification for autocoding system.

Bibliographic Details
Title: Overlapping classification for autocoding system.
Authors: Yukako Toko, Shinya Iijima, Mika Sato-Ilic
Source: Romanian Statistical Review; 2018, Issue 4, p58-73, 16p
Subject Terms: MACHINE learning, SUPPORT vector machines, SUPERVISED learning
Abstract: Coding is the classification of objects (or features) based on given classification codes, and it is frequently required in the field of official statistics. This paper proposes a supervised overlapping multiclass classifier for autocoding. The classifier is implemented in R. The purpose of this study is to efficiently apply this classifier to the coding task of the Family Income and Expenditure Survey in Japan. We previously developed a non-overlapping multiclass classifier that obtains "exclusive" classes. Even though the developed classifier provides high accuracy for the autocoding task, some objects with ambiguous input information are still incorrectly assigned codes. This shows that exclusive classification has a limitation when dealing with uncertainty. To solve this problem, we propose a new classifier that lists multiple candidates in descending order of the degree of reliability as output and assists experts in selecting a correct code from the listed candidate codes. We refer to this proposed classifier as the overlapping multiclass classifier. A new reliability score based on the weights of entropy is employed in the proposed classifier. With this new reliability score, the proposed classifier improves cumulative accuracy and practicability while the advantages of the structural simplicity of the algorithm and practical calculation time remain unchanged. The proposed algorithm is implemented in R to improve its versatility. [ABSTRACT FROM AUTHOR]
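The abstract does not give the reliability score itself, so the following is only an illustrative sketch of the general idea: rank several candidate codes per object and measure cumulative (top-k) accuracy. It is written in Python rather than the authors' R, and everything in it is an assumption made for illustration: the function names, the toy data, and the particular weighting (one minus the normalized Shannon entropy of a feature's code distribution). It should not be read as the scoring formula used in the paper.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Count how often each feature co-occurs with each code."""
    counts = defaultdict(Counter)
    codes = set()
    for features, code in samples:
        codes.add(code)
        for f in features:
            counts[f][code] += 1
    return counts, sorted(codes)


def entropy_weight(code_counts, n_codes):
    """Assumed weight: 1 - normalized entropy (1.0 when a feature maps to one code)."""
    if n_codes <= 1:
        return 1.0
    total = sum(code_counts.values())
    h = -sum((n / total) * math.log(n / total) for n in code_counts.values())
    return 1.0 - h / math.log(n_codes)


def rank_candidates(features, counts, codes, top_k=3):
    """Return up to top_k (code, score) pairs in descending order of score."""
    scores = Counter()
    for f in features:
        if f not in counts:
            continue  # unseen feature contributes nothing
        total = sum(counts[f].values())
        w = entropy_weight(counts[f], len(codes))
        for code, n in counts[f].items():
            scores[code] += w * n / total
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


def cumulative_accuracy(test_samples, counts, codes, top_k):
    """Share of objects whose true code appears among the top_k candidates."""
    hits = 0
    for features, true_code in test_samples:
        candidates = [c for c, _ in rank_candidates(features, counts, codes, top_k)]
        hits += true_code in candidates
    return hits / len(test_samples)


# Hypothetical toy data: (features of an item description, assigned code).
training = [
    (["rice", "white"], "A01"),
    (["rice", "cooked"], "A02"),
    (["bread", "white"], "B01"),
    (["rice", "white"], "A01"),
]
counts, codes = train(training)

# An ambiguous object: "rice" alone was seen with both A01 and A02.
ambiguous = [(["rice"], "A02")]
top1 = cumulative_accuracy(ambiguous, counts, codes, top_k=1)
top2 = cumulative_accuracy(ambiguous, counts, codes, top_k=2)
```

In this toy case an exclusive (top-1) classifier assigns the ambiguous object the wrong code, while a list of two candidates contains the correct one. That is the situation the abstract describes, in which an expert selects the correct code from the ranked candidates.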
Copyright of Romanian Statistical Review is the property of Romanian Statistical Review Publishing House and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Supplemental Index