The suitable distance function for fuzzy C-Means clustering.

Bibliographic Details
Title: The suitable distance function for fuzzy C-Means clustering.
Authors: Eliyanto, Joko; Surono, Sugiyarto
Source: AIP Conference Proceedings; 11/3/2022, Vol. 2578 Issue 1, p1-12, 12p
Subject Terms: CENTROID, LAGRANGE multiplier, FUZZY logic, EUCLIDEAN distance, MACHINE learning, COMPUTER simulation
Geographic Terms: MANHATTAN (New York, N.Y.), CANBERRA (A.C.T.)
Abstract: Fuzzy C-Means clustering is a distance-based form of clustering that applies the concept of fuzzy logic. The clustering process runs iteratively to minimize an objective function. This objective function is the sum of the products of the distance between each data point and the nearest cluster centroid and the degree to which that point belongs to the cluster. Based on the objective function equation, the value of the objective function decreases as the number of iterations increases. This research shows how to choose a suitable distance for Fuzzy C-Means clustering. The right distance satisfies the optimization problem in the Fuzzy C-Means clustering method and produces good cluster quality. The distances considered are the Euclidean, Average, Manhattan, Chebyshev, Minkowski, Minkowski-Chebyshev, and Canberra distances. We use five UCI Machine Learning datasets and two random datasets. We use the Lagrange multiplier method for the optimization. The quality of the resulting clusters is measured by accuracy, the Davies-Bouldin index, purity, and the adjusted Rand index. The experiment shows that the Canberra distance is the best distance, providing the optimum result with a minimum objective function value of 378.185. The suitable distances for the Fuzzy C-Means clustering method are the Euclidean, Average, Manhattan, Minkowski, Minkowski-Chebyshev, and Canberra distances. In numerical simulation, these six distances yield an objective function that becomes fairly constant. Meanwhile, the Chebyshev distance produces an objective function value that fluctuates, so it is not consistent with the optimization problem in the Fuzzy C-Means clustering method. [ABSTRACT FROM AUTHOR]
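The procedure the abstract describes can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes the textbook Fuzzy C-Means formulation, in which the objective is the sum over all points and clusters of u_ij^m times the squared distance, with a fuzzifier m = 2 and the update rules obtained from the Lagrange multiplier derivation for the Euclidean case. The function names (`fcm`, `euclidean`, `canberra`, and so on) are hypothetical, and the Average and Minkowski-Chebyshev distances are omitted because their exact definitions are not given in this record. Reusing the weighted-mean centroid update with non-Euclidean distances is a simplification, so the recorded objective values are only guaranteed to be non-increasing for the Euclidean distance.

```python
import numpy as np

# Distance functions; each reduces over the last (feature) axis so that
# broadcasting X[:, None, :] against C[None, :, :] yields an (n, k) matrix.
def euclidean(x, c):
    return np.sqrt(np.sum((x - c) ** 2, axis=-1))

def manhattan(x, c):
    return np.sum(np.abs(x - c), axis=-1)

def chebyshev(x, c):
    return np.max(np.abs(x - c), axis=-1)

def minkowski(x, c, p=3):
    # p = 3 is an arbitrary illustrative choice, not taken from the paper.
    return np.sum(np.abs(x - c) ** p, axis=-1) ** (1.0 / p)

def canberra(x, c):
    num = np.abs(x - c)
    den = np.abs(x) + np.abs(c)
    # Terms with a zero denominator are treated as 0.
    ratio = np.divide(num, den, out=np.zeros_like(num, dtype=float), where=den > 0)
    return np.sum(ratio, axis=-1)

def fcm(X, n_clusters, dist, m=2.0, n_iter=50, seed=0):
    """Textbook FCM iteration (assumed form); returns memberships,
    centroids, and the objective value recorded at each iteration."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    history = []
    for _ in range(n_iter):
        W = U ** m
        C = (W.T @ X) / W.sum(axis=0)[:, None]   # weighted-mean centroids
        D = np.maximum(dist(X[:, None, :], C[None, :, :]), 1e-12)
        history.append(float(np.sum(W * D ** 2)))  # objective J(U, C)
        inv = D ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)   # membership update
    return U, C, history

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(4.0, 0.5, (30, 2))])
    for name, d in [("euclidean", euclidean), ("manhattan", manhattan),
                    ("chebyshev", chebyshev), ("canberra", canberra)]:
        _, _, hist = fcm(X, 2, d)
        print(name, "first:", round(hist[0], 3), "last:", round(hist[-1], 3))
```

Plotting `history` for each distance is one way to check the kind of behaviour the abstract reports, namely whether the objective settles to a fairly constant value or fluctuates. The values depend on the data and on the assumed update rules, so they are not comparable to the figure of 378.185 quoted above.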
Database: Complementary Index
Description
ISSN: 0094243X
DOI: 10.1063/5.0106185