Periodical
Improving Efficiency of K-Means Algorithm for Large Datasets
Title: | Improving Efficiency of K-Means Algorithm for Large Datasets |
---|---|
Authors: | Swapna, Ch.; Kumar, V.; Murthy, J.V.R. |
Source: | International Journal of Rough Sets and Data Analysis; April 2016, Vol. 3, Issue 2, pp. 1-9 (9 pages) |
Abstract: | Clustering is the process of grouping objects into classes based on their similarities. K-means is a widely studied partition-based algorithm. It is reported to work efficiently on small datasets; on large datasets, however, its computation time becomes a limitation. Researchers have proposed several modifications to address this issue. This paper proposes a novel way of handling large datasets by running K-means in a distributed manner. It exploits parallel processing by dividing the dataset into a number of baskets and then applying K-means to each basket in parallel. The proposed BasketK-means delivers very competitive performance with considerably less computation time. Simulation results on various real and synthetic datasets presented in the work demonstrate the effectiveness of the proposed approach. |
Database: | Supplemental Index |
ISSN: | 2334-4598, 2334-4601 |
DOI: | 10.4018/IJRSDA.2016040101 |
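
The abstract describes dividing a dataset into baskets and applying K-means to each basket in parallel. The following is a minimal Python sketch of that general idea, not the authors' implementation. Several details are assumptions that the abstract does not specify: the random split into baskets, the function names `kmeans` and `basket_kmeans`, the `n_baskets` parameter, and in particular the merge step, which here simply runs K-means once more on the pooled basket centroids. A thread pool is used only to show the structure; for an actual speed-up on CPU-bound pure Python, a process pool would likely be needed.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # requires len(points) >= k
    for _ in range(iters):
        # Assignment step: group points by nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        updated = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def basket_kmeans(points, k, n_baskets=4, seed=0):
    """Hypothetical basket scheme: split, cluster each basket, merge."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    # Assumed split: deal the shuffled points round-robin into baskets.
    baskets = [shuffled[i::n_baskets] for i in range(n_baskets)]
    # Run K-means on every basket concurrently.
    with ThreadPoolExecutor(max_workers=n_baskets) as pool:
        per_basket = list(pool.map(lambda b: kmeans(b, k, seed=seed), baskets))
    # Assumed merge step: cluster the pooled basket centroids.
    pooled = [c for cs in per_basket for c in cs]
    return kmeans(pooled, k, seed=seed)
```

Calling `basket_kmeans(points, k=2, n_baskets=4)` on a list of coordinate tuples returns two merged centroids. Each basket must contain at least `k` points for the initial sampling to succeed.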