Comparison of Similarity Measures in Cluster Analysis with Binary Data.

Bibliographic Details
Title: Comparison of Similarity Measures in Cluster Analysis with Binary Data.
Language: English
Authors: Finch, Holmes; Huynh, Huynh
Peer Reviewed: N
Page Count: 20
Publication Date: 2000
Document Type: Reports - Evaluative
Speeches/Meeting Papers
Descriptors: Algorithms, Cluster Analysis, Monte Carlo Methods, Responses, Sample Size, Simulation
Abstract: One set of approaches to the problem of clustering with dichotomous data in cluster analysis (CA) was studied. The techniques developed for clustering with binary data involve calculating distances between observations based on the variables and then applying one of the standard CA algorithms to these distances. One of the groups of distances that are designed for binary data is known collectively as matching coefficients. There are several incarnations of matching coefficients, but all take as their main goal the measurement of response similarity between any two observations. Thus, distance and similarity come to express the same concept with respect to the observations. This study examined four measures of association that are common to four previous studies. Using Monte Carlo simulation, cluster analysis was conducted using the four distance measures. Under the conditions of this study, the four measures performed very much the same in terms of correctly classifying individuals into two clusters based on dichotomous variables. Another interesting result is that clustering solutions were virtually identical for samples of size 240 and 1,000. (Contains 6 tables, 6 figures, and 12 references.) (SLD)
Entry Date: 2000
Accession Number: ED442866
Database: ERIC
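
Illustrative note: the abstract describes a two-step procedure, first computing a matching-coefficient distance between every pair of observations and then passing those distances to a standard clustering algorithm. The sketch below shows one way this could look. The record does not name the four measures studied, so simple matching and Jaccard are used here only as familiar examples of matching coefficients. The average-linkage rule, the function names, and the toy `responses` data are all assumptions made for illustration, not the authors' method or data.

```python
from itertools import combinations


def simple_matching(a, b):
    """Share of variables on which two binary vectors agree (1-1 or 0-0)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def jaccard(a, b):
    """Share of 1-1 matches among variables where at least one response is 1."""
    both = sum(x == 1 and y == 1 for x, y in zip(a, b))
    either = sum(x == 1 or y == 1 for x, y in zip(a, b))
    return both / either if either else 1.0


def two_clusters(data, similarity):
    """Average-linkage agglomerative clustering, stopped at two clusters.

    Distance is taken as 1 - similarity, so high similarity means low distance.
    """
    def dist(i, j):
        return 1.0 - similarity(data[i], data[j])

    clusters = [[i] for i in range(len(data))]
    while len(clusters) > 2:
        def linkage(pair):
            # Mean pairwise distance between the members of two clusters.
            p, q = pair
            total = sum(dist(i, j) for i in clusters[p] for j in clusters[q])
            return total / (len(clusters[p]) * len(clusters[q]))

        # Merge the closest pair of clusters (p < q, so deleting q is safe).
        p, q = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: six respondents answering six dichotomous items.
responses = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]

print(two_clusters(responses, simple_matching))
print(two_clusters(responses, jaccard))
```

On this toy data both coefficients recover the same two groups, respondents 0 to 2 and respondents 3 to 5. That agreement is a property of the made-up example and should not be read as a reproduction of the study's Monte Carlo results.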