Association measures for interval variables
Title: | Association measures for interval variables |
---|---|
Authors: | Rui Valadas, Margarida Azeitona, Antonio G. Pacheco, M. Rosário Oliveira |
Source: | Advances in Data Analysis and Classification. 16:491-520 |
Publisher Information: | Springer Science and Business Media LLC, 2021. |
Publication Year: | 2021 |
Subject Terms: | FOS: Computer and information sciences, Statistics and Probability, Structure (mathematical logic), Theoretical computer science, Computer science, Applied Mathematics, Sampling (statistics), Mathematics - Statistics Theory, Statistics Theory (math.ST), 02 engineering and technology, Interval (mathematics), Covariance, 01 natural sciences, Symbolic data analysis, Field (computer science), Computer Science Applications, Methodology (stat.ME), 010104 statistics & probability, Histogram, FOS: Mathematics, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, 0101 mathematics, Divergence (statistics), Statistics - Methodology |
Description: | Symbolic Data Analysis (SDA) is a relatively new field of statistics that extends conventional data analysis by taking into account intrinsic data variability and structure. Unlike conventional data analysis, in SDA the features characterizing the data can be multi-valued, such as intervals or histograms. SDA has been mainly approached from a sampling perspective. In this work, we propose a model that links the micro-data and macro-data of interval-valued symbolic variables, which takes a populational perspective. Using this model, we derive the micro-data assumptions underlying the various definitions of symbolic covariance matrices proposed in the literature, and show that these assumptions can be too restrictive, raising applicability concerns. We analyze the various definitions using worked examples and four datasets. Our results show that the existence/absence of correlations in the macro-data may not be correctly captured by the definitions of symbolic covariance matrices and that, in real data, there can be a strong divergence between these definitions. Thus, in order to select the most appropriate definition, one must have some knowledge about the micro-data structure. |
ISSN: | 1862-5355 1862-5347 |
Access URL: | https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b33f9ba9c4f7b034aaea765a5fb6a93d https://doi.org/10.1007/s11634-021-00445-8 |
Rights: | OPEN |
Accession Number: | edsair.doi.dedup.....b33f9ba9c4f7b034aaea765a5fb6a93d |
Database: | OpenAIRE |