Exploring the Trie of Rules: a fast data structure for the representation of association rules

التفاصيل البيبلوغرافية
العنوان: Exploring the Trie of Rules: a fast data structure for the representation of association rules
المؤلفون: Kudriavtsev, Mikhail, Bezbradica, Marija, McCarren, Andrew
سنة النشر: 2023
المجموعة: Computer Science
مصطلحات موضوعية: Computer Science - Machine Learning, I.2.4, H.3.3, E.1
الوصف: Association rule mining techniques can generate a large volume of sequential data when implemented on transactional databases. Extracting insights from a large set of association rules has been found to be a challenging process. When examining a ruleset, the fundamental question is how to summarise and represent meaningful mined knowledge efficiently. Many algorithms and strategies have been developed to address issue of knowledge extraction; however, the effectiveness of this process can be limited by the data structures. A better data structure can sufficiently affect the speed of the knowledge extraction process. This paper proposes a novel data structure, called the Trie of rules, for storing a ruleset that is generated by association rule mining. The resulting data structure is a prefix-tree graph structure made of pre-mined rules. This graph stores the rules as paths within the prefix-tree in a way that similar rules overlay each other. Each node in the tree represents a rule where a consequent is this node, and an antecedent is a path from this node to the root of the tree. The evaluation showed that the proposed representation technique is promising. It compresses a ruleset with almost no data loss and benefits in terms of time for basic operations such as searching for a specific rule and sorting, which is the base for many knowledge discovery methods. Moreover, our method demonstrated a significant improvement in traversing time, achieving an 8-fold increase compared to traditional data structures.
Comment: 12 pages, 13 figures, preprint of journal article
نوع الوثيقة: Working Paper
URL الوصول: http://arxiv.org/abs/2310.17355
رقم الأكسشن: edsarx.2310.17355
قاعدة البيانات: arXiv