 fp-tree


Exploring the Trie of Rules: a fast data structure for the representation of association rules

Kudriavtsev, Mikhail, Bezbradica, Marija, McCarren, Andrew

arXiv.org Artificial Intelligence

Association rule mining techniques can generate a large volume of sequential data when implemented on transactional databases. Extracting insights from a large set of association rules has been found to be a challenging process. When examining a ruleset, the fundamental question is how to summarise and represent meaningful mined knowledge efficiently. Many algorithms and strategies have been developed to address the issue of knowledge extraction; however, the effectiveness of this process can be limited by the underlying data structures. A better data structure can significantly affect the speed of the knowledge extraction process. This paper proposes a novel data structure, called the Trie of rules, for storing a ruleset that is generated by association rule mining. The resulting data structure is a prefix-tree graph structure made of pre-mined rules. This graph stores the rules as paths within the prefix-tree in such a way that similar rules overlay each other. Each node in the tree represents a rule whose consequent is the node itself and whose antecedent is the path from that node to the root of the tree. The evaluation showed that the proposed representation technique is promising. It compresses a ruleset with almost no data loss and saves time on basic operations such as searching for a specific rule and sorting, which are the basis for many knowledge discovery methods. Moreover, our method demonstrated a significant improvement in traversal time, achieving an 8-fold speed-up compared to traditional data structures.
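The node-as-rule idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `RuleNode` and `TrieOfRules`, the stored metrics (support and confidence), and the alphabetical ordering of antecedent items are all assumptions made for illustration. The only property taken from the abstract is that a rule is a path in a prefix tree, with the last node as the consequent and the path back to the root as the antecedent.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class RuleNode:
    """Hypothetical node: the item is the consequent of the rule ending here."""
    item: str
    # Metrics stay None for nodes that only serve as a prefix of longer rules.
    support: Optional[float] = None
    confidence: Optional[float] = None
    children: Dict[str, "RuleNode"] = field(default_factory=dict)


class TrieOfRules:
    """Illustrative prefix tree in which each root-to-node path is one rule."""

    def __init__(self) -> None:
        self.root = RuleNode(item="<root>")

    @staticmethod
    def _path(antecedent: Iterable[str], consequent: str) -> list:
        # Assumption: a fixed (alphabetical) item order, so that rules with
        # overlapping antecedents share a prefix and overlay each other.
        return sorted(antecedent) + [consequent]

    def insert(self, antecedent, consequent, support, confidence) -> None:
        node = self.root
        for item in self._path(antecedent, consequent):
            if item not in node.children:
                node.children[item] = RuleNode(item)
            node = node.children[item]
        node.support = support
        node.confidence = confidence

    def find(self, antecedent, consequent) -> Optional[Tuple[float, float]]:
        node = self.root
        for item in self._path(antecedent, consequent):
            node = node.children.get(item)
            if node is None:
                return None
        if node.support is None:  # path exists, but no rule was stored here
            return None
        return node.support, node.confidence
```

In this sketch, looking up a rule costs one dictionary access per item in the rule, regardless of how many rules are stored, which is the kind of search behaviour a prefix tree is expected to give. The compression and timing results reported in the abstract are the authors' own and are not reproduced by this example.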


MachineX: Understanding FP-Tree Construction - DZone AI

#artificialintelligence

In my previous blog, MachineX: Why No One Uses an Apriori Algorithm for Association Rule Learning, we discussed one of the first algorithms in association rule learning, the Apriori algorithm. Although it is simple and clear, it has some weaknesses, as discussed in that blog. A significant improvement over the Apriori algorithm is the FP-Growth algorithm. To understand how the FP-Growth algorithm helps in finding frequent itemsets, we first have to understand the data structure it uses to do so, the FP-Tree, which will be our focus in this blog. To put it simply, an FP-Tree is a compressed representation of the input data.
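To make "compressed representation" concrete, here is a minimal sketch of the commonly described FP-Tree construction: count item frequencies, drop infrequent items, sort each transaction by descending frequency, and insert it into a prefix tree with a counter on every node. The names `FPNode` and `build_fp_tree`, the `min_support` parameter, and the alphabetical tie-breaking are assumptions for illustration, not code from the blog. The header table and node links that FP-Growth uses for mining are left out.

```python
from collections import Counter


class FPNode:
    """Hypothetical FP-Tree node: an item plus the number of transactions
    whose sorted prefix passes through this node."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support=1):
    # Pass 1: count in how many transactions each item occurs.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    # Pass 2: insert each transaction, keeping only frequent items and
    # ordering them by descending frequency (ties broken alphabetically,
    # which is an assumption of this sketch).
    root = FPNode(None)
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent
```

Because transactions that share their most frequent items also share a path from the root, the tree usually has far fewer nodes than the total number of items in the input, while the per-node counts preserve the frequency information needed for mining.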