Detection of Common Subtrees with Identical Label Distribution

Jul-24-2023–arXiv.org Machine Learning

Tree data are ubiquitous, especially in biology and computer science, but they are also non-Euclidean [9], which prevents them from being analysed by classical statistical methods designed for multidimensional data. They therefore require specific tools that take their structured nature into account. Among such techniques, frequent pattern mining [1] consists of identifying patterns, i.e. substructures, that appear often in the data. The more elaborate the patterns sought, the harder the problem: the challenge is to keep the algorithmic complexity low enough that a given family of patterns can be searched in a reasonable time. Different types of patterns have been considered in the literature to analyse tree data (see the survey [16] and the references therein), with a strong interest in a specific family of patterns called subtrees [3, 23]. In these two papers, only subtrees that appear more often than a given threshold are considered. Reverse search [5] is a generic approach for enumerating frequent patterns in a dataset that consists of (i) building an enumeration tree of substructures, and then (ii) pruning it to keep only frequent patterns.
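The two-step scheme of building an enumeration tree and pruning it can be illustrated with a minimal sketch. This is not the algorithm of the paper or of [5]: to keep the example short, it assumes that patterns are sets of items rather than subtrees, and the names `frequent_patterns` and `min_support` are hypothetical. Each pattern has a single parent in the enumeration tree (the pattern without its largest item), so every pattern is visited at most once, and a branch is cut as soon as its support falls below the threshold, since extending a pattern can never increase its support.

```python
from typing import Hashable, Iterable, List, Tuple


def frequent_patterns(
    dataset: Iterable[Iterable[Hashable]], min_support: int
) -> List[Tuple[tuple, int]]:
    """Enumerate every itemset contained in at least `min_support` records.

    Illustrative stand-in: patterns are itemsets, not subtrees.
    """
    records = [frozenset(r) for r in dataset]
    # Fixed item order: it defines the unique parent of each pattern.
    items = sorted(set().union(*records)) if records else []
    results: List[Tuple[tuple, int]] = []

    def support(pattern: frozenset) -> int:
        # Number of records that contain the whole pattern.
        return sum(1 for r in records if pattern <= r)

    def expand(pattern: tuple, start: int) -> None:
        # (i) Children of `pattern` add one item larger than all its items.
        for i in range(start, len(items)):
            child = pattern + (items[i],)
            s = support(frozenset(child))
            # (ii) Prune: an infrequent pattern has no frequent extension.
            if s >= min_support:
                results.append((child, s))
                expand(child, i + 1)

    expand((), 0)
    return results


example = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
found = frequent_patterns(example, min_support=2)
```

On this toy dataset, the pattern `('b', 'c')` occurs in one record only, so it is discarded and `('a', 'b', 'c')` is never reported. For subtrees, the same skeleton would need a canonical form to define the parent of a pattern and a subtree-inclusion test in place of the set-inclusion test, neither of which is shown here.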


