AITopics | tree edit distance

Collaborating Authors

tree edit distance

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Designing ReLU Generative Networks to Enumerate Trees with a Given Tree Edit Distance

Ghafoor, Mamoona, Akutsu, Tatsuya

arXiv.org Artificial IntelligenceOct-14-2025

The generation of trees with a specified tree edit distance has significant applications across various fields, including computational biology, structured data analysis, and image processing. Recently, generative networks have been increasingly employed to synthesize new data that closely resembles the original datasets. However, the appropriate size and depth of generative networks required to generate data with a specified tree edit distance remain unclear. In this paper, we theoretically establish the existence and construction of generative networks capable of producing trees similar to a given tree with respect to the tree edit distance. Specifically, for a given rooted, ordered, and vertex-labeled tree T of size n + 1 with labels from an alphabet Σ, and a non-negative integer d, we prove that all rooted, ordered, and vertex-labeled trees over Σwith tree edit distance at most d from T can be generated using a ReLU-based generative network with size O(n^3 ) and constant depth. The proposed networks were implemented and evaluated for generating trees with up to 21 nodes. Due to their deterministic architecture, the networks successfully generated all valid trees within the specified tree edit distance. In contrast, state-of-the-art graph generative models GraphRNN and GraphGDP, which rely on non-deterministic mechanisms, produced significantly fewer valid trees, achieving validation rates of only up to 35% and 48%, respectively. These findings provide a theoretical foundation towards construction of compact generative models and open new directions for exact and valid tree-structured data generation. An implementation of the proposed networks is available at https://github.com/MGANN-KU/TreeGen_ReLUNetworks.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.10706

Country: Asia (0.28)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

ASSESS: A Semantic and Structural Evaluation Framework for Statement Similarity

Liu, Xiaoyang, Zhu, Tao, Dong, Zineng, Liu, Yuntian, Guo, Qingfeng, Liu, Zhaoxuan, Chen, Yu, Luo, Tao

arXiv.org Artificial IntelligenceSep-29-2025

Statement autoformalization, the automated translation of statements from natural language into formal languages, has seen significant advancements, yet the development of automated evaluation metrics remains limited. Existing metrics for formal statement similarity often fail to balance semantic and structural information. String-based approaches capture syntactic structure but ignore semantic meaning, whereas proof-based methods validate semantic equivalence but disregard structural nuances and, critically, provide no graded similarity score in the event of proof failure. To address these issues, we introduce ASSESS (A Semantic and Structural Evaluation Framework for Statement Similarity), which comprehensively integrates semantic and structural information to provide a continuous similarity score. Our framework first transforms formal statements into Operator Trees to capture their syntactic structure and then computes a similarity score using our novel TransTED (Transformation Tree Edit Distance) Similarity metric, which enhances traditional Tree Edit Distance by incorporating semantic awareness through transformations. For rigorous validation, we present EPLA (Evaluating Provability and Likeness for Autoformalization), a new benchmark of 524 expert-annotated formal statement pairs derived from miniF2F and ProofNet, with labels for both semantic provability and structural likeness. Experiments on EPLA demonstrate that TransTED Similarity outperforms existing methods, achieving state-of-the-art accuracy and the highest Kappa coefficient. The benchmark, and implementation code will be made public soon.

large language model, logic & formal reasoning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2509.22246

Country: Europe (0.46)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.67)
(2 more...)

Add feedback

DocEdit-v2: Document Structure Editing Via Multimodal LLM Grounding

Suri, Manan, Mathur, Puneet, Dernoncourt, Franck, Jain, Rajiv, Morariu, Vlad I, Sawhney, Ramit, Nakov, Preslav, Manocha, Dinesh

arXiv.org Artificial IntelligenceOct-21-2024

Document structure editing involves manipulating localized textual, visual, and layout components in document images based on the user's requests. Past works have shown that multimodal grounding of user requests in the document image and identifying the accurate structural components and their associated attributes remain key challenges for this task. To address these, we introduce the DocEdit-v2, a novel framework that performs end-to-end document editing by leveraging Large Multimodal Models (LMMs). It consists of three novel components: (1) Doc2Command, which simultaneously localizes edit regions of interest (RoI) and disambiguates user edit requests into edit commands; (2) LLM-based Command Reformulation prompting to tailor edit commands originally intended for specialized software into edit instructions suitable for generalist LMMs. (3) Moreover, DocEdit-v2 processes these outputs via Large Multimodal Models like GPT-4V and Gemini, to parse the document layout, execute edits on grounded Region of Interest (RoI), and generate the edited document image. Extensive experiments on the DocEdit dataset show that DocEdit-v2 significantly outperforms strong baselines on edit command generation (2-33%), RoI bounding box detection (12-31%), and overall document editing (1-12\%) tasks.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.16472

Country:

Oceania > Australia (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.04)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Energy (0.46)
Banking & Finance (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Tree edit distance for hierarchical data compatible with HMIL paradigm

Šopík, Břetislav, Strenáčik, Tomáš

arXiv.org Artificial IntelligenceJul-26-2022

In contemporary data analysis in industrial and academic research we often need to work with data that has a hierarchical structure. The analysis of such data is naturally more difficult than the analysis of data with a flat structure because the schema of the hierarchically organized dataset may possess important information which is lost if we ignore it. A common task of a dataset analysis is to evaluate the difference between two of its instances. If the dataset has a flat structure and consists of numerical vectors, appropriate distance function from vector spaces can be used. Similarly, we can utilize the Levenshtein edit distance[1] for comparison of two strings, etc.

data type, edit distance, node, (16 more...)

arXiv.org Artificial Intelligence

2208.00782

Country: Europe > Czechia > Prague (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

An A*-algorithm for the Unordered Tree Edit Distance with Custom Costs

Paaßen, Benjamin

arXiv.org Artificial IntelligenceJul-26-2021

The unordered tree edit distance is a natural metric to compute distances between trees without intrinsic child order, such as representations of chemical molecules. While the unordered tree edit distance is MAX SNP-hard in principle, it is feasible for small cases, e.g. via an A* algorithm. Unfortunately, current heuristics for the A* algorithm assume unit costs for deletions, insertions, and replacements, which limits our ability to inject domain knowledge. In this paper, we present three novel heuristics for the A* algorithm that work with custom cost functions. In experiments on two chemical data sets, we show that custom costs make the A* computation faster and improve the error of a 5-nearest neighbor regressor, predicting chemical properties. We also show that, on these data, polynomial edit distances can achieve similar results as the unordered tree edit distance.

algorithm, edit distance, tree edit distance, (13 more...)

arXiv.org Artificial Intelligence

2108.00953

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Structural Similarity of Boundary Conditions and an Efficient Local Search Algorithm for Goal Conflict Identification

Zhong, Hongzhen, Wan, Hai, Luo, Weilin, Xiao, Zhanhao, Li, Jia, Fang, Biqing

arXiv.org Artificial IntelligenceFeb-22-2021

In goal-oriented requirements engineering, goal conflict identification is of fundamental importance for requirements analysis. The task aims to find the feasible situations which make the goals diverge within the domain, called boundary conditions (BCs). However, the existing approaches for goal conflict identification fail to find sufficient BCs and general BCs which cover more combinations of circumstances. From the BCs found by these existing approaches, we have observed an interesting phenomenon that there are some pairs of BCs are similar in formula structure, which occurs frequently in the experimental cases. In other words, once a BC is found, a new BC may be discovered quickly by slightly changing the former. It inspires us to develop a local search algorithm named LOGION to find BCs, in which the structural similarity is captured by the neighborhood relation of formulae. Based on structural similarity, LOGION can find a lot of BCs in a short time. Moreover, due to the large number of BCs identified, it potentially selects more general BCs from them. By taking experiments on a set of cases, we show that LOGION effectively exploits the structural similarity of BCs. We also compare our algorithm against the two state-of-the-art approaches. The experimental results show that LOGION produces one order of magnitude more BCs than the state-of-the-art approaches and confirm that LOGION finds out more general BCs thanks to a large number of BCs.

algorithm, formulae, logion, (13 more...)

arXiv.org Artificial Intelligence

2102.11482

Country: Asia > China > Guangdong Province > Guangzhou (0.05)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Add feedback

Metric Learning for Ordered Labeled Trees with pq-grams

Shindo, Hikaru, Nishino, Masaaki, Kobayashi, Yasuaki, Yamamoto, Akihiro

arXiv.org Machine LearningMar-9-2020

Computing the similarity between two data points plays a vital role in many machine learning algorithms. Metric learning has the aim of learning a good metric automatically from data. Most existing studies on metric learning for tree-structured data have adopted the approach of learning the tree edit distance. However, the edit distance is not amenable for big data analysis because it incurs high computation cost. In this paper, we propose a new metric learning approach for tree-structured data with pq-grams. The pq-gram distance is a distance for ordered labeled trees, and has much lower computation cost than the tree edit distance. In order to perform metric learning based on pq-grams, we propose a new differentiable parameterized distance, weighted pq-gram distance. We also propose a way to learn the proposed distance based on Large Margin Nearest Neighbors (LMNN), which is a well-studied and practical metric learning scheme. We formulate the metric learning problem as an optimization problem and use the gradient descent technique to perform metric learning. We empirically show that the proposed approach not only achieves competitive results with the state-of-the-art edit distance-based methods in various classification problems, but also solves the classification problems much more rapidly than the edit distance-based methods.

dataset, edit distance, pq-gram distance, (12 more...)

arXiv.org Machine Learning

2003.0396

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
North America > United States > New York (0.04)
Europe > Spain > Canary Islands (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Education (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.52)

Add feedback

Adversarial Edit Attacks for Tree Data

Paaßen, Benjamin

arXiv.org Machine LearningAug-27-2019

Many machine learning models can be attacked with adversarial examples, i.e. inputs close to correctly classified examples that are classified incorrectly. However, most research on adversarial attacks to date is limited to vectorial data, in particular image data. In this contribution, we extend the field by introducing adversarial edit attacks for tree-structured data with potential applications in medicine and automated program analysis. Our approach solely relies on the tree edit distance and a logarithmic number of black-box queries to the attacked classifier without any need for gradient information. We evaluate our approach on two programming and two biomedical data sets and show that many established tree classifiers, like tree-kernel-SVMs and recursive neural networks, can be attacked effectively.

artificial intelligence, machine learning, proceedings, (18 more...)

arXiv.org Machine Learning

1908.09364

Genre: Research Report (0.67)

Industry: Information Technology > Security & Privacy (0.91)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Similarity Measures based on Local Game Trees

Evans, Sabrina, Turrini, Paolo

arXiv.org Artificial IntelligenceFeb-25-2019

We study strategic similarity of game positions in Contribution We study similarity of game positions in two-player extensive games of perfect information, two-player, deterministic games of perfect information, by by looking at the structure of their local game trees, looking at the structure of their local game trees, working with the aim of improving the performance of game with the set of possible moves from each position. We introduce playing agents in detecting forcing continuations.

artificial intelligence, machine learning, similarity, (16 more...)

arXiv.org Artificial Intelligence

1902.09335

Country:

Europe > Hungary > Budapest > Budapest (0.06)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Chess (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Crowdsourcing in Language Classes Can Help Natural Language Processing

Hladká, Barbora (Charles University) | Hana, Jirka (Charles University) | Lukšová, Ivana (Charles University)

AAAI ConferencesOct-31-2014

One way of teaching grammar, namely morphology and syntax, is to visualize sentences as diagrams capturing relationships between words. Similarly, such relationships are captured in a more complex way in treebanks serving as key building stones in modern natural language processing. However, building them is very time consuming, thus we have been seeking for an alternative cheaper and faster way, like crowdsourcing. The purpose of our work is to explore possibility to get sentence diagrams produced by students and teachers. In our pilot study, the object language is Czech, where sentence diagrams are part of elementary school curriculum.

artificial intelligence, natural language, social media, (17 more...)

AAAI Conferences

Second AAAI Conference on Human Computation and Crowdsourcing

Country: Europe > Czechia > Prague (0.05)

Industry:

Education > Educational Setting > K-12 Education (0.73)
Education > Curriculum (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (0.66)

Add feedback