AITopics | Grammars & Parsing

Collaborating Authors

Grammars & Parsing

News Overviews Instructional Materials AI-Alerts Classics

Deep Graph Mating

Neural Information Processing SystemsDec-24-2025, 00:46:08 GMT

In this paper, we introduce the first learning-free model reuse task within the non-Euclidean domain, termed as Deep Graph Mating (Grama). We strive to create a child Graph Neural Network (GNN) that integrates knowledge from pre-trained parent models without requiring re-training, fine-tuning, or annotated labels. To this end, we begin by investigating the permutation invariance property of GNNs, which leads us to develop two vanilla approaches for Grama: Vanilla Parameter Interpolation (VPI) and Vanilla Alignment Prior to Interpolation (VAPI), both employing topology-independent interpolation in the parameter space. However, neither approach has achieved the anticipated results. Through theoretical analysis of VPI and VAPI, we identify critical challenges unique to Grama, including increased sensitivity to parameter misalignment and further the inherent topology-dependent complexities. Motivated by these findings, we propose the Dual-Message Coordination and Calibration (DuMCC) methodology, comprising the Parent Message Coordination (PMC) scheme to optimise the permutation matrices for parameter interpolation by coordinating aggregated messages, and the Child Message Calibration (CMC) scheme to mitigate over-smoothing identified in PMC by calibrating the message statistics within child GNNs. Experiments across diverse domains, including node and graph property prediction, 3D object recognition, and large-scale semantic parsing, demonstrate that the proposed DuMCC effectively enables training-free knowledge transfer, yielding results on par with those of pre-trained models.

artificial intelligence, machine learning, natural language, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.75)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.59)

Add feedback

Multi-modal Dependency Tree for Video Captioning

Neural Information Processing SystemsDec-23-2025, 23:47:24 GMT

Generating fluent and relevant language to describe visual content is critical for the video captioning task. Many existing methods generate captions using sequence models that predict words in a left-to-right order. In this paper, we investigate a graph-structured model for caption generation by explicitly modeling the hierarchical structure in the sentences to further improve the fluency and relevance of sentences. To this end, we propose a novel video captioning method that generates a sentence by first constructing a multi-modal dependency tree and then traversing the constructed tree, where the syntactic structure and semantic relationship in the sentence are represented by the tree topology. To take full advantage of the information from both vision and language, both the visual and textual representation features are encoded into each tree node. Different from existing dependency parsing methods that generate uni-modal dependency trees for language understanding, our method construct s multi-modal dependency trees for language generation of images and videos. We also propose a tree-structured reinforcement learning algorithm to effectively optimize the captioning model where a novel reward is designed by evaluating the semantic consistency between the generated sub-tree and the ground-truth tree. Extensive experiments on several video captioning datasets demonstrate the effectiveness of the proposed method.

machine learning, natural language, reinforcement learning, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)

Add feedback

Non-autoregressive Machine Translation with Probabilistic Context-free Grammar

Neural Information Processing SystemsDec-23-2025, 22:58:54 GMT

Non-autoregressive Transformer(NAT) significantly accelerates the inference of neural machine translation. However, conventional NAT models suffer from limited expression power and performance degradation compared to autoregressive (AT) models due to the assumption of conditional independence among target tokens. To address these limitations, we propose a novel approach called PCFG-NAT, which leverages a specially designed Probabilistic Context-Free Grammar (PCFG) to enhance the ability of NAT models to capture complex dependencies among output tokens. Experimental results on major machine translation benchmarks demonstrate that PCFG-NAT further narrows the gap in translation quality between NAT and AT models. Moreover, PCFG-NAT facilitates a deeper understanding of the generated sentences, addressing the lack of satisfactory explainability in neural machine translation.

artificial intelligence, natural language, proceedings, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Recursive Bayesian Networks: Generalising and Unifying Probabilistic Context-Free Grammars and Dynamic Bayesian Networks

Neural Information Processing SystemsDec-23-2025, 21:18:35 GMT

Probabilistic context-free grammars (PCFGs) and dynamic Bayesian networks (DBNs) are widely used sequence models with complementary strengths and limitations. While PCFGs allow for nested hierarchical dependencies (tree structures), their latent variables (non-terminal symbols) have to be discrete. In contrast, DBNs allow for continuous latent variables, but the dependencies are strictly sequential (chain structure). Therefore, neither can be applied if the latent variables are assumed to be continuous and also to have a nested hierarchical dependency structure. In this paper, we present Recursive Bayesian Networks (RBNs), which generalise and unify PCFGs and DBNs, combining their strengths and containing both as special cases. RBNs define a joint distribution over tree-structured Bayesian networks with discrete or continuous latent variables. The main challenge lies in performing joint inference over the exponential number of possible structures and the continuous variables. We provide two solutions: 1) For arbitrary RBNs, we generalise inside and outside probabilities from PCFGs to the mixed discrete-continuous case, which allows for maximum posterior estimates of the continuous latent variables via gradient descent, while marginalising over network structures.

artificial intelligence, machine learning, natural language, (11 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Low-Rank Constraints for Fast Inference in Structured Models

Neural Information Processing SystemsDec-23-2025, 19:43:40 GMT

Structured distributions, i.e. distributions over combinatorial spaces, are commonly used to learn latent probabilistic representations from observed data. However, scaling these models is bottlenecked by the high computational and memory complexity with respect to the size of the latent representations. Common models such as Hidden Markov Models (HMMs) and Probabilistic Context-Free Grammars (PCFGs) require time and space quadratic and cubic in the number of hidden states respectively. This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models. We show that by viewing the central inference step as a matrix-vector product and using a low-rank constraint, we can trade off model expressivity and speed via the rank. Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces while providing practical speedups.

artificial intelligence, machine learning, natural language, (6 more...)

Neural Information Processing Systems

Industry:

Media > Music (0.61)
Leisure & Entertainment (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.98)

Add feedback

Compositional Generalization via Neural-Symbolic Stack Machines

Neural Information Processing SystemsDec-23-2025, 18:47:52 GMT

Despite achieving tremendous success, existing deep learning models have exposed limitations in compositional generalization, the capability to learn compositional rules and apply them to unseen cases in a systematic manner. To tackle this issue, we propose the Neural-Symbolic Stack Machine (NeSS). It contains a neural network to generate traces, which are then executed by a symbolic stack machine enhanced with sequence manipulation operations. NeSS combines the expressive power of neural sequence models with the recursion supported by the symbolic stack machine. Without training supervision on execution traces, NeSS achieves 100% generalization performance in four domains: the SCAN benchmark of language-driven navigation tasks, the task of few-shot learning of compositional instructions, the compositional machine translation benchmark, and context-free grammar parsing tasks.

artificial intelligence, machine learning, natural language, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Add feedback

Systematic Generalization with Edge Transformers

Neural Information Processing SystemsDec-23-2025, 18:08:42 GMT

Recent research suggests that systematic generalization in natural language understanding remains a challenge for state-of-the-art neural models such as Transformers and Graph Neural Networks. To tackle this challenge, we propose Edge Transformer, a new model that combines inspiration from Transformers and rule-based symbolic AI. The first key idea in Edge Transformers is to associate vector states with every edge, that is, with every pair of input nodes---as opposed to just every node, as it is done in the Transformer model. The second major innovation is a triangular attention mechanism that updates edge representations in a way that is inspired by unification from logic programming. We evaluate Edge Transformer on compositional generalization benchmarks in relational reasoning, semantic parsing, and dependency parsing. In all three settings, the Edge Transformer outperforms Relation-aware, Universal and classical Transformer baselines.

artificial intelligence, natural language, proceedings, (4 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.61)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.85)

Add feedback

AI-Specific Code Smells: From Specification to Detection

Mahmoudi, Brahim, Moha, Naouel, Stiévenart, Quentin, Avellaneda, Florent

arXiv.org Artificial IntelligenceDec-12-2025

The rise of Artificial Intelligence (AI) is reshaping how software systems are developed and maintained. However, AI-based systems give rise to new software issues that existing detection tools often miss. Among these, we focus on AI-specific code smells, recurring patterns in the code that may indicate deeper problems such as unreproducibility, silent failures, or poor model generalization. We introduce SpecDetect4AI, a tool-based approach for the specification and detection of these code smells at scale. This approach combines a high-level declarative Domain-Specific Language (DSL) for rule specification with an extensible static analysis tool that interprets and detects these rules for AI-based systems. We specified 22 AI-specific code smells and evaluated SpecDetect4AI on 826 AI-based systems (20M lines of code), achieving a precision of 88.66% and a recall of 88.89%, outperforming other existing detection tools. Our results show that SpecDetect4AI supports the specification and detection of AI-specific code smells through dedicated rules and can effectively analyze large AI-based systems, demonstrating both efficiency and extensibility (SUS 81.7/100).

large language model, machine learning, programming language, (23 more...)

arXiv.org Artificial Intelligence

2509.20491

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada > Quebec > Montreal (0.05)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

SETUP: Sentence-level English-To-Uniform Meaning Representation Parser

Markle, Emma, Bach, Javier Gutierrez, Wein, Shira

arXiv.org Artificial IntelligenceDec-9-2025

Uniform Meaning Representation (UMR) is a novel graph-based semantic representation which captures the core meaning of a text, with flexibility incorporated into the annotation schema such that the breadth of the world's languages can be annotated (including low-resource languages). While UMR shows promise in enabling language documentation, improving low-resource language technologies, and adding interpretability, the downstream applications of UMR can only be fully explored when text-to-UMR parsers enable the automatic large-scale production of accurate UMR graphs at test time. Prior work on text-to-UMR parsing is limited to date. In this paper, we introduce two methods for English text-to-UMR parsing, one of which fine-tunes existing parsers for Abstract Meaning Representation and the other, which leverages a converter from Universal Dependencies, using prior work as a baseline. Our best-performing model, which we call SETUP, achieves an AnCast score of 84 and a SMATCH++ score of 91, indicating substantial gains towards automatic UMR parsing.

artificial intelligence, computational linguistic, natural language, (18 more...)

arXiv.org Artificial Intelligence

2512.07068

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(9 more...)

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.80)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.66)

Add feedback

Understanding Syntactic Generalization in Structure-inducing Language Models

Arps, David, Sajjad, Hassan, Kallmeyer, Laura

arXiv.org Artificial IntelligenceDec-9-2025

Structure-inducing Language Models (SiLM) are trained on a self-supervised language modeling task, and induce a hierarchical sentence representation as a byproduct when processing an input. SiLMs couple strong syntactic generalization behavior with competitive performance on various NLP tasks, but many of their basic properties are yet underexplored. In this work, we train three different SiLM architectures from scratch: Structformer (Shen et al., 2021), UDGN (Shen et al., 2022), and GPST (Hu et al., 2024b). We train these architectures on both natural language (English, German, and Chinese) corpora and synthetic bracketing expressions. The models are then evaluated with respect to (i) properties of the induced syntactic representations (ii) performance on grammaticality judgment tasks, and (iii) training dynamics. We find that none of the three architectures dominates across all evaluation metrics. However, there are significant differences, in particular with respect to the induced syntactic representations. The Generative Pretrained Structured Transformer (GPST; Hu et al. 2024) performs most consistently across evaluation settings, and outperforms the other models on long-distance dependencies in bracketing expressions. Furthermore, our study shows that small models trained on large amounts of synthetic data provide a useful testbed for evaluating basic model properties.

computational linguistic, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.07969

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Austria > Vienna (0.14)
(25 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback