AITopics | composition function

Compositional Generalization from First Principles

Neural Information Processing SystemsMay-1-2026, 02:03:27 GMT

Leveraging the compositional nature of our world to expedite learning and facilitate generalization is a hallmark of human perception. In machine learning, on the other hand, achieving compositional generalization has proven to be an elusive goal, even for models with explicit compositional priors. To get a better handle on compositional generalization, we here approach it from the bottom up: Inspired by identifiable representation learning, we investigate compositionality as a property of the data-generating process rather than the data itself. This reformulation enables us to derive mild conditions on only the support of the training distribution and the model architecture, which are sufficient for compositional generalization. We further demonstrate how our theoretical framework applies to real-world scenarios and validate our findings empirically. Our results set the stage for a principled theoretical study of compositional generalization.

artificial intelligence, machine learning, object-oriented architecture, (19 more...)

Neural Information Processing Systems

Country: Europe > Germany (0.28)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.34)

Add feedback

15f6a10899f557ce53fe39939af6f930-Paper-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 06:15:34 GMT

generalization, representation, sprite, (16 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Extended Diffeomorphism for Real-Time Motion Replication in Workspaces with Different Spatial Arrangements

Saito, Masaki, Itadera, Shunki, Murakami, Toshiyuki

arXiv.org Artificial IntelligenceSep-3-2025

This paper presents two types of extended diffeomorphism designs to compensate for spatial placement differences between robot workspaces. Teleoperation of multiple robots is attracting attention to expand the utilization of the robot embodiment. Real-time reproduction of robot motion would facilitate the efficient execution of similar tasks by multiple robots. A challenge in the motion reproduction is compensating for the spatial arrangement errors of target keypoints in robot workspaces. This paper proposes a methodology for smooth mappings that transform primary robot poses into follower robot poses based on the predefined key points in each workspace. Through a picking task experiment using a dual-arm UR5 robot, this study demonstrates that the proposed mapping generation method can balance lower mapping errors for precise operation and lower mapping gradients for smooth replicated movement.

artificial intelligence, optimization problem, robot, (17 more...)

arXiv.org Artificial Intelligence

2509.00491

Country: Asia > Japan (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

A Systematic Study of Compositional Syntactic Transformer Language Models

Zhao, Yida, Xve, Hao, Hu, Xiang, Tu, Kewei

arXiv.org Artificial IntelligenceJul-1-2025

Syntactic language models (SLMs) enhance Transformers by incorporating syntactic biases through the modeling of linearized syntactic parse trees alongside surface sentences. This paper focuses on compositional SLMs that are based on constituency parse trees and contain explicit bottom-up composition of constituent representations. We identify key aspects of design choices in existing compositional SLMs and propose a unified framework encompassing both existing models and novel variants. We conduct a comprehensive empirical evaluation of all the variants in our framework across language modeling, syntactic generalization, summarization, dialogue, and inference efficiency. Based on the experimental results, we make multiple recommendations on the design of compositional SLMs. Our code is released at https://github.com/zhaoyd1/compositional_SLMs.

computational linguistic, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2506.22978

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(11 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Compositional Generalization from First Principles

Wiedemer, Thaddäus, Mayilvahanan, Prasanna, Bethge, Matthias, Brendel, Wieland

arXiv.org Artificial IntelligenceJul-10-2023

Leveraging the compositional nature of our world to expedite learning and facilitate generalization is a hallmark of human perception. In machine learning, on the other hand, achieving compositional generalization has proven to be an elusive goal, even for models with explicit compositional priors. To get a better handle on compositional generalization, we here approach it from the bottom up: Inspired by identifiable representation learning, we investigate compositionality as a property of the data-generating process rather than the data itself. This reformulation enables us to derive mild conditions on only the support of the training distribution and the model architecture, which are sufficient for compositional generalization. We further demonstrate how our theoretical framework applies to real-world scenarios and validate our findings empirically. Our results set the stage for a principled theoretical study of compositional generalization.

artificial intelligence, generalization, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2307.05596

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)

Add feedback

Composition, Attention, or Both?

Yoshida, Ryo, Oseki, Yohei

arXiv.org Artificial IntelligenceMay-10-2023

In this paper, we propose a novel architecture called Composition Attention Grammars (CAGs) that recursively compose subtrees into a single vector representation with a composition function, and selectively attend to previous structural information with a self-attention mechanism. We investigate whether these components -- the composition function and the self-attention mechanism -- can both induce human-like syntactic generalization. Specifically, we train language models (LMs) with and without these two components with the model sizes carefully controlled, and evaluate their syntactic generalization performance against six test circuits on the SyntaxGym benchmark. The results demonstrated that the composition function and the self-attention mechanism both play an important role to make LMs more human-like, and closer inspection of linguistic phenomenon implied that the composition function allowed syntactic features, but not semantic features, to percolate into subtree representations.

computational linguistic, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2210.12958

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
(9 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Confidence Composition for Monitors of Verification Assumptions

Ruchkin, Ivan, Cleaveland, Matthew, Ivanov, Radoslav, Lu, Pengyuan, Carpenter, Taylor, Sokolsky, Oleg, Lee, Insup

arXiv.org Artificial IntelligenceNov-3-2021

Closed-loop verification of cyber-physical systems with neural network controllers offers strong safety guarantees under certain assumptions. It is, however, difficult to determine whether these guarantees apply at run time because verification assumptions may be violated. To predict safety violations in a verified system, we propose a three-step framework for monitoring the confidence in verification assumptions. First, we represent the sufficient condition for verified safety with a propositional logical formula over assumptions. Second, we build calibrated confidence monitors that evaluate the probability that each assumption holds. Third, we obtain the confidence in the verification guarantees by composing the assumption monitors using a composition function suitable for the logical formula. Our framework provides theoretical bounds on the calibration and conservatism of compositional monitors. In two case studies, we demonstrate that the composed monitors improve over their constituents and successfully predict safety violations.

assumption, composition, composition function, (14 more...)

arXiv.org Artificial Intelligence

2111.03782

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
(3 more...)

Add feedback

Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering

Min, Sewon, Chen, Danqi, Zettlemoyer, Luke, Hajishirzi, Hannaneh

arXiv.org Artificial IntelligenceNov-10-2019

This paper presents a general approach for open-domain question answering (QA) that models interactions between paragraphs using structural information from a knowledge base. We first describe how to construct a graph of passages from a large corpus, where the relations are either from the knowledge base or the internal structure of Wikipedia. We then introduce a reading comprehension model which takes this graph as an input, to better model relationships across pairs of paragraphs. This approach consistently outperforms competitive baselines in three open-domain QA datasets, WebQuestions, Natural Questions and TriviaQA, improving the pipeline-based state-of-the-art by 3--13%.

eader, graph, relation, (16 more...)

arXiv.org Artificial Intelligence

1911.03868

Country:

North America > United States > Minnesota > Hennepin County (0.14)
Asia > China (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.74)

Add feedback

Measuring Compositionality in Representation Learning

Andreas, Jacob

arXiv.org Machine LearningFeb-19-2019

Many machine learning algorithms represent input data with vector embeddings or discrete codes. When inputs exhibit compositional structure (e.g. objects built from parts or procedures from subroutines), it is natural to ask whether this compositional structure is reflected in the the inputs' learned representations. While the assessment of compositionality in languages has received significant attention in linguistics and adjacent fields, the machine learning literature lacks general-purpose tools for producing graded measurements of compositional structure in more general (e.g. vector-valued) representation spaces. We describe a procedure for evaluating compositionality by measuring how well the true representation-producing model can be approximated by a model that explicitly composes a collection of inferred representational primitives. We use the procedure to provide formal and empirical characterizations of compositional structure in a variety of settings, exploring the relationship between compositionality and learning dynamics, human judgments, representational similarity, and generalization.

compositionality, proceedings, representation, (15 more...)

arXiv.org Machine Learning

1902.07181

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Education > Curriculum > Subject-Specific Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(2 more...)

Add feedback

Learning Semantic Representations for Novel Words: Leveraging Both Form and Context

Schick, Timo, Schütze, Hinrich

arXiv.org Artificial IntelligenceNov-9-2018

Word embeddings are a key component of high-performing natural language processing (NLP) systems, but it remains a challenge to learn good representations for novel words on the fly, i.e., for words that did not occur in the training data. The general problem setting is that word embeddings are induced on an unlabeled training corpus and then a model is trained that embeds novel words into this induced embedding space. Currently, two approaches for learning embeddings of novel words exist: (i) learning an embedding from the novel word's surface-form (e.g., subword n-grams) and (ii) learning an embedding from the context in which it occurs. In this paper, we propose an architecture that leverages both sources of information - surface-form and context - and show that it results in large increases in embedding quality. Our architecture obtains state-of-the-art results on the Definitional Nonce and Contextual Rare Words datasets. As input, we only require an embedding set and an unlabeled corpus for training our architecture to produce embeddings appropriate for the induced embedding space. Thus, our model can easily be integrated into any existing NLP system and enhance its capability to handle novel words.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

1811.03866

Genre: Research Report (0.50)

Industry: Information Technology (0.55)

Technology: