AITopics | set transformer

Set based Interpolation for Few Task Learning

Neural Information Processing Systems, Apr-25-2026, 06:35:36 GMT

Meta-learning approaches enable machine learning systems to adapt to new tasks from a few examples by leveraging knowledge from related tasks. However, generalization to unseen tasks during meta-testing still requires a large number of meta-training tasks, which is a critical bottleneck for real-world problems that come with only a few tasks, for reasons including the difficulty and cost of constructing tasks. Recently, several task augmentation methods have been proposed to tackle this issue, using domain-specific knowledge to design augmentation techniques that densify the meta-training task distribution.

artificial intelligence, dataset, machine learning, (14 more...)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

f6a8dd1c954c8506aadc764cc32b895e-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-11-2026, 03:58:30 GMT

sequence length, suggestion, transformer, (15 more...)

Genre: Summary/Review (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.32)

b24d516bb65a5a58079f0f3526c87c57-Paper.pdf

Neural Information Processing Systems, Feb-10-2026, 19:00:59 GMT

experiment, representation, slot set encoder, (12 more...)

Country: Asia > South Korea (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

From Mice to Trains: Amortized Bayesian Inference on Graph Data

Jedhoff, Svenja, Semenova, Elizaveta, Raulo, Aura, Meyer, Anne, Bürkner, Paul-Christian

arXiv.org Machine Learning, Jan-6-2026

Graphs arise across diverse domains, from biology and chemistry to social and information networks, as well as in transportation and logistics. Inference on graph-structured data requires methods that are permutation-invariant, scalable across varying sizes and sparsities, and capable of capturing complex long-range dependencies, making posterior estimation on graph parameters particularly challenging. Amortized Bayesian Inference (ABI) is a simulation-based framework that employs generative neural networks to enable fast, likelihood-free posterior inference. We adapt ABI to graph data to address these challenges and perform inference on node-, edge-, and graph-level parameters. Our approach couples permutation-invariant graph encoders with flexible neural posterior estimators in a two-module pipeline: a summary network maps attributed graphs to fixed-length representations, and an inference network approximates the posterior over parameters. In this setting, several neural architectures can serve as the summary network. In this work we evaluate multiple architectures and assess their recovery and calibration on controlled synthetic settings and two real-world domains, biology and logistics.
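To make the summary-network stage of the abstract concrete, here is a minimal sketch of a permutation-invariant map from an attributed graph to a fixed-length vector. It is not one of the encoders the paper evaluates: the function names `graph_summary` and `relabel`, the single round of neighbour aggregation, and the mean pooling are all assumptions chosen for illustration, and the inference network is omitted.

```python
def graph_summary(node_features, edges):
    """Map an attributed undirected graph to a fixed-length vector.

    Hypothetical stand-in for a learned summary network: one round of
    neighbour-sum aggregation followed by mean pooling over nodes.
    """
    n = len(node_features)
    dim = len(node_features[0])
    agg = [[0.0] * dim for _ in range(n)]  # summed neighbour features per node
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        for i in range(dim):
            agg[u][i] += node_features[v][i]
            agg[v][i] += node_features[u][i]
    # Mean pooling ignores node order, so the output is permutation-invariant
    # and has length 2 * dim + 1 regardless of how many nodes the graph has.
    mean_self = [sum(x[i] for x in node_features) / n for i in range(dim)]
    mean_agg = [sum(a[i] for a in agg) / n for i in range(dim)]
    mean_degree = sum(degree) / n
    return mean_self + mean_agg + [mean_degree]


def relabel(node_features, edges, perm):
    """Renumber nodes so that old index k becomes perm[k]."""
    new_features = [None] * len(node_features)
    for old, new in enumerate(perm):
        new_features[new] = node_features[old]
    new_edges = [(perm[u], perm[v]) for u, v in edges]
    return new_features, new_edges


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
    edges = [(0, 1), (1, 2)]
    print(graph_summary(feats, edges))
    print(graph_summary(*relabel(feats, edges, [2, 0, 1])))
```

Relabelling the nodes leaves the summary unchanged up to floating-point rounding, and graphs of different sizes yield vectors of the same length, which is the property a downstream posterior estimator would rely on.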

artificial intelligence, machine learning, transformer, (14 more...)

2601.02241

Country:

Europe > Germany (0.28)
North America > United States > New York (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)

Genre: Research Report (0.84)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.85)

Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL

Neural Information Processing Systems, Dec-25-2025, 14:46:46 GMT

The framework of cooperative Multi-Agent Reinforcement Learning (MARL) with permutation-invariant agents has achieved tremendous empirical success in real-world applications. Unfortunately, the theoretical understanding of this MARL problem is lacking, due to the curse of many agents and the limited exploration of relational reasoning in existing works. In this paper, we verify that the transformer implements complex relational reasoning, and we propose and analyze model-free and model-based offline MARL algorithms with transformer approximators. We prove that the suboptimality gaps of the model-free and model-based algorithms are, respectively, independent of and logarithmic in the number of agents, which mitigates the curse of many agents. These results are consequences of a novel generalization error bound for the transformer and a novel analysis of the Maximum Likelihood Estimate (MLE) of the system dynamics with the transformer. Our model-based algorithm is the first provably efficient MARL algorithm that explicitly exploits the permutation invariance of the agents. Our improved generalization bound may be of independent interest and is applicable to other transformer-related regression problems beyond MARL.
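The permutation invariance that the abstract exploits can be illustrated with a small sketch. It assumes an unparameterised dot-product self-attention layer (queries, keys, and values all equal to the agent states) followed by mean pooling. The names `self_attention` and `pooled_value` are hypothetical, and this is not the paper's approximator or either of its algorithms.

```python
import math


def self_attention(states):
    """Dot-product self-attention with queries = keys = values = states.

    Reordering the input agents reorders the output rows in the same way
    (permutation equivariance).
    """
    dim = len(states[0])
    attended = []
    for query in states:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in states]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        attended.append([
            sum(w * value[i] for w, value in zip(weights, states)) / norm
            for i in range(dim)
        ])
    return attended


def pooled_value(states):
    """Mean-pool the attended agent states into one team-level feature vector."""
    attended = self_attention(states)
    n = len(attended)
    return [sum(row[i] for row in attended) / n for i in range(len(attended[0]))]


if __name__ == "__main__":
    agents = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    shuffled = [agents[2], agents[0], agents[1]]
    print(pooled_value(agents))
    print(pooled_value(shuffled))
```

The attention layer is equivariant and the pooling step makes the final vector invariant, so the two printed vectors agree up to floating-point rounding regardless of how the agents are ordered.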

provable efficiency and application, relational reasoning, set transformer, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)

Convolutional Set Transformer

Chinello, Federico, Boracchi, Giacomo

arXiv.org Artificial Intelligence, Sep-30-2025

We introduce the Convolutional Set Transformer (CST), a novel neural architecture designed to process image sets of arbitrary cardinality that are visually heterogeneous yet share high-level semantics - such as a common category, scene, or concept. Existing set-input networks, e.g., Deep Sets and Set Transformer, are limited to vector inputs and cannot directly handle 3D image tensors. As a result, they must be cascaded with a feature extractor, typically a CNN, which encodes images into embeddings before the set-input network can model inter-image relationships. In contrast, CST operates directly on 3D image tensors, performing feature extraction and contextual modeling simultaneously, thereby enabling synergies between the two processes. This design yields superior performance in tasks such as Set Classification and Set Anomaly Detection and further provides native compatibility with CNN explainability methods such as Grad-CAM, unlike competing approaches that remain opaque. Finally, we show that CSTs can be pre-trained on large-scale datasets and subsequently adapted to new domains and tasks through standard Transfer Learning schemes. To support further research, we release CST-15, a CST backbone pre-trained on ImageNet (https://github.com/chinefed/convolutional-set-transformer).

artificial intelligence, data mining, machine learning, (16 more...)

2509.22889

Country: Europe (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

e332505c4c80ad1d9dc0af26103b672b-Supplemental-Conference.pdf

Neural Information Processing Systems, Aug-19-2025, 13:35:16 GMT

artificial intelligence, dataset, machine learning, (16 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Abundance-Aware Set Transformer for Microbiome Sample Embedding

Yoo, Hyunwoo, Rosen, Gail

arXiv.org Artificial Intelligence, Aug-18-2025

Representing microbiome samples as input to LLMs is essential for downstream tasks such as phenotype prediction and environmental classification. While prior studies have explored embedding-based representations of each microbiome sample, most rely on simple averaging over sequence embeddings, often overlooking the biological importance of taxa abundance. In this work, we propose an abundance-aware variant of the Set Transformer to construct fixed-size sample-level embeddings by weighting sequence embeddings according to their relative abundance. Without modifying the model architecture, we replicate embedding vectors in proportion to their abundance and apply self-attention-based aggregation. Our method outperforms average pooling and unweighted Set Transformers on real-world microbiome classification tasks, achieving perfect performance in some cases. These results demonstrate the utility of abundance-aware aggregation for robust and biologically informed microbiome representation. To the best of our knowledge, this is one of the first approaches to integrate sequence-level abundance into Transformer-based sample embeddings.
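The replication step described above can be sketched in a few lines. This is an illustrative assumption rather than the authors' implementation: `replicate_by_abundance`, `attention_pool`, `mean_pool`, the `total_copies` budget, and the single fixed seed vector are hypothetical, and a real Set Transformer would use learned projections and multi-head attention.

```python
import math


def replicate_by_abundance(embeddings, abundances, total_copies=10):
    """Repeat each embedding roughly in proportion to its relative abundance."""
    total = sum(abundances)
    replicated = []
    for vec, abundance in zip(embeddings, abundances):
        # Assumption: every taxon is kept at least once, even if very rare.
        copies = max(1, round(total_copies * abundance / total))
        replicated.extend([list(vec) for _ in range(copies)])
    return replicated


def attention_pool(vectors, seed):
    """Pool a set of vectors with dot-product attention from one seed query."""
    dim = len(seed)
    scores = [sum(s * x for s, x in zip(seed, vec)) / math.sqrt(dim)
              for vec in vectors]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    norm = sum(weights)
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) / norm
            for i in range(dim)]


def mean_pool(vectors):
    """Plain average pooling, the baseline the abstract compares against."""
    n = len(vectors)
    return [sum(vec[i] for vec in vectors) / n for i in range(len(vectors[0]))]


if __name__ == "__main__":
    taxa_embeddings = [[1.0, 0.0], [0.0, 1.0]]  # toy sequence embeddings
    read_counts = [8, 2]                        # toy abundances
    weighted_set = replicate_by_abundance(taxa_embeddings, read_counts)
    print(mean_pool(taxa_embeddings))               # ignores abundance
    print(attention_pool(weighted_set, [1.0, 0.0])) # sees the abundant taxon more often
```

Because abundance enters only through the number of copies in the input set, the aggregation module itself is unchanged, which matches the abstract's "without modifying the model architecture" framing.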

large language model, machine learning, natural language, (19 more...)

2508.11075

Country: North America > United States > Pennsylvania (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.94)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

encouraged that reviewers find our paper clear and well written (R1, R2, R3) and our method to be theoretically sound

Neural Information Processing Systems, Aug-17-2025, 08:06:33 GMT

We would like to thank the reviewers for their helpful comments and their thorough evaluation of our work. Reversible layers are a technique introduced by Gomez et al. (2017) and are orthogonal [...] In contrast, clustered attention places no such restriction. We will also add Set Transformers to the related work section. Is speech favorable to clustering? We would like to mention that our NLP approximation experiment for GLUE and SQuAD tasks in 4.3 shows that [...] NLP/vision tasks in the long context setting, as suggested.

artificial intelligence, machine learning, sequence length, (17 more...)

Genre: Summary/Review (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.32)
