AITopics

Country: Europe (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)

Streit, Julian, Weindel, Franziska, Heckel, Reinhard

Transformer-Based Decoding in Concatenated Coding Schemes Under Synchronization Errors

arXiv.org Artificial IntelligenceNov-4-2025

We consider the reconstruction of a codeword from multiple noisy copies that are independently corrupted by insertions, deletions, and substitutions. This problem arises, for example, in DNA data storage. A common code construction uses a concatenated coding scheme that combines an outer linear block code with an inner code, which can be either a nonlinear marker code or a convolutional code. Outer decoding is done with Belief Propagation, and inner decoding is done with the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm. However, the BCJR algorithm scales exponentially with the number of noisy copies, which makes it infeasible to reconstruct a codeword from more than about four copies. In this work, we introduce BCJRFormer, a transformer-based neural inner decoder. BCJRFormer achieves error rates comparable to the BCJR algorithm for binary and quaternary single-message transmissions of marker codes. Importantly, BCJRFormer scales quadratically with the number of noisy copies. This property makes BCJRFormer well-suited for DNA data storage, where multiple reads of the same DNA strand occur. To lower error rates, we replace the Belief Propagation outer decoder with a transformer-based decoder. Together, these modifications yield an efficient and performant end-to-end transformer-based pipeline for decoding multiple noisy copies affected by insertion, deletion, and substitution errors. Additionally, we propose a novel cross-attending transformer architecture called ConvBCJRFormer. This architecture extends BCJRFormer to decode transmissions of convolutional codewords, serving as an initial step toward joint inner and outer decoding for more general linear code classes.

artificial intelligence, machine learning, sequence, (17 more...)

doi: 10.1109/ISIT63088.2025.11195598

2511.00999

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsAug-19-2025, 01:56:54 GMT

cf6501108fced72ee5c47e2151c4e153-Paper-Conference.pdf

large language model, machine learning, natural language, (20 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Neural Information Processing SystemsAug-17-2025, 18:05:18 GMT

af790b7ae573771689438bbcfc5933fe-Supplemental-Conference.pdf

artificial intelligence, dataset, machine learning, (19 more...)

Country: Europe (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

Neural Information Processing SystemsAug-17-2025, 18:05:14 GMT

af790b7ae573771689438bbcfc5933fe-Paper-Conference.pdf

artificial intelligence, bayesian inference, machine learning, (14 more...)

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.46)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

arXiv.org Artificial IntelligenceSep-19-2024

KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning

Liu, Junnan, Mao, Qianren, Jiang, Weifeng, Li, Jianxin

Knowledge graph reasoning plays a vital role in various applications and has garnered considerable attention. Recently, path-based methods have achieved impressive performance. However, they may face limitations stemming from constraints in message-passing neural networks, such as missing paths and information over-squashing. In this paper, we revisit the application of transformers for knowledge graph reasoning to address the constraints faced by path-based methods and propose a novel method KnowFormer.KnowFormer utilizes a transformer architecture to perform reasoning on knowledge graphs from the message-passing perspective, rather than reasoning by textual information like previous pretrained language model based methods. Specifically, we define the attention computation based on the query prototype of knowledge graph reasoning, facilitating convenient construction and efficient optimization. To incorporate structural information into the self-attention mechanism, we introduce structure-aware modules to calculate query, key, and value respectively. Additionally, we present an efficient attention computation method for better scalability. Experimental results demonstrate the superior performance of KnowFormer compared to prominent baseline methods on both transductive and inductive benchmarks.

ormer, reasoning, revisiting transformer, (11 more...)

2409.12865

Country:

North America > United States (0.46)
Europe > Austria > Vienna (0.14)
Asia > China > Beijing > Beijing (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (0.67)
Education (0.67)
Information Technology (0.46)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Artificial IntelligenceJun-28-2024

PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators

Zeng, Kuo-Hao, Zhang, Zichen, Ehsani, Kiana, Hendrix, Rose, Salvador, Jordi, Herrasti, Alvaro, Girshick, Ross, Kembhavi, Aniruddha, Weihs, Luca

We present PoliFormer (Policy Transformer), an RGB-only indoor navigation agent trained end-to-end with reinforcement learning at scale that generalizes to the real-world without adaptation despite being trained purely in simulation. PoliFormer uses a foundational vision transformer encoder with a causal transformer decoder enabling long-term memory and reasoning. It is trained for hundreds of millions of interactions across diverse environments, leveraging parallelized, multi-machine rollouts for efficient training with high throughput. PoliFormer is a masterful navigator, producing state-of-the-art results across two distinct embodiments, the LoCoBot and Stretch RE-1 robots, and four navigation benchmarks. It breaks through the plateaus of previous work, achieving an unprecedented 85.5% success rate in object goal navigation on the CHORES-S benchmark, a 28.5% absolute improvement. PoliFormer can also be trivially extended to a variety of downstream applications such as object tracking, multi-object navigation, and open-vocabulary navigation with no finetuning.

agent, benchmark, ormer, (13 more...)

2406.20083

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
(10 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Hasan, Adib, Roozbehani, Mardavij, Dahleh, Munther

WeatherFormer: A Pretrained Encoder Model for Learning Robust Weather Representations from Small Datasets

arXiv.org Machine LearningMay-22-2024

This paper introduces WeatherFormer, a transformer encoder-based model designed to learn robust weather features from minimal observations. It addresses the challenge of modeling complex weather dynamics from small datasets, a bottleneck for many prediction tasks in agriculture, epidemiology, and climate science. WeatherFormer was pretrained on a large pretraining dataset comprised of 39 years of satellite measurements across the Americas. With a novel pretraining task and fine-tuning, WeatherFormer achieves state-of-the-art performance in county-level soybean yield prediction and influenza forecasting. Technical innovations include a unique spatiotemporal encoding that captures geographical, annual, and seasonal variations, adapting the transformer architecture to continuous weather data, and a pretraining strategy to learn representations that are robust to missing weather features. This paper for the first time demonstrates the effectiveness of pretraining large transformer encoder models for weather-dependent applications across multiple domains.

dataset, transformer 0, weather measurement, (14 more...)

arXiv.org Machine Learning

2405.17455

Country:

Asia > China (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
North America > United States > New York (0.05)
(12 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

arXiv.org Artificial IntelligenceMar-21-2024

FlowerFormer: Empowering Neural Architecture Encoding using a Flow-aware Graph Transformer

Hwang, Dongyeong, Kim, Hyunju, Kim, Sunwoo, Shin, Kijung

The success of a specific neural network architecture is closely tied to the dataset and task it tackles; there is no one-size-fits-all solution. Thus, considerable efforts have been made to quickly and accurately estimate the performances of neural architectures, without full training or evaluation, for given tasks and datasets. Neural architecture encoding has played a crucial role in the estimation, and graphbased methods, which treat an architecture as a graph, have shown prominent performance. For enhanced representation learning of neural architectures, we introduce FlowerFormer, a powerful graph transformer that incorporates the information flows within a neural architecture. FlowerFormer consists of two key components: (a) bidirectional asynchronous message passing, inspired by the flows; (b) global attention built on flow-based masking. Our extensive experiments demonstrate the superiority of FlowerFormer over existing neural encoding methods, and its effectiveness extends beyond computer vision models to include graph neural networks and auto speech recognition models. Our code is available at http://github.com/y0ngjaenius/CVPR2024_FLOWERFormer.

architecture, neural architecture, ormer, (17 more...)

2403.12821

Country:

North America > United States (0.28)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceJun-14-2023

NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification

Wu, Qitian, Zhao, Wentao, Li, Zenan, Wipf, David, Yan, Junchi

Graph neural networks have been extensively studied for learning with inter-connected data. Despite this, recent evidence has revealed GNNs' deficiencies related to over-squashing, heterophily, handling long-range dependencies, edge incompleteness and particularly, the absence of graphs altogether. While a plausible solution is to learn new adaptive topology for message passing, issues concerning quadratic complexity hinder simultaneous guarantees for scalability and precision in large networks. In this paper, we introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes, as an important building block for a pioneering Transformer-style network for node classification on large graphs, dubbed as \textsc{NodeFormer}. Specifically, the efficient computation is enabled by a kernerlized Gumbel-Softmax operator that reduces the algorithmic complexity to linearity w.r.t. node numbers for learning latent graph structures from large, potentially fully-connected graphs in a differentiable manner. We also provide accompanying theory as justification for our design. Extensive experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs (with up to 2M nodes) and graph-enhanced applications (e.g., image classification) where input graphs are missing.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2306.08385

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)