AITopics | bottleneck layer

Predictive Coding Enhances Meta-RL To Achieve Interpretable Bayes-Optimal Belief Representation Under Partial Observability

Neural Information Processing Systems | Jun-15-2026, 19:32:40 GMT

Learning a compact representation of history is critical for planning and generalization in partially observable environments. While meta-reinforcement learning (RL) agents can attain near Bayes-optimal policies, they often fail to learn the compact, interpretable Bayes-optimal belief states. This representational inefficiency potentially limits the agent's adaptability and generalization capacity. Inspired by predictive coding in neuroscience--which suggests that the brain predicts sensory inputs as a neural implementation of Bayesian inference--and by auxiliary predictive objectives in deep RL, we investigate whether integrating self-supervised predictive coding modules into meta-RL can facilitate learning of Bayes-optimal representations. Through state machine simulation, we show that meta-RL with predictive modules consistently generates more interpretable representations that better approximate Bayes-optimal belief states compared to conventional meta-RL across a wide variety of tasks, even when both achieve optimal policies. In challenging tasks requiring active information seeking, only meta-RL with predictive modules successfully learns optimal representations and policies, whereas conventional meta-RL struggles with inadequate representation learning. Finally, we demonstrate that better representation learning leads to improved generalization. Our results strongly suggest the role of predictive learning as a guiding principle for effective representation learning in agents navigating partial observability.
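To make the idea of a compact Bayes-optimal belief state concrete, here is a minimal sketch under an assumed setting that the abstract does not specify: a two-armed Bernoulli bandit with a uniform Beta(1, 1) prior on each arm. The function names `update_belief` and `posterior_mean` are hypothetical. The point is only that the whole interaction history collapses into a few counts, and that the order of the history does not matter.

```python
from fractions import Fraction

def update_belief(belief, arm, reward):
    """Return the Beta counts after pulling `arm` and seeing a 0/1 reward."""
    alpha, beta = belief[arm]
    updated = list(belief)
    updated[arm] = (alpha + reward, beta + (1 - reward))
    return tuple(updated)

def posterior_mean(belief, arm):
    """Posterior mean reward probability of `arm` under its Beta counts."""
    alpha, beta = belief[arm]
    return Fraction(alpha, alpha + beta)

def belief_from_history(history, prior=((1, 1), (1, 1))):
    """Fold a list of (arm, reward) pairs into a single belief state."""
    belief = prior
    for arm, reward in history:
        belief = update_belief(belief, arm, reward)
    return belief

# Assumed toy history: (arm, reward) pairs.
history = [(0, 1), (0, 1), (1, 0), (0, 0)]
belief = belief_from_history(history)
reordered = belief_from_history(list(reversed(history)))

print(belief)                     # ((3, 2), (1, 2))
print(posterior_mean(belief, 0))  # 3/5
print(belief == reordered)        # True
```

Four numbers summarize a history of any length here, which is the sense in which a belief state is a compact representation of history. A recurrent meta-RL agent is not guaranteed to organize its hidden state this way.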

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.88)
Law > Litigation (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(2 more...)

4d19b37a2c399deace9082d464930022-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 19:40:15 GMT

artificial intelligence, machine learning, probability, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Material

Neural Information Processing Systems | Apr-24-2026, 21:10:38 GMT

This supplemental material provides implementation details, additional comparison experiments, complete proofs, and the checklist for our proposed model. A.1 Implementation Details. Network Architecture: Inspired by [33], we utilize a pre-trained ResNet-50 [20] as the feature extractor for object recognition tasks (i.e., Office-31 [22], Office-Caltech [18] and Office-Home [46]). The penultimate fully-connected layer is replaced with a bottleneck layer and a classifier with weight normalization. Batch normalization is employed to normalize the outputs of the bottleneck layer. For the digit recognition task (i.e., Digits-Five [41]), we utilize a variant of LeNet [27] as the feature extractor and classifier.
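The head described above (bottleneck layer, batch normalization of its outputs, then a weight-normalized classifier) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' code: the 4-dimensional inputs stand in for ResNet-50 features, the bottleneck width of 2 and the 3 classes are arbitrary, batch normalization omits the learned scale and shift, and all function names are hypothetical.

```python
import math

def linear(x, weights, bias):
    """Apply y = W x + b to a single feature vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def batch_norm(batch, eps=1e-5):
    """Normalize each feature to zero mean and unit variance over the batch."""
    n, dims = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n
                 for j in range(dims)]
    return [[(row[j] - means[j]) / math.sqrt(variances[j] + eps)
             for j in range(dims)] for row in batch]

def weight_norm(directions, gains):
    """Reparameterize each weight row as w = g * v / ||v||."""
    rows = []
    for v, g in zip(directions, gains):
        norm = math.sqrt(sum(x * x for x in v))
        rows.append([g * x / norm for x in v])
    return rows

def head(features, w_bottleneck, b_bottleneck, v_cls, g_cls):
    """Bottleneck -> batch norm -> weight-normalized classifier (no bias)."""
    hidden = batch_norm([linear(x, w_bottleneck, b_bottleneck) for x in features])
    w_cls = weight_norm(v_cls, g_cls)
    return [linear(h, w_cls, [0.0] * len(w_cls)) for h in hidden]

# Assumed toy sizes: 3 samples, 4 input features, bottleneck width 2, 3 classes.
features = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 3.0, 2.0], [2.0, 0.0, 1.0, 0.0]]
w_bottleneck = [[0.5, -0.5, 0.0, 1.0], [1.0, 0.0, -1.0, 0.5]]
b_bottleneck = [0.0, 0.0]
v_cls = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
g_cls = [1.0, 1.0, 1.0]

logits = head(features, w_bottleneck, b_bottleneck, v_cls, g_cls)
print(len(logits), len(logits[0]))  # 3 3
```

Weight normalization fixes the norm of each classifier row to its gain, so the scale of the logits is controlled by `g_cls` rather than by the raw direction vectors.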

artificial intelligence, machine learning, probability, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

4d19b37a2c399deace9082d464930022-Supplemental.pdf

Neural Information Processing Systems | Feb-8-2026, 14:05:57 GMT

lemma 2, log 1, probability, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

4d19b37a2c399deace9082d464930022-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 14:05:54 GMT

adversarial example, lemma 2, probability, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Predictive Coding Enhances Meta-RL To Achieve Interpretable Bayes-Optimal Belief Representation Under Partial Observability

Kuo, Po-Chen, Hou, Han, Dabney, Will, Walker, Edgar Y.

arXiv.org Artificial Intelligence | Oct-28-2025

Learning a compact representation of history is critical for planning and generalization in partially observable environments. While meta-reinforcement learning (RL) agents can attain near Bayes-optimal policies, they often fail to learn the compact, interpretable Bayes-optimal belief states. This representational inefficiency potentially limits the agent's adaptability and generalization capacity. Inspired by predictive coding in neuroscience--which suggests that the brain predicts sensory inputs as a neural implementation of Bayesian inference--and by auxiliary predictive objectives in deep RL, we investigate whether integrating self-supervised predictive coding modules into meta-RL can facilitate learning of Bayes-optimal representations. Through state machine simulation, we show that meta-RL with predictive modules consistently generates more interpretable representations that better approximate Bayes-optimal belief states compared to conventional meta-RL across a wide variety of tasks, even when both achieve optimal policies. In challenging tasks requiring active information seeking, only meta-RL with predictive modules successfully learns optimal representations and policies, whereas conventional meta-RL struggles with inadequate representation learning. Finally, we demonstrate that better representation learning leads to improved generalization. Our results strongly suggest the role of predictive learning as a guiding principle for effective representation learning in agents navigating partial observability.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2510.22039

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.88)
Law > Litigation (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(2 more...)

4d19b37a2c399deace9082d464930022-Supplemental.pdf

Neural Information Processing Systems | Aug-14-2025, 09:20:38 GMT

artificial intelligence, machine learning, probability, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Impact of Bottleneck Layers and Skip Connections on the Generalization of Linear Denoising Autoencoders

Ham, Jonghyun, Fleissner, Maximilian, Ghoshdastidar, Debarghya

arXiv.org Machine Learning | Jun-2-2025

Modern deep neural networks exhibit strong generalization even in highly overparameterized regimes. Significant progress has been made to understand this phenomenon in the context of supervised learning, but for unsupervised tasks such as denoising, several open questions remain. While some recent works have successfully characterized the test error of the linear denoising problem, they are limited to linear models (one-layer networks). In this work, we focus on two-layer linear denoising autoencoders trained under gradient flow, incorporating two key ingredients of modern deep learning architectures: a low-dimensional bottleneck layer that effectively enforces a rank constraint on the learned solution, as well as the possibility of a skip connection that bypasses the bottleneck. We derive closed-form expressions for all critical points of this model under product regularization, and in particular describe its global minimizer under the minimum-norm principle. From there, we derive the test risk formula in the overparameterized regime, both for models with and without skip connections. Our analysis reveals two interesting phenomena: Firstly, the bottleneck layer introduces an additional complexity measure akin to the classical bias-variance trade-off -- increasing the bottleneck width reduces bias but introduces variance, and vice versa. Secondly, a skip connection can mitigate the variance in denoising autoencoders -- especially when the model is mildly overparameterized. We further analyze the impact of skip connections in denoising autoencoders using random matrix theory and support our claims with numerical evidence.
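The rank constraint mentioned in the abstract can be checked on a toy example. The sketch below assumes a two-layer linear map x -> W2 W1 x with W1 of shape k x d and W2 of shape d x k, and models the skip connection as x -> W2 W1 x + x. The matrices, the sizes d = 4 and k = 2, and the helper names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        # First row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

d, k = 4, 2
encoder = [[1, 0, 2, -1], [0, 1, 1, 3]]        # W1, shape k x d
decoder = [[1, 0], [0, 1], [2, 1], [1, -1]]    # W2, shape d x k

bottleneck_map = matmul(decoder, encoder)      # d x d, rank at most k
identity = [[1 if i == j else 0 for j in range(d)] for i in range(d)]
skip_map = [[bottleneck_map[i][j] + identity[i][j] for j in range(d)]
            for i in range(d)]

rank_without_skip = rank(bottleneck_map)
rank_with_skip = rank(skip_map)
print(rank_without_skip, rank_with_skip)  # 2 4
```

Without the skip connection the end-to-end map can never exceed rank k, however the weights are trained. With the skip connection the identity term lets the overall map be full rank while the learned part stays low rank.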

artificial intelligence, machine learning, skip connection, (18 more...)

arXiv.org Machine Learning

2505.24668

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Attention GhostUNet++: Enhanced Segmentation of Adipose Tissue and Liver in CT Images

Hayat, Mansoor, Aramvith, Supavadee, Bhattacharjee, Subrata, Ahmad, Nouman

arXiv.org Artificial Intelligence | Apr-17-2025

Accurate segmentation of abdominal adipose tissue, including subcutaneous (SAT) and visceral adipose tissue (VAT), along with liver segmentation, is essential for understanding body composition and associated health risks such as type 2 diabetes and cardiovascular disease. This study proposes Attention GhostUNet++, a novel deep learning model incorporating Channel, Spatial, and Depth Attention mechanisms into the Ghost UNet++ bottleneck for automated, precise segmentation. Evaluated on the AATTCT-IDS and LiTS datasets, the model achieved Dice coefficients of 0.9430 for VAT, 0.9639 for SAT, and 0.9652 for liver segmentation, surpassing baseline models. Despite minor limitations in boundary detail segmentation, the proposed model significantly enhances feature refinement, contextual understanding, and computational efficiency, offering a robust solution for body composition analysis. Clinical relevance -- The Attention GhostUNet++ model offers a significant advancement in the automated segmentation of adipose tissue and liver regions from CT images.
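For readers unfamiliar with the metric reported above, the Dice coefficient of a predicted mask P against a reference mask T is 2|P ∩ T| / (|P| + |T|). The sketch below computes it for flattened binary masks. The function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration and are not taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for flat 0/1 masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Toy flattened masks: 1 marks a pixel labeled as the structure of interest.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]

score = dice_coefficient(pred, target)
print(round(score, 4))  # 0.6667
```

Here two pixels overlap and each mask has three foreground pixels, giving 2 * 2 / (3 + 3). A score near 0.96, as reported for liver and SAT, means the predicted and reference regions overlap almost completely.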

artificial intelligence, machine learning, segmentation, (17 more...)

arXiv.org Artificial Intelligence

2504.11491

Country:

Europe > Spain (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.71)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Analytical Discovery of Manifold with Machine Learning

Shen, Yafei, Ma, Huan-Fei, Yang, Ling

arXiv.org Machine Learning | Apr-3-2025

Yafei Shen, Huan-Fei Ma, and Ling Yang. School of Mathematical Sciences, Soochow University, Suzhou 215006, China. Abstract: Understanding low-dimensional structures within high-dimensional data is crucial for visualization, interpretation, and denoising in complex datasets. Despite the advancements in manifold learning techniques, key challenges--such as limited global insight and the lack of interpretable analytical descriptions--remain unresolved. In this work, we introduce a novel framework, GAMLA (Global Analytical Manifold Learning using Auto-encoding). GAMLA employs a two-round training process within an auto-encoding framework to derive both character and complementary representations for the underlying manifold. With the character representation, the manifold is represented by a parametric function that unfolds the manifold to provide a global coordinate. With the complementary representation, an approximate explicit manifold description is developed, offering a global and analytical representation of smooth manifolds underlying high-dimensional datasets. This enables the analytical derivation of geometric properties such as curvature and normal vectors. Moreover, we find that the two representations together decompose the whole latent space and can thus characterize the local spatial structure surrounding the manifold, proving particularly effective in anomaly detection and categorization. Through extensive experiments on benchmark datasets and real-world applications, GAMLA demonstrates its ability to achieve computational efficiency and interpretability while providing precise geometric and structural insights. This framework bridges the gap between data-driven manifold learning and analytical geometry, presenting a versatile tool for exploring the intrinsic properties of complex data sets.
1 Introduction: Discovering low-dimensional structures, particularly their geometric properties, from high-dimensional data clouds enables visualization, denoising, and interpretation of complex datasets (Meilă & Zhang, 2023; Belkin & Niyogi, 2003; van der Maaten & Hinton, 2008; McInnes & Healy, 2018; Luo & Hu, 2020). As a result, the concept of manifold learning has attracted significant attention, leading to numerous breakthroughs over the past two decades.

data mining, machine learning, manifold, (19 more...)

arXiv.org Machine Learning

2504.02511

Country: