AITopics | Reinforcement Learning

Conditional Mutual Information for Disentangled Representations in Reinforcement Learning

Neural Information Processing Systems | Apr-30-2026, 10:08:45 GMT

Reinforcement Learning (RL) environments can produce training data with spurious correlations between features due to the amount of training data or its limited feature coverage. This can lead to RL agents encoding these misleading correlations in their latent representation, preventing the agent from generalising if the correlation changes within the environment or when deployed in the real world. Disentangled representations can improve robustness, but existing disentanglement techniques that minimise mutual information between features require independent features, thus they cannot disentangle correlated features. We propose an auxiliary task for RL algorithms that learns a disentangled representation of high-dimensional observations with correlated features by minimising the conditional mutual information between features in the representation. We demonstrate experimentally, using continuous control tasks, that our approach improves generalisation under correlation shifts, as well as improving the training performance of RL algorithms in the presence of correlated features.
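
For reference, the quantity the abstract minimises can be written out with the standard textbook definition. This is only a notational sketch: X and Y stand for two features of the learned representation and Z for whatever variable the method conditions on. These symbols are placeholders, since the abstract does not name the conditioning variable or the estimator used.

```latex
% Unconditional mutual information: zero only if X and Y are independent.
I(X; Y) = \mathbb{E}_{p(x, y)}\!\left[ \log \frac{p(x, y)}{p(x)\, p(y)} \right]

% Conditional mutual information: zero if and only if X and Y are
% independent given Z, even when X and Y are marginally correlated.
I(X; Y \mid Z)
  = \mathbb{E}_{p(x, y, z)}\!\left[ \log \frac{p(x, y \mid z)}{p(x \mid z)\, p(y \mid z)} \right]
  = H(X \mid Z) - H(X \mid Y, Z)
```

Driving I(X; Y) to zero is only achievable when the features are truly independent, whereas I(X; Y | Z) can reach zero for correlated features whose dependence is explained by Z. This is consistent with the abstract's argument for conditioning.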

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.47)

Industry: Education > Educational Setting > Continuing Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

fc8ee7c7ab5b5f6b1615045dfb617ed6-Paper-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 09:54:31 GMT

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Country:

North America > United States (0.28)
Asia > China (0.28)
Europe > France (0.28)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)

DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

Neural Information Processing Systems | Apr-30-2026, 09:53:57 GMT

Learning from human feedback has been shown to improve text-to-image models. These techniques first learn a reward function that captures what humans care about in the task and then improve the models based on the learned reward function. Even though relatively simple approaches (e.g., rejection sampling based on reward scores) have been investigated, fine-tuning text-to-image models with the reward function remains challenging. In this work, we propose using online reinforcement learning (RL) to fine-tune text-to-image models. We focus on diffusion models, defining the fine-tuning task as an RL problem, and updating the pre-trained text-to-image diffusion models using policy gradient to maximize the feedback-trained reward. Our approach, coined DPOK, integrates policy optimization with KL regularization. We conduct an analysis of KL regularization for both RL fine-tuning and supervised fine-tuning. In our experiments, we show that DPOK is generally superior to supervised fine-tuning with respect to both image-text alignment and image quality.
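
As a rough illustration of what "policy optimization with KL regularization" usually means, a generic KL-regularised reward-maximisation objective is sketched here. All symbols are assumptions introduced for this sketch: z is a text prompt, x a generated image, r the learned reward, p_theta the model being fine-tuned, p_pre the frozen pre-trained model, and beta a regularisation weight. The exact objective used for diffusion models in the paper may differ, for example by applying the penalty per denoising step.

```latex
% Generic KL-regularised objective (illustrative, not the paper's exact form)
\max_{\theta}\;
  \mathbb{E}_{z \sim p(z)}\Big[
    \mathbb{E}_{x \sim p_\theta(\cdot \mid z)}\big[ r(x, z) \big]
    - \beta\, \mathrm{KL}\big( p_\theta(\cdot \mid z) \,\|\, p_{\mathrm{pre}}(\cdot \mid z) \big)
  \Big]

% Score-function (policy-gradient) identity for the reward term
\nabla_\theta\, \mathbb{E}_{x \sim p_\theta(\cdot \mid z)}\big[ r(x, z) \big]
  = \mathbb{E}_{x \sim p_\theta(\cdot \mid z)}\big[ r(x, z)\, \nabla_\theta \log p_\theta(x \mid z) \big]
```

The KL term keeps the fine-tuned model close to the pre-trained one, so a larger beta trades reward gain for fidelity to the original model.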

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country: North America > United States (0.28)

Genre:

Instructional Material (0.34)
Research Report > New Finding (0.34)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

AIhub monthly digest: April 2026 – machine learning for particle physics, AI Index Report, and table tennis

AIhub | Apr-30-2026, 09:10:38 GMT

Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we meet PhD students and early-career researchers, find out how machine learning is used for particle physics discoveries, cast an eye over the latest AI Index Report, and watch a robot beating elite players at table tennis. In an article published in Nature this month, Sony AI introduced Ace, a table tennis robot that has beaten professional players in competitive matches. The system combines event-based vision sensors and a control system based on model-free reinforcement learning, as well as state-of-the-art high-speed robot hardware. The ninth edition of the Artificial Intelligence Index Report was published on 13 April 2026.

index report, machine learning, reinforcement learning, (14 more...)

Industry: Leisure & Entertainment > Sports > Tennis (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.36)

f93df618c6907bc0a03222040d70d004-Paper-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 08:53:28 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)

f57ffe47d0b528fbb97901d16bd4eba2-Paper-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 07:58:10 GMT

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country: North America > United States > California > Los Angeles County (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Education (1.00)
Leisure & Entertainment > Games > Computer Games (0.93)
Information Technology (0.93)
Transportation > Ground > Road (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

f31bf160569618084ba9bdc2a8de29d0-Paper-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 07:10:32 GMT

machine learning, reinforcement learning, trajectory, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Momentum Provably Improves Error Feedback!

Neural Information Processing Systems | Apr-30-2026, 06:39:55 GMT

Due to the high communication overhead when training machine learning models in a distributed environment, modern algorithms invariably rely on lossy communication compression. However, when untreated, the errors caused by compression propagate, and can lead to severely unstable behavior, including exponential divergence. Almost a decade ago, Seide et al. [2014] proposed an error feedback (EF) mechanism, which we refer to as EF14, as an immensely effective heuristic for mitigating this issue. However, despite steady algorithmic and theoretical advances in the EF field in the last decade, our understanding is far from complete. In this work we address one of the most pressing issues.
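
The error feedback idea mentioned in the abstract can be illustrated with a minimal single-worker sketch. It shows only the classic mechanism (compress the update plus the stored residual, then carry forward whatever the compressor dropped), not the momentum variant the paper proposes. The top-k compressor, the function names `top_k` and `ef_gradient_descent`, the step size, and the toy quadratic objective are all assumptions chosen for illustration.

```python
import numpy as np


def top_k(v, k):
    """Lossy compressor: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    keep = np.argpartition(np.abs(v), -k)[-k:]
    out[keep] = v[keep]
    return out


def ef_gradient_descent(grad, x0, lr=0.1, k=1, steps=2000):
    """Gradient descent with compressed updates and error feedback (hypothetical helper)."""
    x = np.asarray(x0, dtype=float).copy()
    e = np.zeros_like(x)              # residual the compressor has dropped so far
    for _ in range(steps):
        p = e + lr * grad(x)          # re-inject the residual before compressing
        c = top_k(p, k)               # only c would be sent over the network
        e = p - c                     # remember what was left out this round
        x = x - c                     # apply the compressed update
    return x, e


# Toy objective (an assumption for illustration): f(x) = 0.5 * ||x - target||^2
target = np.array([1.0, -2.0, 3.0])
x_final, residual = ef_gradient_descent(lambda x: x - target, np.zeros(3))
```

In this sketch the residual `e` ensures that coordinates skipped by the compressor are delayed rather than discarded, which is the property that lets compressed updates still accumulate the full gradient information over time.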

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country: Asia (0.27)

Genre: Research Report (0.45)

Technology: