AITopics | Reinforcement Learning

MoVie: Visual Model-Based Policy Adaptation for View Generalization Sizhe Yang

Neural Information Processing Systems Feb-11-2026, 02:46:31 GMT

Visual Reinforcement Learning (RL) agents trained on limited views face significant challenges in generalizing their learned abilities to unseen views.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

POMO: Policy Optimization with Multiple Optima for Reinforcement Learning

Neural Information Processing Systems Feb-11-2026, 02:21:27 GMT

We introduce Policy Optimization with Multiple Optima (POMO), an end-to-end approach for building such a heuristic solver. POMO is applicable to a wide range of combinatorial optimization (CO) problems.

machine learning, pomo, reinforcement learning, (20 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference

Neural Information Processing Systems Feb-11-2026, 02:20:49 GMT

Specifically, instead of directly measuring the divergence with paired images, we train a reward model with the dataset we construct, consisting of nearly 51,000 images annotated with human preferences.

diffusion model, machine learning, reinforcement learning, (16 more...)

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions

Neural Information Processing Systems Feb-11-2026, 01:51:47 GMT

large language model, machine learning, reinforcement learning, (11 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)

f0eb6568ea114ba6e293f903c34d7488-Paper.pdf

Neural Information Processing Systems Feb-11-2026, 01:42:15 GMT

Several works have shown this vulnerability via adversarial attacks, but existing approaches to improving the robustness of DRL under this setting have had limited success and lack theoretical principles. We show that naively applying existing techniques for improving robustness in classification tasks, like adversarial training, is ineffective for many RL tasks.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Industry:

Information Technology (0.49)
Government > Military (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

The Difficulty of Passive Learning in Deep Reinforcement Learning

Neural Information Processing Systems Feb-11-2026, 01:41:11 GMT

Given the impressive results of deep reinforcement learning, we argue for a need to more clearly understand the challenges in this setting.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Self-Tuning Actor-Critic Algorithm

Neural Information Processing Systems Feb-11-2026, 01:13:18 GMT

The general concept is to represent the training loss as a function of both the agent parameters and the hyperparameters. The agent optimizes the parameters to minimize this loss function with respect to the current hyperparameters.
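
As a rough illustration of the idea in this summary, the sketch below treats a loss as a function of a parameter and a hyperparameter and runs gradient descent on the parameter while the hyperparameter is held fixed. The quadratic loss, the names `theta` and `eta`, and the step size are all hypothetical choices made for illustration; they are not taken from the paper, and the sketch does not cover how the hyperparameters themselves are tuned.

```python
def loss(theta, eta):
    # Hypothetical toy loss: a squared error plus a penalty weighted by eta.
    return (theta - 3.0) ** 2 + eta * theta ** 2


def grad_theta(theta, eta):
    # Analytic gradient of the toy loss with respect to theta, eta held fixed.
    return 2.0 * (theta - 3.0) + 2.0 * eta * theta


def inner_step(theta, eta, lr=0.1):
    # One gradient-descent update of the parameter under the current eta.
    return theta - lr * grad_theta(theta, eta)


def optimize(theta, eta, steps=200, lr=0.1):
    # Repeat the inner update; eta is never changed in this sketch.
    for _ in range(steps):
        theta = inner_step(theta, eta, lr)
    return theta


if __name__ == "__main__":
    # For this toy loss the minimizer is 3 / (1 + eta).
    print(optimize(theta=0.0, eta=0.5))
```

For this toy loss the parameter that minimizes the loss depends on the hyperparameter (it is `3 / (1 + eta)`), which is the sense in which the loss is a function of both.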

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)

f02208a057804ee16ac72ff4d3cec53b-Paper.pdf

Neural Information Processing Systems Feb-11-2026, 01:13:10 GMT

hyperparameter, metaparameter, stacx, (12 more...)

Country: North America > Canada (0.04)

Industry: Leisure & Entertainment > Games (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

9d9258fd703057246cb341e615426e2d-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems Feb-11-2026, 01:03:01 GMT

dataset, learning, reinforcement learning, (16 more...)

Country: Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Genre: Research Report (0.93)

Industry:

Education (0.68)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

XDO: A Double Oracle Algorithm for Extensive-Form Games

Neural Information Processing Systems Feb-11-2026, 01:02:49 GMT

Policy Space Response Oracles (PSRO) is a reinforcement learning (RL) algorithm for two-player zero-sum games that has been empirically shown to find approximate Nash equilibria in large games.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Industry: Leisure & Entertainment > Games (1.00)

Technology: