AITopics | Reinforcement Learning

Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation

Long-Fei Li

Neural Information Processing Systems | Oct-10-2025, 05:14:09 GMT

Reinforcement Learning (RL) with function approximation has achieved remarkable success in various applications involving large state and action spaces, such as games [Silver et al., 2016],

algorithm, function approximation, log null 1, (14 more...)

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > Japan > Honshū > Kantō > Chiba Prefecture > Chiba (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.61)

Rethinking Exploration in Reinforcement Learning with Effective Metric-Based Exploration Bonus

Neural Information Processing Systems | Oct-10-2025, 05:00:47 GMT

Additionally, methods that utilize the bisimulation metric for evaluating state discrepancies face a theory-practice gap due to improper approximations in metric learning, particularly struggling with hard exploration tasks.

agent, exploration, exploration bonus, (14 more...)

Country:

Asia > Macao (0.14)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Hong Kong (0.04)
(4 more...)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment > Games > Computer Games (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.84)

GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning

Neural Information Processing Systems | Oct-10-2025, 04:50:23 GMT

Offline Reinforcement Learning (Offline RL) presents the challenge of learning effective decision-making policies from static datasets without any online interaction.

dataset, diffusion model, trajectory, (15 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

665654759cdf2114c0cbe2b8e501e00e-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 04:43:03 GMT

adversary, learner, markov game, (15 more...)

Country:

North America > United States > Maryland > Baltimore (0.04)
North America > United States > Texas (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Middle East > Cyprus > Pafos > Paphos (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Leisure & Entertainment > Games (1.00)
Education (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Game Theory (0.67)
Information Technology > Data Science > Data Mining (0.67)

663bce02a0050c4a11f1eb8a7f1429d3-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 04:42:53 GMT

algorithm, dataset, reference policy, (14 more...)

Country:

North America > United States > Montana (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Robots (0.93)
(2 more...)

661c37f3b098bdee53fd7d9c4ef6964a-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 04:42:30 GMT

international conference, meow, proceedings, (13 more...)

Country:

Asia > Taiwan (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
Europe > Portugal > Braga > Braga (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
(3 more...)

Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs

Neural Information Processing Systems | Oct-10-2025, 04:39:22 GMT

The interaction is usually modeled as Markov Decision Processes (MDPs). Research on MDPs can be broadly divided into two lines based on the reward generation mechanism. The first line of work [Jaksch et al., 2010, Azar et al., 2013, 2017, He et al., 2021] considers the

algorithm, dynamic regret, linear mixture mdp, (15 more...)

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Pre-Trained Multi-Goal Transformers with Prompt Optimization for Efficient Online Adaptation

Neural Information Processing Systems | Oct-10-2025, 04:29:35 GMT

We adopt a multi-armed bandit framework for this process, enhancing prompt selection based on the returns from online trajectories.

dataset, online adaptation, trajectory, (14 more...)

Country: Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Leisure & Entertainment (0.46)
Information Technology (0.46)
Education (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(4 more...)

PEAC: Unsupervised Pre-training for Cross-Embodiment Reinforcement Learning

Neural Information Processing Systems | Oct-10-2025, 04:21:05 GMT

Designing generalizable agents capable of adapting to diverse embodiments has attracted significant attention in Reinforcement Learning (RL) and is critical for deploying RL agents in various real-world applications. Previous Cross-Embodiment RL approaches have focused on transferring knowledge across embodiments within specific tasks.

agent, downstream task, embodiment, (14 more...)

Country: Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: