AITopics | Markov Models

RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning

Neural Information Processing Systems | Oct-10-2025, 05:38:18 GMT

Reinforcement Learning from Human Feedback (RLHF) has recently surged in popularity, particularly for aligning large language models and other AI systems with human intentions. At its core, RLHF can be viewed as a specialized instance of Preference-based Reinforcement Learning (PbRL), where the preferences specifically originate from human judgments rather than arbitrary evaluators. Despite this connection, most existing approaches in both RLHF and PbRL primarily focus on optimizing a mean reward objective, neglecting scenarios that necessitate risk-awareness, such as AI safety, healthcare, and autonomous driving. These scenarios often operate under a one-episode-reward setting, which makes conventional risk-sensitive objectives inapplicable.

algorithm, objective, trajectory, (15 more...)

Country:

North America > United States > Oregon (0.04)
North America > United States > California > Yolo County > Davis (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)
Workflow (0.68)

Industry:

Transportation > Ground > Road (0.34)
Information Technology > Robotics & Automation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

6e09c213ac18d6375704a4f3ea75c4f8-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 05:31:37 GMT

experiment, fusion, mamba, (16 more...)

Country:

Asia > China (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Africa > Eswatini > Manzini > Manzini (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (0.67)
Information Technology (0.67)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

6df3a719d99bd2479c04114d357003d0-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 05:25:40 GMT

accumulation, agent, cultural accumulation, (15 more...)

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language (0.68)
(2 more...)

Minimum Entropy Coupling with Bottleneck

Neural Information Processing Systems | Oct-10-2025, 05:25:29 GMT

This paper investigates a novel lossy compression framework operating under logarithmic loss, designed to handle situations where the reconstruction distribution diverges from the source distribution.

coupling, deterministic mapping, information, (11 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Ontario > Hamilton (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
(2 more...)

6bdde0373d53d4a501249547084bed43-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 05:15:13 GMT

diamond, international conference, world model, (13 more...)

Country:

North America > United States (0.04)
Europe > Greece > Attica > Athens (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution, Yang Yue

Neural Information Processing Systems | Oct-10-2025, 04:49:39 GMT

MLLM based on each situation at hand.

arxiv preprint arxiv, language model, mllm, (15 more...)

Country:

Asia > Japan > Shikoku > Kagawa Prefecture > Takamatsu (0.04)
Asia > China (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

66b35d2e8d524706f39cc21f5337b002-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 04:43:49 GMT

agent, overfitness, specialization, (15 more...)

Country: Asia > Thailand > Bangkok > Bangkok (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.69)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs

Neural Information Processing Systems | Oct-10-2025, 04:39:22 GMT

The interaction is usually modeled as a Markov Decision Process (MDP). Research on MDPs can be broadly divided into two lines based on the reward generation mechanism. The first line of work [Jaksch et al., 2010; Azar et al., 2013, 2017; He et al., 2021] considers the

algorithm, dynamic regret, linear mixture mdp, (15 more...)

Country: