AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

82389fbff376d1e8aec510916d50d054-Paper-Conference.pdf

Neural Information Processing Systems | Oct-11-2025, 00:28:32 GMT

dataset, reinforcement, reinforcement learning, (13 more...)

Country:

Europe > Hungary > Budapest > Budapest (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Provable Partially Observable Reinforcement Learning with Privileged Information (Yang Cai)

Neural Information Processing Systems | Oct-11-2025, 00:27:23 GMT

Partial observability of the underlying states generally presents significant challenges for reinforcement learning (RL).

algorithm, information, learning, (16 more...)

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Ohio > Cuyahoga County > Cleveland (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Workflow (0.93)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

62a9c80248963f348778a9c0bec060dd-Paper-Conference.pdf

Neural Information Processing Systems | Oct-11-2025, 00:23:56 GMT

algorithm, mdp, reward function, (16 more...)

Country:

Europe > Italy > Lombardy > Milan (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Genre: Research Report > Experimental Study (0.92)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Data Science (0.67)

Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning

Neural Information Processing Systems | Oct-11-2025, 00:23:27 GMT

Additionally, we offer a practical version of WSAC and compare it with existing state-of-the-art safe offline RL algorithms in several continuous control environments.

algorithm, assumption, behavior policy, (12 more...)

Country:

North America > United States > Washington (0.04)
North America > United States > New Jersey (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

5c186016d0844767209dc36e9e61441b-Paper-Conference.pdf

Neural Information Processing Systems | Oct-11-2025, 00:23:12 GMT

DeMa's focus on sequences diminishes approximately exponentially.

arxiv preprint arxiv, attention mechanism, dema, (11 more...)

Country:

Asia > China > Zhejiang Province (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment > Games (0.47)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
(2 more...)

Online Control with Adversarial Disturbance for Continuous-time Linear Systems

Neural Information Processing Systems | Oct-11-2025, 00:22:36 GMT

A major challenge in robotics is to deploy simulated controllers in the real world.

algorithm, assumption, inequality, (14 more...)

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Robots (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

3eec5006051d9544e717067de3220198-Paper-Conference.pdf

Neural Information Processing Systems | Oct-11-2025, 00:18:24 GMT

dormant neuron, neuron, over-active neuron, (15 more...)

Country: Asia > China > Fujian Province > Xiamen (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

3750e99b522bd36a099d2e8b9f0550c7-Paper-Conference.pdf

Neural Information Processing Systems | Oct-11-2025, 00:17:28 GMT

cumulative cost, minimum-cost reach-avoid problem, reach-avoid problem, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Energy (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference

Neural Information Processing Systems | Oct-11-2025, 00:17:24 GMT

Specifically, instead of directly measuring the divergence with paired images, we train a reward model with the dataset we construct, consisting of nearly 51,000 images annotated with human preferences.

arxiv preprint arxiv, dataset, diffusion model, (14 more...)

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting (Xiong-Hui Chen)

Neural Information Processing Systems | Oct-11-2025, 00:12:29 GMT

However, current research for decision-making, like reinforcement learning (RL), has primarily required numerous real interactions with the target environment to learn a skill, while failing to utilize the existing knowledge already summarized in the text.

arxiv preprint arxiv, dataset, knowledge, (14 more...)

Country: