AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

2c23b3c72127e15fedc276722faee927-Paper-Conference.pdf

Neural Information Processing Systems, Oct-8-2025, 08:54:14 GMT

equation, knowledge, knowledge policy, (16 more...)

Country:

North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Distributional Policy Evaluation: a Maximum Entropy approach to Representation Learning

Neural Information Processing Systems, Oct-8-2025, 08:36:09 GMT

In Distributional Reinforcement Learning (D-RL) [Bellemare et al., 2023], an agent aims to estimate [...] [Sutton and Barto, 2018], where the objective is to predict the expected return only. [...] In Section 3, we answer this methodological question, showing that it is possible to reformulate Policy Evaluation in a distributional setting so that its performance index is explicitly intertwined with the representation of the (state or action) spaces.

factorization, policy evaluation, representation, (16 more...)

Country:

Europe > Italy > Lombardy > Milan (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.42)

2a91de02871011d0090e662ffd6f2328-Paper-Conference.pdf

Neural Information Processing Systems, Oct-8-2025, 08:35:36 GMT

estimation error, international conference, neural network, (12 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey (0.04)
North America > United States > Michigan (0.04)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

2a095b46705d7e6f81fc50270fe770c2-Supplemental-Conference.pdf

Neural Information Processing Systems, Oct-8-2025, 08:19:44 GMT

arxiv preprint arxiv, q-function, realizability, (11 more...)

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > Strength High (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

2a095b46705d7e6f81fc50270fe770c2-Paper-Conference.pdf

Neural Information Processing Systems, Oct-8-2025, 08:19:41 GMT

arxiv preprint arxiv, q-function, realizability, (11 more...)

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > Strength High (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)

A Long N-step Surrogate Stage Reward for Deep Reinforcement Learning

Junmin Zhong (Arizona State University), Ruofan Wu (Arizona State University), Jennie Si (Arizona State University)

Neural Information Processing Systems, Oct-8-2025, 08:19:09 GMT

[...] DDPG (TD3) [7], have demonstrated their great potential. [...] Contributions. 1) We introduce a new, simple yet effective surrogate reward [...] They usually proceed as follows.

algorithm, equation, lnss, (14 more...)

Country:

North America > United States > Arizona (1.00)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

29ef811e72b2b97cf18dd5d866b0f472-Paper-Conference.pdf

Neural Information Processing Systems, Oct-8-2025, 08:19:04 GMT

algorithm, drl algorithm, lnss, (14 more...)

Country:

North America > United States > Arizona (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning

Neural Information Processing Systems, Oct-8-2025, 08:17:36 GMT

CCPO can generalize to diverse unseen constraint thresholds without retraining the policy.

arxiv preprint arxiv, reinforcement, threshold, (12 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)

A Partially Supervised Reinforcement Learning Framework for Visual Active Search: Supplementary Material. A. Policy Network Architecture and Hyperparameter Details

Neural Information Processing Systems, Oct-8-2025, 08:01:03 GMT

Table 13: ANT comparisons when trained with small car as target on xView in multi-query setting.

mp-v, psv, target test, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

A Partially Supervised Reinforcement Learning Framework for Visual Active Search

Neural Information Processing Systems, Oct-8-2025, 08:01:00 GMT

Moreover, query results (e.g., detected search and rescue activity in a particular region) are highly informative about the locations of target objects in other regions, for example, due to spatial

mp-v, psv, target test, (16 more...)

Genre:

Overview (0.67)
Research Report > New Finding (0.67)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)
