AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

Neural Information Processing SystemsAug-16-2025, 10:42:27 GMT

There have been many recent advances on provably efficient Reinforcement Learning (RL) in problems with rich observation spaces. However, all these works share a strong realizability assumption about the optimal value function of the true MDP . Such realizability assumptions are often too strong to hold in practice. In this work, we consider the more realistic setting of agnostic RL with rich observation spaces and a fixed class of policies Π that may not contain any near-optimal policy. We provide an algorithm for this setting whose error is bounded in terms of the rank d of the underlying MDP .

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.45)

Industry: Leisure & Entertainment > Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)

Add feedback

9eed867b73ab1eab60583c9d4a789b1b-Paper.pdf

Neural Information Processing SystemsAug-16-2025, 10:42:25 GMT

data mining, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country: Asia > Middle East (0.28)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
(2 more...)

Add feedback

Multi-agent Dynamic Algorithm Configuration

Neural Information Processing SystemsAug-16-2025, 10:39:55 GMT

Unlike AC, DAC can dynamically adjust the algorithm's configuration during

algorithm, hyperparameter, ma-dac, (16 more...)

Neural Information Processing Systems

Country:

Europe > Austria > Vienna (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Quebec > Montreal (0.04)
(15 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)

Add feedback

Large-Scale Retrieval for Reinforcement Learning Peter C. Humphreys

Neural Information Processing SystemsAug-16-2025, 10:23:19 GMT

In addition, new information can be attended to by the agent, without retraining, by simply augmenting the retrieval dataset.

dataset, information, neighbor, (13 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

7eca17ef54789b0663cab421f2e9dbf5-Paper-Conference.pdf

Neural Information Processing SystemsAug-16-2025, 10:23:16 GMT

information, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

SOPE: Spectrum of Off-Policy Estimators

Neural Information Processing SystemsAug-16-2025, 10:22:02 GMT

Consequently, if the parameterization is not rich enough, then it may not be possible to represent the distribution ratios accurately, and when using rich function approximators (such as neural networks) then the optimization procedure may get stuck in sub-optimal saddle points.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

7e9fbd01b3084956dd8a070c7bf30bad-Paper-Conference.pdf

Neural Information Processing SystemsAug-16-2025, 10:03:20 GMT

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits

Neural Information Processing SystemsAug-16-2025, 09:41:16 GMT

Many of the recent triumphs in machine learning are dependent on well-tuned hyperparameters. This is particularly prominent in reinforcement learning (RL) where a small change in the configuration can lead to failure. Despite the importance of tuning hyperparameters, it remains expensive and is often done in a naive and laborious way. A recent solution to this problem is Population Based Training (PBT) which updates both weights and hyperparameters in a single training run of a population of agents. PBT has been shown to be particularly effective in RL, leading to widespread use in the field. However, PBT lacks theoretical guarantees since it relies on random heuristics to explore the hyperparameter space.

agent, hyperparameter, optimization, (13 more...)

Neural Information Processing Systems

Country: