AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

fdc42b6b0ee16a2f866281508ef56730-Supplemental.pdf

Neural Information Processing SystemsFeb-11-2026, 06:27:52 GMT

algorithm, algorithm 1, step hold, (13 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Risk-SensitiveReinforcementLearning: Near-OptimalRisk-SampleTradeoffinRegret

Neural Information Processing SystemsFeb-11-2026, 06:27:44 GMT

We study risk-sensitive reinforcement learning in episodic Markov decision processes with unknown transition kernels, where the goal is to optimize the total reward under the risk measure of exponential utility. We propose two provably efficient model-free algorithms, Risk-Sensitive Value Iteration (RSVI) and Risk-Sensitive Q-learning (RSQ). These algorithms implement a form of risk-sensitive optimism in the face of uncertainty, which adapts to both riskseeking and risk-averse modes of exploration.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

46c2a9a6f2b2be68682013eb1173c801-Paper-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 06:26:21 GMT

make wood pickaxe requirement, player observation step 32 8, specific interaction requirement, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre:

Research Report (0.87)
Workflow (0.68)

Industry:

Materials > Metals & Mining > Iron (1.00)
Materials > Metals & Mining > Coal (1.00)
Health & Medicine > Consumer Health (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Add feedback

cf708fc1decf0337aded484f8f4519ae-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 06:17:14 GMT

international conference, latent space, representation, (12 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Austria > Tyrol > Innsbruck (0.04)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Robots (0.93)

Add feedback

fd5c905bcd8c3348ad1b35d7231ee2b1-Supplemental.pdf

Neural Information Processing SystemsFeb-11-2026, 06:16:32 GMT

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.74)

Add feedback

AgnosticQ-learningwithFunctionApproximationin DeterministicSystems: Near-OptimalBoundson ApproximationErrorandSampleComplexity

Neural Information Processing SystemsFeb-11-2026, 06:16:25 GMT

Therefore, we help address the open problem on agnosticQ-learning proposed in [Wen and Van Roy,2013].

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

Add feedback

Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning

Neural Information Processing SystemsFeb-11-2026, 06:07:23 GMT

Lemma 2.Suppose Assumptions 1 and 2 hold.

machine learning, neural information processing system, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Add feedback

cec2346566ba8ecd04bfd992fd193fb3-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 05:58:21 GMT

arxiv preprint arxiv, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Add feedback

a8c9f9ccc45771d2fd06bcd04ff3442e-Paper-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 05:50:58 GMT

Underthisassumption,weintroduce IMED-RLandprove that its regret upper bound asymptotically matches the regret lower bound.

data mining, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Add feedback

The Policy-gradient Placement and Generative Routing Neural Networks for Chip Design

Neural Information Processing SystemsFeb-11-2026, 05:43:24 GMT

Distinct from traditional heuristic solvers, this paper on one hand proposes an RL-based model for mixed-size macro placement, which differs from existing learning-based placers that often consider the macro by coarse grid-based mask. While the standard cells are placed via gradient-based GPU acceleration. On the other hand, a one-shot conditional generative routing model, which is composed of a special-designed input-size-adapting generator and a bi-discriminator, is devised to perform one-shot routing to the pins within each net, and the order of nets to route is adaptively learned.

machine learning, reinforcement learning, router, (17 more...)

Neural Information Processing Systems

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.46)

Industry: Semiconductors & Electronics (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback