AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
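
The trial-and-error idea in the quotation can be illustrated with a minimal sketch. The setting below (a three-action Bernoulli bandit), the function name `epsilon_greedy_bandit`, and all parameter values are assumptions chosen for illustration and are not taken from the quoted passage. The learner is never told which action is best: it tries actions, observes rewards, and keeps a running average per action.

```python
import random


def epsilon_greedy_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Estimate action values by trial and error (hypothetical helper).

    reward_probs: assumed probability that each action pays a reward of 1.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    estimates = [0.0] * n_actions  # running mean reward per action
    counts = [0] * n_actions       # number of times each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a uniformly random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the highest current estimate.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The environment returns a numerical reward signal (0 or 1).
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental update of the sample mean for the chosen action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


estimates, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("best action:", best_action)
print("estimates:", [round(q, 2) for q in estimates])
```

With a small exploration rate the learner spends most of its steps on whichever action currently looks best, while still sampling the others often enough to correct early mistakes.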

Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning

Neural Information Processing Systems Feb-11-2026, 06:07:23 GMT

Lemma 2. Suppose Assumptions 1 and 2 hold.

machine learning, neural information processing system, reinforcement learning, (15 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

cec2346566ba8ecd04bfd992fd193fb3-Paper.pdf

Neural Information Processing Systems Feb-11-2026, 05:58:21 GMT

arxiv preprint arxiv, machine learning, reinforcement learning, (12 more...)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

a8c9f9ccc45771d2fd06bcd04ff3442e-Paper-Conference.pdf

Neural Information Processing Systems Feb-11-2026, 05:50:58 GMT

Under this assumption, we introduce IMED-RL and prove that its regret upper bound asymptotically matches the regret lower bound.

data mining, machine learning, reinforcement learning, (20 more...)

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Data Science > Data Mining > Big Data (0.47)

The Policy-gradient Placement and Generative Routing Neural Networks for Chip Design

Neural Information Processing Systems Feb-11-2026, 05:43:24 GMT

Unlike traditional heuristic solvers, this paper first proposes an RL-based model for mixed-size macro placement, which differs from existing learning-based placers that often represent macros with a coarse grid-based mask, while the standard cells are placed via gradient-based GPU acceleration. Second, a one-shot conditional generative routing model, composed of a specially designed input-size-adapting generator and a bi-discriminator, is devised to route the pins within each net in one shot, with the order of nets to route learned adaptively.

machine learning, reinforcement learning, router, (17 more...)

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.46)

Industry: Semiconductors & Electronics (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Author Statement. The authors of this work would like to state that we bear full responsibility for any potential violation

Neural Information Processing Systems Feb-11-2026, 05:41:07 GMT

Table 3 presents the details of the datasets in the HoK1v1 task. Spells set to frenzy. Generally, a level of "1" is used for datasets with the "norm" prefix, while a level [...] This distinction indicates varying levels of difficulty. In the Generalization category, "norm_general" and "hard_general" have their corresponding datasets. For example, to sample the "norm_general" dataset, we let the level-1 model fight with level-0, level-542 [...] For example, in the "norm_hero_general" experiment, we directly use the model trained on "norm_medium" dataset only contains the fixed default hero "luban."

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Industry: Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

464fefa022aaefc85d901317bbf13f85-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems Feb-11-2026, 05:41:04 GMT

dataset, hok3v3, reinforcement learning, (10 more...)

Genre:

Research Report (0.93)
Instructional Material (0.68)

Industry:

Law (0.67)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

4640d5da5888238b9de7e0dbacd2c605-Paper-Conference.pdf

Neural Information Processing Systems Feb-11-2026, 05:32:07 GMT

arxiv preprint arxiv, function approximation, international conference, (10 more...)

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

a862f5788fd09bb6843c694d8120d50c-Paper-Conference.pdf

Neural Information Processing Systems Feb-11-2026, 05:31:23 GMT

agent, demonstration, imitation, (15 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)

PolicyPolicyUpdates

Neural Information Processing Systems Feb-11-2026, 05:23:00 GMT

We tackle this planning issue by extending the policy gradient theory to policy updates with respect to any state density.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

North America > Canada (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)

Enhancing Robustness of Graph Neural Networks on Social Media with Explainable Inverse Reinforcement Learning

Neural Information Processing Systems Feb-11-2026, 05:21:39 GMT

Social media platforms capture diverse attack sequence samples through both machine and manual screening processes. Investigating effective ways to leverage these adversarial samples to enhance robustness is imperative.

machine learning, node, reinforcement learning, (16 more...)

Country: