Algorithm Design and Stronger Guarantees for the Improving Multi-Armed Bandits Problem
Avrim Blum, Marten Garicano, Kavya Ravichandran, Dravyansh Sharma
The improving multi-armed bandits problem is a formal model for allocating effort under uncertainty, motivated by scenarios such as investing research effort into new technologies, performing clinical trials, and selecting hyperparameters from learning curves. Each pull of an arm provides a reward that increases monotonically with diminishing returns. A growing line of work has designed algorithms for improving bandits, albeit with somewhat pessimistic worst-case guarantees. Indeed, strong lower bounds of $\Omega(k)$ and $\Omega(\sqrt{k})$ on the multiplicative approximation factor are known for deterministic and randomized algorithms (respectively) relative to the optimal arm, where $k$ is the number of bandit arms. In this work, we propose two new parameterized families of bandit algorithms and bound the sample complexity of learning a near-optimal algorithm from each family using offline data. The first family we define includes the optimal randomized algorithm from prior work. We show that an appropriately chosen algorithm from this family can achieve stronger guarantees, with optimal dependence on $k$, when the arm reward curves satisfy additional properties related to the strength of concavity. Our second family contains algorithms that both guarantee best-arm identification on well-behaved instances and revert to worst-case guarantees on poorly behaved instances. Taking a statistical learning perspective on the bandit reward optimization problem, we achieve stronger data-dependent guarantees without the need to actually verify whether the assumptions are satisfied.
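To make the model concrete, here is a minimal, self-contained Python sketch of an improving-bandits instance. The exponential-saturation curves $f_i(t) = c_i(1 - e^{-r_i t})$ and the greedy marginal-gain baseline are illustrative assumptions for this sketch only; they are not the instances or algorithms analyzed in the paper.

```python
import numpy as np

# Illustrative sketch (not the paper's algorithm): an improving-bandits
# instance where arm i's t-th pull yields f_i(t), monotone increasing
# with diminishing returns (concave), and a naive greedy baseline that
# always pulls the arm with the largest last-observed marginal gain.

rng = np.random.default_rng(0)
k, T = 5, 100                       # number of arms, total pull budget

# Hypothetical concave reward curves f_i(t) = c_i * (1 - exp(-r_i * t)).
c = rng.uniform(0.5, 1.0, size=k)   # asymptotic reward ceilings
r = rng.uniform(0.05, 0.3, size=k)  # improvement rates

def reward(i, t):
    """Reward of arm i on its t-th pull (1-indexed): monotone, concave."""
    return c[i] * (1.0 - np.exp(-r[i] * t))

pulls = np.zeros(k, dtype=int)      # pulls spent on each arm so far
gain = np.full(k, np.inf)           # last marginal gain seen per arm
last = np.zeros(k)                  # last reward seen per arm
total = 0.0

for _ in range(T):
    i = int(np.argmax(gain))        # greedy: chase the steepest curve
    pulls[i] += 1
    obs = reward(i, pulls[i])
    gain[i] = obs - last[i]         # marginal improvement on this arm
    last[i] = obs
    total += obs

# Benchmark: the single best arm pulled for the entire budget (OPT).
best = max(sum(reward(i, t) for t in range(1, T + 1)) for i in range(k))
print(f"greedy total reward: {total:.2f}  (single best arm: {best:.2f})")
```

Because the curves are concave, the last-observed marginal gain never understates future gains, so the greedy rule naturally tries every arm once (the `inf` initialization) and then concentrates pulls where the curve is still steep; the worst-case lower bounds cited above show why such simple heuristics can nevertheless lose a factor polynomial in $k$ against the best single arm.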