AITopics | barycentric spanner

Learning in Observable POMDPs, without Computationally Intractable Oracles

Neural Information Processing SystemsApr-24-2026, 11:30:06 GMT

Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement. Specifically for learning near-optimal policies in Partially Observable Markov Decision Processes (POMDPs), existing algorithms either need to make strong assumptions about the model dynamics (e.g.

artificial intelligence, machine learning, pomdp, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre: Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Learning in Observable POMDPs, without Computationally Intractable Oracles

Neural Information Processing SystemsApr-24-2026, 11:30:03 GMT

Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement. Specifically for learning near-optimal policies in Partially Observable Markov Decision Processes (POMDPs), existing algorithms either need to make strong assumptions about the model dynamics (e.g.

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

f15cc4c30e8e5116e1e874e4895b74d0-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 15:57:25 GMT

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Add feedback

Efficient Model-Free Exploration in Low-Rank MDPs

Neural Information Processing SystemsFeb-17-2026, 06:54:43 GMT

What are the right computational primitives for exploration?

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Add feedback

b1bdb0f22c9748203c62f29aa297ac57-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 15:03:40 GMT

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Add feedback

b1bdb0f22c9748203c62f29aa297ac57-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 15:03:36 GMT

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Add feedback

565e8a413d0562de9ee4378402d2b481-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 11:55:21 GMT

algorithm, arxiv preprint arxiv, barycentric spanner, (12 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

099607cd970f4e1ac2fdd30624dffff8-Paper-Conference.pdf

Neural Information Processing SystemsFeb-7-2026, 09:17:03 GMT

algorithm, pomdp, reinforcement, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Efficient Swap Regret Minimization in Combinatorial Bandits

Kontogiannis, Andreas, Pollatos, Vasilis, Mertikopoulos, Panayotis, Panageas, Ioannis

arXiv.org Machine LearningFeb-3-2026

This paper addresses the problem of designing efficient no-swap regret algorithms for combinatorial bandits, where the number of actions $N$ is exponentially large in the dimensionality of the problem. In this setting, designing efficient no-swap regret translates to sublinear -- in horizon $T$ -- swap regret with polylogarithmic dependence on $N$. In contrast to the weaker notion of external regret minimization - a problem which is fairly well understood in the literature - achieving no-swap regret with a polylogarithmic dependence on $N$ has remained elusive in combinatorial bandits. Our paper resolves this challenge, by introducing a no-swap-regret learning algorithm with regret that scales polylogarithmically in $N$ and is tight for the class of combinatorial bandits. To ground our results, we also demonstrate how to implement the proposed algorithm efficiently -- that is, with a per-iteration complexity that also scales polylogarithmically in $N$ -- across a wide range of well-studied applications.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

2602.02087

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
North America > United States > California > Orange County > Irvine (0.04)
(3 more...)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Uniform Last-Iterate Guarantee for Bandits and Reinforcement Learning

Neural Information Processing SystemsOct-10-2025, 21:09:47 GMT

This paper introduces a stronger metric, uniform last-iterate (ULI) guarantee, capturing both cumulative and instantaneous performance of RL algorithms.

algorithm, probability, uli guarantee, (14 more...)

Neural Information Processing Systems

Country: