Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets
We present a family $\{\widehat{\pi}_p\}_{p\ge 1}$ of pessimistic learning rules for offline learning of linear contextual bandits, relying on confidence sets with respect to different $\ell_p$ norms, where $\widehat{\pi}_2$ corresponds to Bellman-consistent pessimism (BCP), while $\widehat{\pi}_\infty$ is a novel generalization of lower confidence bound (LCB) to the linear setting. We show that the novel $\widehat{\pi}_\infty$ learning rule is, in a sense, adaptively optimal, as it achieves the minimax performance (up to log factors) against all $\ell_q$-constrained problems, and as such it strictly dominates all other predictors in the family, including $\widehat{\pi}_2$.
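The common mechanism behind the family $\{\widehat{\pi}_p\}$ can be illustrated concretely. If the confidence set for the unknown parameter takes the (assumed) form $\{\theta : \|\Sigma^{1/2}(\theta - \widehat{\theta})\|_p \le \beta\}$ around a ridge estimate $\widehat{\theta}$, then the pessimistic value of a feature vector $\phi$ is $\phi^\top\widehat{\theta} - \beta\,\|\Sigma^{-1/2}\phi\|_q$ with $1/p + 1/q = 1$: for $p = 2$ this recovers the familiar elliptical LCB penalty, and $p = \infty$ gives an $\ell_1$ penalty on the whitened features. The sketch below is a minimal illustration under those assumptions; the function name, the choice of $\beta$, and the toy data are hypothetical, not the paper's construction.

```python
import numpy as np

def lcb_values(Phi, theta_hat, Sigma, beta, p=np.inf):
    """Pessimistic (lower-confidence) value of each row of Phi.

    Assumes a confidence set {theta : ||Sigma^{1/2}(theta - theta_hat)||_p <= beta},
    whose worst-case linear value is phi^T theta_hat - beta * ||Sigma^{-1/2} phi||_q,
    where q is the dual exponent, 1/p + 1/q = 1.
    """
    q = 1.0 if np.isinf(p) else (np.inf if p == 1 else p / (p - 1.0))
    # Sigma^{-1/2} via eigendecomposition (Sigma is symmetric positive definite here)
    w, V = np.linalg.eigh(Sigma)
    Sigma_inv_half = V @ np.diag(w ** -0.5) @ V.T
    white = Phi @ Sigma_inv_half          # whitened feature vectors
    penalty = np.linalg.norm(white, ord=q, axis=1)
    return Phi @ theta_hat - beta * penalty

# Toy usage: ridge fit on synthetic offline data, then act pessimistically.
rng = np.random.default_rng(0)
d, n = 3, 200
theta_star = np.array([1.0, -0.5, 0.25])      # hypothetical true parameter
X = rng.normal(size=(n, d))
y = X @ theta_star + 0.1 * rng.normal(size=n)
lam = 1.0
Sigma = X.T @ X + lam * np.eye(d)             # regularized empirical covariance
theta_hat = np.linalg.solve(Sigma, X.T @ y)   # ridge estimate
actions = np.eye(d)                           # three candidate feature vectors
vals = lcb_values(actions, theta_hat, Sigma, beta=1.0, p=np.inf)
best = int(np.argmax(vals))                   # pessimistic (LCB-style) choice
```

Note that only the penalty term changes with $p$: larger $q$ (smaller $p$) penalizes directions the offline data covers poorly in a different geometry, which is exactly the degree of freedom the family exploits.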