AITopics | exp3

Consider a player that in each of T rounds chooses one of K arms. An adversary chooses the cost of each arm in a bounded interval, and a sequence of feedback delays {d_t} that are unknown to the player. After picking arm a_t at round t, the player receives the cost of playing this arm d_t rounds later. In cases where t + d_t > T, this feedback is simply missing.
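The setting above can be illustrated with a minimal sketch of exponential-weights (EXP3-style) play under delayed feedback. This is not taken from the paper: the function name `exp3_delayed`, the fixed learning rate `eta`, the 0-indexed rounds, and the assumption that costs lie in [0, 1] are all illustrative choices. Feedback from round t is applied at the end of round t + d_t and is dropped when it would arrive after the horizon.

```python
import math
import random


def exp3_delayed(costs, delays, eta, seed=0):
    """Hypothetical sketch: EXP3-style play with delayed bandit feedback.

    costs[t][a] is the cost (assumed in [0, 1]) of arm a at round t (0-indexed).
    delays[t] is the delay d_t >= 0 for the feedback of round t.
    Returns the list of arms played.
    """
    rng = random.Random(seed)
    T, K = len(costs), len(costs[0])
    cum_est = [0.0] * K   # cumulative importance-weighted cost estimates
    arrivals = {}         # arrival round -> list of (arm, cost, play probability)
    played = []
    for t in range(T):
        # Exponential weights on estimated costs, shifted by the minimum for stability.
        low = min(cum_est)
        weights = [math.exp(-eta * (c - low)) for c in cum_est]
        total = sum(weights)
        probs = [w / total for w in weights]
        arm = rng.choices(range(K), weights=probs)[0]
        played.append(arm)
        # Schedule the feedback; it is simply missing if it lands past the horizon.
        arrive = t + delays[t]
        if arrive < T:
            arrivals.setdefault(arrive, []).append((arm, costs[t][arm], probs[arm]))
        # Apply all feedback arriving this round, weighted by the probability
        # with which the arm was played at the time it was chosen.
        for a, c, p in arrivals.pop(t, []):
            cum_est[a] += c / p
    return played
```

For example, `exp3_delayed([[0.0, 1.0]] * 2000, [5] * 2000, eta=0.05)` plays a two-armed instance in which every cost is observed five rounds late; how `eta` should depend on T, K, and the total delay is left open in this sketch.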

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > France (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Contextual Bandits with Cross-Learning

Santiago Balseiro, Negin Golrezaei, Mohammad Mahdian, Vahab Mirrokni, Jon Schneider

Neural Information Processing Systems, Feb-12-2026, 11:56:36 GMT

In the classical contextual bandits problem, in each round t, a learner observes some context c, chooses some action a to perform, and receives some reward r_{a,t}(c).
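The interaction protocol described above can be sketched as a short loop. Everything here is an illustrative assumption rather than the authors' setup: the names `run_contextual_bandit` and `uniform_policy` are hypothetical, contexts are drawn uniformly at random from a finite list, and the placeholder policy ignores what it has seen. The point is only that the learner observes the context, commits to an action, and then sees the reward of that action alone.

```python
import random


def run_contextual_bandit(policy, reward_fn, contexts, T, seed=0):
    """Hypothetical sketch of the contextual bandit protocol.

    policy(c, history, rng) returns an action; reward_fn(a, t, c) plays the
    role of r_{a,t}(c). Returns the list of (context, action, reward) triples.
    """
    rng = random.Random(seed)
    history = []
    for t in range(T):
        c = rng.choice(contexts)      # learner observes a context (assumed i.i.d. here)
        a = policy(c, history, rng)   # learner chooses an action
        r = reward_fn(a, t, c)        # only the chosen action's reward is revealed
        history.append((c, a, r))
    return history


def uniform_policy(num_actions):
    """Placeholder learner that picks an action uniformly at random."""
    def policy(context, history, rng):
        return rng.randrange(num_actions)
    return policy
```

A learning algorithm would replace `uniform_policy` with a policy that uses `history`, for instance by keeping separate statistics per context.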

artificial intelligence, data mining, machine learning, (20 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.77)
Information Technology > Game Theory (0.48)
Information Technology > Artificial Intelligence > Machine Learning (0.48)

Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL

Neural Information Processing Systems, Feb-9-2026, 14:46:07 GMT

artificial intelligence, international conference on machine learning, reinforcement learning, (11 more...)

Country:

Asia > Middle East > Jordan (0.05)
Oceania > Australia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

751d51528afe5e6f7fe95dece4ed32ba-Paper.pdf

Neural Information Processing Systems, Feb-8-2026, 23:16:21 GMT

algorithm, base algorithm, corral, (15 more...)

Country:

North America > Canada > Alberta (0.14)
North America > United States > Massachusetts (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.73)

4cea2358d3cc5f8cd32397ca9bc51b94-Supplemental.pdf

Neural Information Processing Systems, Feb-8-2026, 08:56:38 GMT

adversary mab, antfin, effective variance, (14 more...)

Country: North America > Canada (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

0887f1a5b9970ad13f46b8c1485f7900-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-7-2026, 09:36:36 GMT

depth-width trade-off, dynamical system, strength, (8 more...)

Genre: Research Report (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.34)

Supplement to "Model Selection in Contextual Stochastic Bandit Problems"

Neural Information Processing Systems, Oct-3-2025, 06:29:16 GMT

In Section D we present the proofs for Section 5.1. In Section H we show the proofs of the lower bounds in Section 6. We outline briefly some other direct applications of our results. CORRAL will achieve regret O(√(|L| dT)). B.1 Original Corral: The original Corral algorithm [2] is reproduced below. We reproduce the EXP3.P algorithm (Figure 3.1 in [...] [...]'s expected replay regret satisfies: [...] Therefore total regret is bounded by 6 U(T, ) log(T). D.2 Applications of Proposition 5.1: We now show that several algorithms are (U, , T) bounded: Lemma D.2.

algorithm, base algorithm, theorem 5, (15 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.41)

751d51528afe5e6f7fe95dece4ed32ba-Paper.pdf

Neural Information Processing Systems, Oct-3-2025, 06:29:09 GMT

algorithm, base algorithm, corral, (15 more...)

Country:

North America > Canada > Alberta (0.14)
North America > United States > Massachusetts (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.48)

Contextual Bandits with Cross-Learning

Santiago Balseiro, Negin Golrezaei, Mohammad Mahdian, Vahab Mirrokni, Jon Schneider

Neural Information Processing Systems, Oct-2-2025, 22:31:59 GMT

This is a special case of the multi-armed bandits with exogenous costs problem, and hence an instance of contextual-bandits with cross-learning.

artificial intelligence, data mining, machine learning, (21 more...)

Country: North America (0.28)

Industry: Information Technology > Services (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Game Theory (0.95)
Information Technology > Data Science > Data Mining > Big Data (0.78)

Supplementary materials for Paper "Bandit Samplers for Training Graph Neural Networks"

Neural Information Processing Systems, Oct-2-2025, 21:11:53 GMT

We show the convergences on validation in terms of timing (seconds) in Figure 1 and Figure 2. Basically, our algorithms converge to much better results in nearly the same duration compared with [...] Note that we cannot complete the training of AS-GAT on Reddit because of memory issues. Note that the comparisons of timing between "graph sampling" and "layer sampling" paradigms have [...] As a result, we do not compare the timing with "graph sampling" approaches. That is, graph sampling approaches are designed for graph data in which all vertices have labels. To summarize, the "layer sampling" approaches are more flexible and general compared with "graph sampling" [...] Before we give the proof of Theorem 1, we first prove the following Lemma 1 that will be used later.

artificial intelligence, machine learning, training graph neural network, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.41)

Online EXP3 Learning in Adversarial Bandits with Delayed Feedback
