Goto

Collaborating Authors

 epoch-greedy algorithm


The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information

Neural Information Processing Systems

Epoch-Greedy has the following properties: No knowledge of a time horizon T is necessary. The regret incurred by Epoch-Greedy is controlled by a sample complexity bound for a hypothesis class. Here S is the complexity term in a sample complexity bound for standard supervised learning.


The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information

Neural Information Processing Systems

Epoch-Greedy has the following properties: No knowledge of a time horizon $T$ is necessary. The regret incurred by Epoch-Greedy is controlled by a sample complexity bound for a hypothesis class. Here $S$ is the complexity term in a sample complexity bound for standard supervised learning. Papers published at the Neural Information Processing Systems Conference.


The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information

Neural Information Processing Systems

We present Epoch-Greedy, an algorithm for multi-armed bandits with observable side information. Epoch-Greedy has the following properties: No knowledge of a time horizon $T$ is necessary. The regret incurred by Epoch-Greedy is controlled by a sample complexity bound for a hypothesis class. The regret scales as $O(T^{2/3} S^{1/3})$ or better (sometimes, much better). Here $S$ is the complexity term in a sample complexity bound for standard supervised learning.