Efficient and Near-Optimal Algorithm for Contextual Dueling Bandits with Offline Regression Oracles
