schapire
Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination
On the technical side, we show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees, and we combine this observation with new refinements to the regression oracle reduction framework of Foster and Rakhlin [29].
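For readers unfamiliar with the two quantities named above, the following is a minimal sketch of their standard textbook definitions. It assumes the usual form of the triangular discrimination between discrete distributions, the sum over outcomes of (p_i - q_i)^2 / (p_i + q_i), and the binary logarithmic loss; the paper's own normalization and usage may differ. The function names are hypothetical and chosen only for illustration.

```python
import math


def triangular_discrimination(p, q):
    """Standard triangular discrimination between two discrete distributions.

    Assumed definition (not taken from the paper): sum of
    (p_i - q_i)^2 / (p_i + q_i), skipping outcomes where both are zero.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i + q_i > 0:
            total += (p_i - q_i) ** 2 / (p_i + q_i)
    return total


def log_loss(pred, outcome):
    """Binary logarithmic loss for a prediction in (0, 1) and an outcome in {0, 1}."""
    return -(outcome * math.log(pred) + (1 - outcome) * math.log(1 - pred))


# Identical distributions have zero discrimination.
same = triangular_discrimination([0.5, 0.5], [0.5, 0.5])

# Distributions with disjoint support attain the maximum value of 2.
disjoint = triangular_discrimination([1.0, 0.0], [0.0, 1.0])

# Log loss of an uninformative prediction equals log(2).
uninformed = log_loss(0.5, 1)
```

Because each squared difference is divided by p_i + q_i, the same absolute gap counts for more when both probabilities are small, which is one intuition for why this quantity appears in bounds that scale with the optimal loss rather than with the number of rounds.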