Improved Confidence Regions and Optimal Algorithms for Online and Offline Linear MNL Bandits

Neural Information Processing Systems 

In this work, we consider the data-driven assortment optimization problem under the linear multinomial logit (MNL) choice model. We first establish an improved confidence region for the maximum likelihood estimator (MLE) of the d-dimensional linear MNL likelihood function. The new region removes the explicit dependency on the problem-dependent parameter κ^{-1} that appears in the previous result [42] and scales exponentially with the radius of the parameter set. Building on this confidence region, we investigate the data-driven assortment optimization problem in both offline and online settings.
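
For readers unfamiliar with the linear MNL choice model, the following is a minimal sketch of how choice probabilities and expected assortment revenue are typically computed. It assumes the common convention of a no-purchase option with utility normalized to zero, which the abstract does not state. The function names `mnl_choice_probabilities` and `expected_revenue`, and the `revenues` vector, are hypothetical and used only for illustration; this is not the paper's algorithm.

```python
import numpy as np

def mnl_choice_probabilities(features, theta):
    """Choice probabilities under a linear MNL model (illustrative sketch).

    features: (K, d) array, one row of d features per offered item.
    theta:    (d,) parameter vector.
    Returns a (K + 1,) array; index 0 is the no-purchase option
    (assumed utility 0), indices 1..K are the offered items.
    """
    utilities = features @ theta
    # Prepend the assumed zero utility of the no-purchase option.
    u = np.concatenate(([0.0], utilities))
    # Subtract the maximum before exponentiating for numerical stability.
    u = u - u.max()
    weights = np.exp(u)
    return weights / weights.sum()

def expected_revenue(features, theta, revenues):
    """Expected revenue of an assortment; no-purchase earns nothing."""
    probs = mnl_choice_probabilities(features, theta)
    return float(probs[1:] @ np.asarray(revenues, dtype=float))
```

With `theta` set to the zero vector, every option, including no-purchase, has the same utility, so each of the K + 1 options is chosen with probability 1 / (K + 1).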
