Improved Confidence Regions and Optimal Algorithms for Online and Offline Linear MNL Bandits
Neural Information Processing Systems
In this work, we consider the data-driven assortment optimization problem under the linear multinomial logit (MNL) choice model. We first establish an improved confidence region for the maximum likelihood estimator (MLE) of the d-dimensional linear MNL likelihood function. This region removes the explicit dependency on κ^{-1}, a problem-dependent parameter that appears in the previous result [42] and scales exponentially with the radius of the parameter set. Building on this confidence region, we investigate the data-driven assortment optimization problem in both offline and online settings.
Jun-15-2026, 02:43:06 GMT