Nearly Minimax Optimal Regret for Multinomial Logistic Bandit
