Regret minimization in Linear Bandits with offline data via extended D-optimal exploration
