PAC Learning with Bandit Feedback: Sharp Sample Complexity in the Realizable Setting
