Is O(log N) practical? Near-Equivalence Between Delay Robustness and Bounded Regret in Bandits and RL

Open in new window