Tight Sample Complexity Bounds for Entropic Best Policy Identification
