A Provably Efficient Sample Collection Strategy for Reinforcement Learning