Supplement for Counterexample Guided RL Policy Refinement Using Bayesian Optimization
Neural Information Processing Systems
We ran each of these methods for 20 iterations, with 200 test samples per iteration. We report the mean and standard deviation of the number of counterexamples discovered per iteration.
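As a rough illustration of how such a summary could be computed, the sketch below counts counterexamples in each of 20 iterations of 200 samples and then takes the mean and standard deviation of those counts. Only the two constants come from the text. The sampler, the `is_counterexample` check, and its 0.05 threshold are hypothetical placeholders, and the use of the sample (n-1) standard deviation is an assumption, since the text does not say which estimator was used. The printed numbers are synthetic and do not reproduce any reported result.

```python
import random
import statistics

N_ITERATIONS = 20  # from the text: 20 iterations per method
N_SAMPLES = 200    # from the text: 200 test samples per iteration


def is_counterexample(sample: float) -> bool:
    """Hypothetical stand-in for a real check of the policy against its specification."""
    return sample < 0.05  # arbitrary threshold, for illustration only


def count_counterexamples(rng: random.Random, n_samples: int = N_SAMPLES) -> int:
    """Draw n_samples synthetic test inputs and count how many fail the check."""
    return sum(is_counterexample(rng.random()) for _ in range(n_samples))


def summarize(counts):
    """Return (mean, sample standard deviation) of the per-iteration counts."""
    return statistics.mean(counts), statistics.stdev(counts)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
counts = [count_counterexamples(rng) for _ in range(N_ITERATIONS)]
mean_count, std_count = summarize(counts)
print(f"counterexamples per iteration: {mean_count:.2f} +/- {std_count:.2f}")
```

If the population standard deviation is wanted instead, `statistics.pstdev` can replace `statistics.stdev` in `summarize`.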
Mar-21-2025, 13:49:51 GMT