Learning to Reject Low-Quality Explanations via User Feedback