Filtering instances and rejecting predictions to obtain reliable models in healthcare
Valeriano, Maria Gabriela, Marzagão, David Kohan, Montelongo, Alfredo, Kiffer, Carlos Roberto Veiga, Katz, Natan, Lorena, Ana Carolina
arXiv.org Artificial Intelligence
Machine Learning (ML) models are widely used in high-stakes domains such as healthcare, where the reliability of predictions is critical. However, these models often fail to account for uncertainty and return a prediction even when their confidence is low. This work proposes a novel two-step data-centric approach that enhances the performance of ML models by improving data quality and filtering out low-confidence predictions. The first step uses Instance Hardness (IH) to filter problematic instances during training, thereby refining the dataset. The second step introduces a confidence-based rejection mechanism during inference, so that only reliable predictions are retained. We evaluate the approach on three real-world healthcare datasets, demonstrating its effectiveness at improving model reliability while balancing predictive performance and rejection rate. As baselines for assessing the efficiency of the proposed method, we also use alternative criteria: influence values for filtering and uncertainty for rejection. The results show that integrating IH filtering with confidence-based rejection enhances model performance while preserving a large proportion of instances. This approach provides a practical method for deploying ML systems in safety-critical applications.
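The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes k-Disagreeing Neighbors (kDN) as the hardness measure and a k-nearest-neighbour vote fraction as the confidence score; both are stand-ins chosen so the example needs only the Python standard library. The function names, the toy data, and the thresholds (`ih_threshold=0.5`, `min_conf=0.8`) are hypothetical.

```python
import math


def knn_indices(X, query, k, exclude=None):
    """Indices of the k points in X closest to `query` (Euclidean distance)."""
    dists = sorted((math.dist(x, query), i) for i, x in enumerate(X) if i != exclude)
    return [i for _, i in dists[:k]]


def kdn_hardness(X, y, k=3):
    """kDN: fraction of each instance's k nearest neighbours with a different label."""
    scores = []
    for i, (x, label) in enumerate(zip(X, y)):
        nbrs = knn_indices(X, x, k, exclude=i)
        scores.append(sum(y[j] != label for j in nbrs) / k)
    return scores


def filter_hard_instances(X, y, ih_threshold=0.5, k=3):
    """Step 1 (training time): drop instances whose hardness exceeds the threshold."""
    ih = kdn_hardness(X, y, k)
    keep = [i for i, h in enumerate(ih) if h <= ih_threshold]
    return [X[i] for i in keep], [y[i] for i in keep]


def predict_with_rejection(X_train, y_train, query, k=3, min_conf=0.8):
    """Step 2 (inference time): return (label, confidence), with label None if rejected."""
    votes = {}
    for j in knn_indices(X_train, query, k):
        votes[y_train[j]] = votes.get(y_train[j], 0) + 1
    label, count = max(votes.items(), key=lambda kv: kv[1])
    conf = count / k
    return (label if conf >= min_conf else None), conf


# Toy data: two well-separated clusters plus one instance with a suspicious label.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
     (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5),
     (0.5, 0.4)]                      # sits in the class-0 cluster ...
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1,
     1]                               # ... but is labelled 1

X_clean, y_clean = filter_hard_instances(X, y)

confident = predict_with_rejection(X_clean, y_clean, (0.2, 0.2))
ambiguous = predict_with_rejection(X_clean, y_clean, (3.0, 3.0))

print(len(X), "->", len(X_clean), "training instances after filtering")
print("query (0.2, 0.2):", confident)   # accepted as class 0
print("query (3.0, 3.0):", ambiguous)   # rejected: neighbours disagree
```

In this toy setting the mislabelled point is removed before training, and the query halfway between the clusters is rejected rather than forced into a class. Raising `min_conf` rejects more predictions and lowering `ih_threshold` filters more training data, which is the trade-off between predictive performance and rejection rate that the abstract refers to.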
Oct-29-2025
- Country:
- South America > Brazil (0.46)
- Genre:
- Workflow (1.00)
- Research Report
- New Finding (1.00)
- Experimental Study (1.00)