Clinical Uncertainty Impacts Machine Learning Evaluations