Aligning Evaluation with Clinical Priorities: Calibration, Label Shift, and Error Costs