Estimating Model Performance under Domain Shifts with Class-Specific Confidence Scores