TCE: A Test-Based Approach to Measuring Calibration Error