A calibration test for evaluating set-based epistemic uncertainty representations