Validation of ML-UQ calibration statistics using simulated reference values: a sensitivity analysis