Beyond Average Performance -- exploring regions of deviating performance for black box classification models