Learning Concept Credible Models for Mitigating Shortcuts

Neural Information Processing Systems 

During training, models can exploit spurious correlations as shortcuts, resulting in poor generalization performance when shortcuts do not persist. In this work, assuming access to a representation based on domain knowledge (i.e., known