Don't blame Dataset Shift! Shortcut Learning due to Gradients and Cross Entropy
