Explaining the Impact of Training on Vision Models via Activation Clustering