Towards aligned body representations in vision models