Unlocking the Multi-modal Potential of CLIP for Generalized Category Discovery