Robustness in Both Domains: CLIP Needs a Robust Text Encoder