From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions