CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features