Enhancing Modality Representation and Alignment for Multimodal Cold-start Active Learning