Integrating Chain-of-Thought for Multimodal Alignment: A Study on 3D Vision-Language Learning
