CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations
