Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement
