Enhancing Large Vision Language Models with Self-Training on Image Comprehension

Yihe Deng¹, Pan Lu¹,³, Fan Yin

Neural Information Processing Systems 

Improving this capability requires high-quality vision-language data, which is costly and labor-intensive to acquire.
