Unified Generative and Discriminative Training for Multi-modal Large Language Models
Wei Chow

Neural Information Processing Systems 

In recent times, Vision-Language Models (VLMs) have been trained under two predominant paradigms.
