Three Towers: Flexible Contrastive Learning with Pretrained Image Models

Neural Information Processing Systems 

LiT directly replaces the image tower with the frozen embeddings, excluding any potential benefits from training the image tower contrastively.
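
To make the contrast concrete, here is a minimal NumPy sketch of a LiT-style setup, not the authors' implementation. It assumes a standard symmetric contrastive (InfoNCE) loss; the function names, the temperature of 0.07, and the random arrays standing in for embeddings are all hypothetical. The point it illustrates is that the image-side embeddings are precomputed and held fixed, so only the text side could receive gradient updates.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Symmetric InfoNCE: matching image/text pairs sit on the diagonal.
    # temperature=0.07 is an illustrative assumption, not a value from the paper.
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature
    idx = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # image -> text
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)

rng = np.random.default_rng(0)
# Hypothetical stand-in for embeddings precomputed by a pretrained image model.
# In a LiT-style setup these are frozen: they are inputs, never parameters.
frozen_image_emb = rng.normal(size=(4, 8))
# Hypothetical stand-in for the trainable text tower's outputs.
text_emb = rng.normal(size=(4, 8))

loss = contrastive_loss(frozen_image_emb, text_emb)
print(float(loss))
```

Because the image embeddings never change in this sketch, any improvement in the loss has to come from moving the text embeddings toward them, which is the limitation the sentence above describes.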
