GD: Multi-Modal Open-World Counting

Neural Information Processing Systems 

GD is comparable to or outperforms all previous text-only works, and when using both text and visual exemplars, we outperform all previous models; third, we carry out a preliminary study into different interactions between the text and visual exemplar prompts, including the cases where they reinforce each other and where one restricts the other.
