Chatting Makes Perfect: Chat-based Image Retrieval Supplementary Material

Neural Information Processing Systems 

In Appendix A, we show additional qualitative results of chats and their retrieval results, and compare BLIP2 chats with those of a human answerer. Next, in Appendix B, we present the few-shot instructional prompts used by the different LLMs to generate follow-up questions. Another example, in Figure 2, describes two trains, searched for with the text "A train that is parked next to another train". Figure 3 demonstrates a case where the description "a small and dirty kitchen with pots and food everywhere" is ambiguous, depends on the viewer's subjective judgment, and may match many images in the corpus. In Figure 4 we show an example of a dialog between ChatIR and a human.
