AVIS: Autonomous Visual Information Seeking with Large Language Model Agent

Neural Information Processing Systems 

In this paper, we propose an autonomous information-seeking visual question answering framework, AVIS. Our method leverages a Large Language Model (LLM)
