Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering (Appendix)
Neural Information Processing Systems
We chose the Google Search corpus [Luo et al., 2021] for our question-answering system because it is publicly available and provides good coverage of the required knowledge. However, as the authors of RA-VQA note, additional knowledge bases may be required to answer some questions correctly. Future work may address this limitation by improving the quality and expanding the coverage of the available knowledge. We do not perceive any immediate ethical concerns associated with the misuse of our proposed system. Nevertheless, the trained KB-VQA system might generate inappropriate or biased content as a result of biases in the data used for LLM and LMM pre-training and fine-tuning.