Modality-Aware Integration with Large Language Models for Knowledge-based Visual Question Answering

Open in new window