Audio Captioning RAG via Generative Pair-to-Pair Retrieval with Refined Knowledge Base
