FSMR: A Feature Swapping Multi-modal Reasoning Approach with Joint Textual and Visual Clues
