Gaining Extra Supervision via Multi-task learning for Multi-Modal Video Question Answering
