Learning to Describe Video with Weak Supervision by Exploiting Negative Sentential Information
