Towards Generating Diverse Audio Captions via Adversarial Training
