Reviews: Training Language GANs from Scratch
I've raised my score accordingly, but I still think that there need to be more solid results. In particular, while the rebuttal notes that ScratchGAN can almost match the MLE baseline, I am not sure how strong the MLE baseline itself is. Based on sample quality, I suspect that the MLE baseline itself is quite weak and does not use more modern LM approaches (e.g., …). Of course, I am not saying that the authors deliberately used weak baselines, but it would be helpful to compare against stronger MLE baselines too.
Weaknesses:
- The main weakness is empirical: ScratchGAN appreciably underperforms an MLE model in terms of LM score and reverse LM score.
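The LM score and reverse LM score the reviewer mentions can be illustrated with a toy sketch: fit a language model on real text and measure the per-token negative log-likelihood of generated samples (LM score, a quality proxy), then swap the roles (reverse LM score, a diversity/coverage proxy). The unigram model and Laplace smoothing below are illustrative simplifications; the actual evaluation uses trained neural LMs.

```python
import math
from collections import Counter

def unigram_nll(train_sents, eval_sents, alpha=1.0):
    """Per-token negative log-likelihood of eval_sents under a
    Laplace-smoothed unigram LM fit on train_sents."""
    counts = Counter(w for s in train_sents for w in s.split())
    # Include eval vocabulary so unseen words get smoothed (nonzero) probability.
    vocab = set(counts) | {w for s in eval_sents for w in s.split()}
    total = sum(counts.values())
    denom = total + alpha * len(vocab)
    nll, n_tokens = 0.0, 0
    for s in eval_sents:
        for w in s.split():
            p = (counts.get(w, 0) + alpha) / denom
            nll -= math.log(p)
            n_tokens += 1
    return nll / n_tokens

real = ["the cat sat", "the dog ran", "a cat ran"]   # stands in for data
fake = ["the cat ran", "the the the"]                # stands in for samples

lm_score = unigram_nll(real, fake)    # quality: data LM scores the samples
reverse_lm = unigram_nll(fake, real)  # coverage: sample LM scores the data
```

Lower is better for both; a model that copies the data gets a good LM score but, if it collapses to few modes, a poor reverse LM score.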
Towards Generating Diverse Audio Captions via Adversarial Training
Mei, Xinhao, Liu, Xubo, Sun, Jianyuan, Plumbley, Mark D., Wang, Wenwu
Automated audio captioning is a cross-modal translation task for describing the content of audio clips with natural language sentences. This task has attracted increasing attention and substantial progress has been made in recent years. Captions generated by existing models are generally faithful to the content of audio clips; however, these machine-generated captions are often deterministic (e.g., generating a fixed caption for a given audio clip), simple (e.g., using common words and simple grammar), and generic (e.g., generating the same caption for similar audio clips). When people are asked to describe the content of an audio clip, different people tend to focus on different sound events and describe the clip diversely, from various aspects, using distinct words and grammar. We believe that an audio captioning system should have the ability to generate diverse captions, either for a fixed audio clip or across similar audio clips. To this end, we propose an adversarial training framework based on a conditional generative adversarial network (C-GAN) to improve the diversity of audio captioning systems. A caption generator and two hybrid discriminators compete and are learned jointly, where the caption generator can be any standard encoder-decoder captioning model used to generate captions, and the hybrid discriminators assess the generated captions against different criteria, such as their naturalness and semantics. We conduct experiments on the Clotho dataset. The results show that our proposed model can generate captions with better diversity as compared to state-of-the-art methods.
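Caption diversity of the kind this abstract targets is commonly quantified with distinct-n: the ratio of unique n-grams to total n-grams across a set of generated captions. A minimal sketch (the paper may report different diversity metrics; this one is illustrative):

```python
def distinct_n(captions, n=1):
    """Ratio of unique n-grams to total n-grams across all captions.
    Higher values mean less repetitive, more diverse output."""
    grams = []
    for c in captions:
        toks = c.split()
        grams += [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return len(set(grams)) / len(grams) if grams else 0.0

# A deterministic captioner repeats one caption; a diverse one varies wording.
fixed = ["a dog barks"] * 3
varied = ["a dog barks", "a dog is barking loudly", "barking of a dog outside"]
```

Comparing `distinct_n(varied, 1)` against `distinct_n(fixed, 1)` makes the deterministic-versus-diverse contrast in the abstract concrete.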
Improving Factual Consistency of Abstractive Summarization via Question Answering
Nan, Feng, Santos, Cicero Nogueira dos, Zhu, Henghui, Ng, Patrick, McKeown, Kathleen, Nallapati, Ramesh, Zhang, Dejiao, Wang, Zhiguo, Arnold, Andrew O., Xiang, Bing
A commonly observed problem with state-of-the-art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents. The fact that automatic summarization may produce plausible-sounding yet inaccurate summaries is a major concern that limits its wide application. In this paper, we present an approach to address factual consistency in summarization. We first propose an efficient automatic evaluation metric to measure factual consistency; next, we propose a novel learning algorithm that maximizes the proposed metric during model training. Through extensive experiments, we confirm that our method is effective in improving factual consistency and even the overall quality of the summaries, as judged by both automatic metrics and human evaluation.
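The metric described here asks questions derived from the summary and checks whether a QA model can answer them consistently from the source document. As a rough stand-in that needs no QA model, one can measure what fraction of a summary's content words are supported by the document; the stopword list and the overlap criterion below are illustrative assumptions, not the paper's method:

```python
def consistency_proxy(document, summary):
    """Toy stand-in for QA-based factual consistency: the fraction of the
    summary's content words that appear in the source document.
    (The actual metric uses question generation + question answering.)"""
    stop = {"the", "a", "an", "is", "are", "was", "were",
            "of", "in", "on", "to", "and", "for", "with"}
    doc_words = set(document.lower().split())
    content = [w for w in summary.lower().split() if w not in stop]
    if not content:
        return 1.0
    return sum(w in doc_words for w in content) / len(content)

doc = "the mayor of paris opened the new metro line on monday"
faithful = "mayor opened new metro line"
hallucinated = "mayor closed old tram line"
```

A faithful summary scores higher than a hallucinated one, which is the property the learned metric is trained to reward.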