Unsupervised Text Generation by Learning from Search
In this work, we propose TGLS, a novel framework for unsupervised Text Generation by Learning from Search. We start by applying a strong search algorithm (in particular, simulated annealing) towards a heuristically defined objective that (roughly) estimates the quality of sentences. Then, a conditional generative model learns from the search results and meanwhile smooths out the noise of search. The alternation between search and learning can be repeated for performance bootstrapping. We demonstrate the effectiveness of TGLS on two real-world natural language generation tasks, unsupervised paraphrasing and text formalization. Our model significantly outperforms unsupervised baseline methods on both tasks. In particular, it achieves performance comparable to strong supervised methods for paraphrase generation.
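As a rough illustration of the search step described above, the sketch below runs simulated annealing over word-level edits of an input sentence against a toy objective. The scoring function, edit proposals, vocabulary, and temperature schedule are all placeholder assumptions made for this sketch; they are not the fluency or semantic scorers, nor the edit operators, actually used in TGLS.

```python
import math
import random

def toy_score(candidate, source):
    """Toy stand-in for the heuristic objective (in the paper, a mix of
    fluency, semantic preservation, and expression-diversity terms)."""
    src, cand = set(source.split()), set(candidate.split())
    overlap = len(src & cand) / max(len(src), 1)            # crude "meaning kept" term
    diversity = 1.0 - len(src & cand) / max(len(cand), 1)   # reward rewording
    length_penalty = math.exp(-abs(len(cand) - len(src)) / 5)
    return 0.6 * overlap + 0.3 * diversity + 0.1 * length_penalty

def propose(words, vocab):
    """Randomly replace, delete, or insert one word (placeholder edit operators)."""
    words = list(words)
    op = random.choice(["replace", "delete", "insert"])
    i = random.randrange(len(words)) if words else 0
    if op == "replace" and words:
        words[i] = random.choice(vocab)
    elif op == "delete" and len(words) > 1:
        del words[i]
    else:
        words.insert(i, random.choice(vocab))
    return words

def simulated_annealing(source, vocab, steps=200, t0=1.0, decay=0.98):
    """Hill-climb with annealed acceptance of worse candidates."""
    cur = source.split()
    cur_score = toy_score(" ".join(cur), source)
    best, best_score = cur, cur_score
    t = t0
    for _ in range(steps):
        cand = propose(cur, vocab)
        cand_score = toy_score(" ".join(cand), source)
        # Accept improvements outright; accept worse edits with probability
        # exp(delta / temperature), which shrinks as the temperature decays.
        if cand_score > cur_score or random.random() < math.exp((cand_score - cur_score) / t):
            cur, cur_score = cand, cand_score
            if cur_score > best_score:
                best, best_score = cur, cur_score
        t *= decay
    return " ".join(best), best_score

vocab = ["could", "you", "please", "tell", "me", "how", "to", "get", "there"]
print(simulated_annealing("how do i get there", vocab))
```

In the actual framework, the best sentences found this way serve as pseudo-references for training the conditional generator in the learning step.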
Review for NeurIPS paper: Unsupervised Text Generation by Learning from Search
Weaknesses: (1) This paper claims the method is a "generic unsupervised text generation framework", but it only evaluates on relatively easy tasks like paraphrase generation and formality style transfer, where many words are shared between the input and output, while other very standard and representative sequence generation tasks like machine translation and summarization are left out. The related work section also only discusses literature on these two tasks. Thus I feel the paper (especially the intro) is a bit over-claiming -- the paper needs to either include other standard tasks to make the current claim valid, or downplay the claim and discuss the limitations of the proposed method in a separate (sub)section. For example, I would expect it to be much more difficult for SA to explore the output space for machine translation, summarization, dialogue, etc., which means the proposed method may not be "a generic framework". I think paraphrasing and formality transfer are very special cases where SA can work well, so I see many limitations of the proposed method unless the authors provide sufficient experimental evidence.
Review for NeurIPS paper: Unsupervised Text Generation by Learning from Search
The reviewers overall found the method here interesting, but there were a few concerns: 1. Most importantly, the description in the paper seems to overclaim the breadth of the results. For example, given the relatively limited scope of the experiments, it seems strange to call this "Unsupervised Text Generation"; at best it's probably "Unsupervised Paraphrasing", "Unsupervised Style Transfer", or, at a stretch, "Unsupervised Conditional Text Generation". I would encourage the authors to scope the claims appropriately. 2. Significant additional complexity compared to other methods. The method is relatively complicated, and this may limit its applicability to some extent.
Search and Learning for Unsupervised Text Generation
With the advances of deep learning techniques, text generation is attracting increasing interest in the artificial intelligence (AI) community, because of its wide applications and because it is an essential component of AI. Traditional text generation systems are trained in a supervised way, requiring massive labeled parallel corpora. In this paper, I will introduce our recent work on search-and-learning approaches to unsupervised text generation, where a heuristic objective function estimates the quality of a candidate sentence and discrete search algorithms generate a sentence by maximizing this objective. A machine learning model further learns from the search results to smooth out noise and improve efficiency. Our approach matters to industry for building minimum viable products for a new task; it also has high social impact by saving human annotation labor and by supporting the processing of low-resource languages.
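To make the notion of a heuristic objective concrete, a common design in this line of work combines several component scores, such as language-model fluency, semantic similarity to the input, and a task-specific term, into a single scalar. The sketch below only shows such a composition; the component scorers, weights, and example sentences are hypothetical placeholders rather than the exact objective used in our papers.

```python
def heuristic_objective(candidate, source,
                        fluency_fn, similarity_fn, task_fn,
                        alpha=1.0, beta=1.0, gamma=1.0):
    """Hypothetical search objective: a weighted product of component scores,
    each assumed to return a value in (0, 1]."""
    return (fluency_fn(candidate) ** alpha
            * similarity_fn(candidate, source) ** beta
            * task_fn(candidate) ** gamma)

# Example with trivial placeholder scorers; a real system would plug in a
# language model, sentence embeddings, and e.g. a formality classifier.
score = heuristic_objective(
    "could you tell me how to get there",
    "how do i get there",
    fluency_fn=lambda c: 0.9,
    similarity_fn=lambda c, s: 0.8,
    task_fn=lambda c: 0.7,
)
print(score)
```

A weighted product (rather than a sum) has the convenient property that a candidate failing badly on any single component receives a low overall score.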
Search and Learning for Unsupervised Text Generation (New Faculty Highlights Extended Abstract)
The following article is an extended abstract submitted as part of AAAI's New Faculty Highlights Program. With the advances of deep learning techniques, text generation is attracting increasing interest in the artificial intelligence (AI) community, because of its wide applications and because it is an essential component of AI. Traditional text generation systems are trained in a supervised way, requiring massive labeled parallel corpora. In this paper, I will introduce our recent work on search and learning approaches to unsupervised text generation, where a heuristic objective function estimates the quality of a candidate sentence, and discrete search algorithms generate a sentence by maximizing the search objective. A machine learning model further learns from the search results to smooth out noise and improve efficiency.
Unsupervised Text Generation by Learning from Search
Jingjing Li, Zichao Li, Lili Mou, Xin Jiang, Michael R. Lyu, Irwin King
In this work, we present TGLS, a novel framework for unsupervised Text Generation by Learning from Search. We start by applying a strong search algorithm (in particular, simulated annealing) towards a heuristically defined objective that (roughly) estimates the quality of sentences. Then, a conditional generative model learns from the search results and meanwhile smooths out the noise of search. The alternation between search and learning can be repeated for performance bootstrapping. We demonstrate the effectiveness of TGLS on two real-world natural language generation tasks, paraphrase generation and text formalization. Our model significantly outperforms unsupervised baseline methods in both tasks. In particular, it achieves performance comparable to the state-of-the-art supervised methods in paraphrase generation.
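The learning half of the search-and-learning loop can be illustrated, very loosely, as fitting a conditional model on pseudo-parallel pairs whose targets are the sentences found by search. The degenerate nearest-neighbor "model" below is only a stand-in for the GPT-2/seq2seq generators the paper actually fine-tunes on such pairs, and the example pairs are invented.

```python
def fit_toy_model(pseudo_pairs):
    """Toy stand-in for 'learning from search': memorize the
    (input, search_output) pairs. The real framework instead fine-tunes
    an autoregressive generator on these pseudo-parallel pairs."""
    return dict(pseudo_pairs)

def generate(memory, source):
    """Answer with the output stored for the most word-overlapping known input."""
    def overlap(a, b):
        return len(set(a.split()) & set(b.split()))
    nearest = max(memory, key=lambda known: overlap(known, source))
    return memory[nearest]

# Hypothetical pseudo-parallel pairs produced by the search step (invented).
pairs = [
    ("how do i get there", "could you tell me the way there"),
    ("give me the report", "could you please send me the report"),
]
model = fit_toy_model(pairs)
print(generate(model, "how do i get there"))
# -> could you tell me the way there
```

A trained generator then produces an output in a single decoding pass, which is what gives the efficiency gain over re-running search at test time.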
Unsupervised Text Generation from Structured Data
Martin Schmitt, Hinrich Schütze
This work presents a joint solution to two challenging tasks: text generation from data and open information extraction. We propose to model both tasks as sequence-to-sequence translation problems and thus construct a joint neural model for both. Our experiments on knowledge graphs from Visual Genome, i.e., structured image analyses, show promising results compared to strong baselines. Building on recent work on unsupervised machine translation, we report the first results - to the best of our knowledge - on fully unsupervised text generation from structured data.
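For intuition on casting data-to-text generation as a sequence-to-sequence translation problem, a knowledge graph can first be linearized into a flat token sequence that a translation-style model could consume. The marker scheme and the example triples below are assumptions made for this sketch, not the serialization used by the authors.

```python
def linearize_graph(triples):
    """Serialize (subject, relation, object) triples into a flat token
    sequence suitable as seq2seq input. The <S>/<R>/<O> markers are an
    assumed convention for this sketch."""
    parts = []
    for subj, rel, obj in triples:
        parts.extend(["<S>", subj, "<R>", rel, "<O>", obj])
    return " ".join(parts)

# A small Visual-Genome-style scene description (invented example).
triples = [
    ("man", "rides", "bicycle"),
    ("bicycle", "has", "red frame"),
]
print(linearize_graph(triples))
# <S> man <R> rides <O> bicycle <S> bicycle <R> has <O> red frame
```

The reverse direction, recovering triples from free text, corresponds to the open information extraction side of the joint model.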