IBM Research India
A Deep Generative Framework for Paraphrase Generation
Gupta, Ankush (IBM Research India) | Agarwal, Arvind (IBM Research India) | Singh, Prawaan (Indian Institute of Technology, Kanpur) | Rai, Piyush (Indian Institute of Technology, Kanpur)
Paraphrase generation is an important problem in NLP, with applications in question answering, information retrieval, information extraction, and conversation systems, to name a few. In this paper, we address the problem of generating paraphrases automatically. Our proposed method combines deep generative models (VAE) with sequence-to-sequence models (LSTM) to generate paraphrases, given an input sentence. Traditional VAEs, when combined with recurrent neural networks, can generate free text, but they are not suitable for generating paraphrases of a given sentence. We address this problem by conditioning both the encoder and decoder sides of the VAE on the original sentence, so that it can generate the given sentence's paraphrases. Unlike most existing models, our model is simple, modular, and can generate multiple paraphrases for a given sentence. Quantitative evaluation of the proposed method on a benchmark paraphrase dataset demonstrates its efficacy and its performance improvement over state-of-the-art methods by a significant margin, whereas qualitative human evaluation indicates that the generated paraphrases are well-formed, grammatically correct, and relevant to the input sentence. Furthermore, we evaluate our method on a newly released question paraphrase dataset and establish a new baseline for future research.
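The conditioning idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: short numeric vectors stand in for LSTM sentence encodings, the functions `toy_encoder` and `toy_decoder` are hypothetical placeholders, and a squared-error term replaces the token-level reconstruction likelihood a real sequence decoder would use. What it does show is the structure described: the encoder sees both the original sentence and its paraphrase, the decoder sees the latent code together with the original sentence, and multiple paraphrases come from drawing several latent samples for the same input.

```python
import math
import random


def gaussian_kl(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps, with eps ~ N(0, 1) per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def toy_encoder(original_vec, paraphrase_vec):
    """Hypothetical stand-in for the encoder: the posterior over z
    depends on BOTH the original sentence and its paraphrase."""
    mu = [0.5 * (p - o) for o, p in zip(original_vec, paraphrase_vec)]
    logvar = [-abs(p - o) for o, p in zip(original_vec, paraphrase_vec)]
    return mu, logvar


def toy_decoder(z, original_vec):
    """Hypothetical stand-in for the decoder: the output depends on the
    latent code z AND the original sentence."""
    return [o + zi for o, zi in zip(original_vec, z)]


def negative_elbo(original_vec, paraphrase_vec, rng):
    """Reconstruction error plus KL term for one (original, paraphrase) pair.
    Squared error is an assumption made to keep the sketch short."""
    mu, logvar = toy_encoder(original_vec, paraphrase_vec)
    z = reparameterize(mu, logvar, rng)
    recon = toy_decoder(z, original_vec)
    recon_loss = sum((r - p) ** 2 for r, p in zip(recon, paraphrase_vec))
    return recon_loss + gaussian_kl(mu, logvar)


def generate(original_vec, rng, n=3):
    """At generation time no paraphrase is available, so z is drawn from
    the prior N(0, I) and decoded together with the original sentence."""
    dim = len(original_vec)
    return [
        toy_decoder([rng.gauss(0.0, 1.0) for _ in range(dim)], original_vec)
        for _ in range(n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    original = [0.2, -1.0, 0.7]
    paraphrase = [0.3, -0.8, 0.5]
    print("loss:", negative_elbo(original, paraphrase, rng))
    for candidate in generate(original, rng, n=3):
        print("candidate:", candidate)
```

Each call to `generate` draws fresh latent samples, so one input yields several distinct candidate outputs, which is the property the abstract highlights.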
Parameter Learning for Latent Network Diffusion
Wu, Xiaojian (University of Massachusetts Amherst) | Kumar, Akshat (IBM Research India) | Sheldon, Daniel (University of Massachusetts Amherst) | Zilberstein, Shlomo (University of Massachusetts Amherst)
Diffusion processes in networks are increasingly used to model dynamic phenomena such as the spread of information, wildlife, or social influence. Our work addresses the problem of learning the underlying parameters that govern such a diffusion process by observing the time at which nodes become active. A key advantage of our approach is that, unlike previous work, it can tolerate missing observations for some nodes in the diffusion process. Having incomplete observations is characteristic of offline networks used to model the spread of wildlife. We develop an EM algorithm to address parameter learning in such settings. Since both the E and M steps are computationally challenging, we employ a number of optimization methods such as nonlinear and difference-of-convex programming to address these challenges. Evaluation of the approach on the Red-cockaded Woodpecker conservation problem shows that it is highly robust and accurately learns parameters in various settings, even with more than 80% missing data.