Language as a Latent Sequence: deep latent variable models for semi-supervised paraphrase generation
Yu, Jialin, Cristea, Alexandra I., Harit, Anoushka, Sun, Zhongtian, Aduragba, Olanrewaju Tahir, Shi, Lei, Moubayed, Noura Al
–arXiv.org Artificial Intelligence
This paper explores deep latent variable models for semi-supervised paraphrase generation, where the missing target pair for unlabelled data is modelled as a latent paraphrase sequence. We present a novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR), which performs latent sequence inference given an observed text. To leverage information from text pairs, we additionally introduce a novel supervised model we call dual directional learning (DDL), which is designed to integrate with our proposed VSAR model. Combining VSAR with DDL (DDL+VSAR) enables us to conduct semi-supervised learning. Still, the combined model suffers from a cold-start problem. To further combat this issue, we propose an improved weight initialisation solution, leading to a novel two-stage training scheme we call knowledge-reinforced-learning (KRL). Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data. Furthermore, in scenarios where only a fraction of the labelled pairs are available, our combined model consistently outperforms the strong supervised model baseline (DDL) by a significant margin (p <.05; Wilcoxon test). Our code is publicly available at "https://github.com/jialin-yu/latent-sequence-paraphrase".
arXiv.org Artificial Intelligence
Sep-8-2023
- Country:
- Oceania > Australia
- Victoria > Melbourne (0.04)
- New South Wales > Sydney (0.04)
- North America
- United States
- Maryland > Baltimore (0.04)
- Nevada (0.04)
- Texas > Travis County
- Austin (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- New York
- Richmond County > New York City (0.04)
- Queens County > New York City (0.04)
- New York County > New York City (0.04)
- Kings County > New York City (0.04)
- Bronx County > New York City (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- New Mexico > Santa Fe County
- Santa Fe (0.04)
- California
- San Francisco County > San Francisco (0.14)
- Los Angeles County > Long Beach (0.14)
- San Diego County > San Diego (0.04)
- Puerto Rico > San Juan
- San Juan (0.04)
- Canada
- Quebec > Montreal (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Alberta > Census Division No. 15
- Improvement District No. 9 > Banff (0.04)
- United States
- Europe
- France (0.04)
- Germany > Berlin (0.04)
- United Kingdom > England
- Greater Manchester > Manchester (0.04)
- Durham > Durham (0.04)
- Sweden > Stockholm
- Stockholm (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Italy > Tuscany
- Florence (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Asia
- India (0.05)
- South Korea (0.04)
- Macao (0.04)
- Middle East
- Japan
- Kyūshū & Okinawa > Okinawa (0.04)
- Honshū > Kansai
- Osaka Prefecture > Osaka (0.04)
- China > Beijing
- Beijing (0.04)
- Africa > Ethiopia
- Addis Ababa > Addis Ababa (0.04)
- Oceania > Australia
- Genre:
- Overview (1.00)
- Research Report > New Finding (0.46)
- Industry:
- Education (0.46)
- Technology: