SideControl: Controlled Open-domain Dialogue Generation via Additive Side Networks
arXiv.org Artificial Intelligence
Transformer-based pre-trained language models boost the performance of open-domain dialogue systems. Prior works leverage these models to generate texts with desired attributes in two general ways: (1) gradient-based methods, which update all latent representations of the pre-trained model with gradients from attribute models; and (2) weighted-decoding methods, which re-rank beam candidates from the pre-trained model with attribute functions. However, gradient-based methods incur high computation cost and can easily overfit small training sets, while weighted-decoding methods are inherently constrained by the low-variance, high-bias pre-trained model. In this work, we propose a novel approach to control the generation of Transformer-based pre-trained language models: the SideControl framework, which leverages a novel control attributes loss to incorporate useful control signals and is shown to perform well with very limited training samples. We evaluate our proposed method on two benchmark open-domain dialogue datasets, and results show that the SideControl framework has better controllability, higher generation quality, and better sample efficiency than existing gradient-based and weighted-decoding baselines.
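As a rough illustration of the weighted-decoding baseline mentioned in the abstract (re-ranking beam candidates with an attribute function), the following is a minimal sketch and not the paper's implementation. The names `rerank`, `politeness`, and `weight`, the toy beam candidates, and the additive scoring rule `log_prob + weight * attribute_score` are all assumptions chosen for illustration.

```python
from typing import Callable, List, Tuple


def rerank(
    candidates: List[Tuple[str, float]],
    attribute_score: Callable[[str], float],
    weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Re-rank (text, log_prob) beam candidates.

    Assumed scoring rule: log_prob + weight * attribute_score(text).
    Returns the candidates sorted from best to worst combined score.
    """
    rescored = [
        (text, log_prob + weight * attribute_score(text))
        for text, log_prob in candidates
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def politeness(text: str) -> float:
    """Hypothetical toy attribute function: counts polite words."""
    polite_words = {"please", "thanks", "thank"}
    tokens = (tok.strip(".,!?").lower() for tok in text.split())
    return float(sum(tok in polite_words for tok in tokens))


# Hypothetical beam output from a pre-trained model: (text, log-probability).
beams = [
    ("Give me that.", -1.0),
    ("Could you pass that, please?", -1.5),
    ("Thanks, please pass that.", -2.2),
]

controlled = rerank(beams, politeness, weight=1.0)
uncontrolled = rerank(beams, politeness, weight=0.0)
```

With `weight=0.0` the ranking follows the model's own log-probabilities; raising `weight` lets the attribute function override them. Because the candidates still come from the pre-trained model's beam, the control signal can only choose among what that model already proposes, which is the constraint the abstract attributes to weighted decoding.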
Sep-4-2021
- Country:
  - Oceania > Australia
  - North America
    - United States
      - Michigan (0.04)
      - Virginia > Albemarle County
        - Charlottesville (0.14)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - California > Santa Clara County
        - Palo Alto (0.04)
    - Canada > British Columbia
  - Europe
  - Asia
    - Taiwan > Taiwan Province
      - Taipei (0.04)
    - Middle East > Qatar
    - Japan > Honshū
      - Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Genre:
  - Research Report > Promising Solution (0.34)
- Industry:
  - Leisure & Entertainment (0.46)
- Technology: