Yan, Minyxuan
Few-Shot Table-to-Text Generation with Prompt Planning and Knowledge Memorization
Guo, Zhixin, Yan, Minyxuan, Qi, Jiexing, Zhou, Jianping, He, Ziwei, Lin, Zhouhan, Zheng, Guanjie, Wang, Xinbing
Pre-trained language models (PLM) have achieved remarkable advancement in table-to-text generation tasks. However, the lack of labeled domain-specific knowledge and the topology gap between tabular data and text make it difficult for PLMs to yield faithful text. Low-resource generation likewise faces unique challenges in this domain. Inspired by how humans descript tabular data with prior knowledge, we suggest a new framework: PromptMize, which targets table-to-text generation under few-shot settings. The design of our framework consists of two aspects: a prompt planner and a knowledge adapter. The prompt planner aims to generate a prompt signal that provides instance guidance for PLMs to bridge the topology gap between tabular data and text. Moreover, the knowledge adapter memorizes domain-specific knowledge from the unlabelled corpus to supply essential information during generation. Extensive experiments and analyses are investigated on three open domain few-shot NLG datasets: human, song, and book. Compared with previous state-of-the-art approaches, our model achieves remarkable performance in generating quality as judged by human and automatic evaluations.
Adapting Prompt for Few-shot Table-to-Text Generation
Guo, Zhixin, Yan, Minyxuan, Qi, Jiexing, Zhou, Jianping, He, Ziwei, Lin, Zhouhan, Zheng, Guanjie, Wang, Xinbing
Pretrained language models (PLMs) have made remarkable progress in table-to-text generation tasks. However, the lack of domain-specific knowledge makes it challenging to bridge the topological gap between tabular data and text, especially in real-world applications with limited resources. To mitigate the limitation of insufficient labeled data, we propose a novel framework: Adapt-Prompt-to-Generate (AdaPTGen). The core insight of AdaPTGen is to adapt prompt templates of domain-specific knowledge into the model, which brings at least three benefits: (1) it injects representation of normal table-related descriptions to bridge the topological gap between tabular data and texts; (2) it enables us to use large amounts of unlabeled domain-specific knowledge fully, which can alleviate the PLMs' inherent shortcomings of lacking domain knowledge; (3) it allows us to design various tasks to explore the domain-specific knowledge. Extensive experiments and analyses are conducted on three open-domain few-shot natural language generation (NLG) data sets: Humans, Songs, and Books. Compared to previous state-of-the-art approaches, our model achieves superior performance in terms of both fluency and accuracy.