Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models
Wu, Di, Ahmad, Wasi Uddin, Chang, Kai-Wei
–arXiv.org Artificial Intelligence
Keyphrase Generation (KPG) is a longstanding task in NLP with widespread applications. The advent of sequence-to-sequence (seq2seq) pre-trained language models (PLMs) has ushered in a transformative era for KPG, yielding promising performance improvements. However, many design decisions remain unexplored and are often made arbitrarily. This paper undertakes a systematic analysis of the influence of model selection and decoding strategies on PLM-based KPG. We begin by elucidating why seq2seq PLMs are apt for KPG, anchored by an attention-driven hypothesis. We then establish that conventional wisdom for selecting seq2seq PLMs lacks depth: (1) merely increasing model size or performing task-specific adaptation is not parameter-efficient; (2) although combining in-domain pre-training with task adaptation benefits KPG, it does partially hinder generalization. Regarding decoding, we demonstrate that while greedy search achieves strong F1 scores, it lags in recall compared with sampling-based methods. Based on these insights, we propose DeSel, a likelihood-based decode-select algorithm for seq2seq PLMs. DeSel improves greedy search by an average of 4.7% semantic F1 across five datasets. Our collective findings pave the way for deeper future investigations into PLM-based KPG.
arXiv.org Artificial Intelligence
Oct-22-2023
- Country:
- Oceania > Australia
- North America
- Dominican Republic (0.04)
- United States
- Washington > King County
- Seattle (0.04)
- New York > New York County
- New York City (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- California
- Los Angeles County > Los Angeles (0.14)
- Santa Clara County > Palo Alto (0.04)
- Washington > King County
- Canada > British Columbia
- Europe
- United Kingdom > England
- Greater Manchester > Manchester (0.04)
- Sweden > Uppsala County
- Uppsala (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Italy > Tuscany
- Florence (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- United Kingdom > England
- Asia
- Thailand > Chiang Mai
- Chiang Mai (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.04)
- Japan > Honshū
- Chūbu > Aichi Prefecture > Nagoya (0.04)
- China
- Heilongjiang Province > Daqing (0.04)
- Hong Kong (0.04)
- Thailand > Chiang Mai
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Information Technology (0.46)
- Technology: