Goto

Collaborating Authors

 keyphrase generation



A new dataset for multilingual keyphrase generation

Neural Information Processing Systems

Keyphrases are an important tool for efficiently dealing with the ever-increasing amount of information present on the internet. While there are many recent papers on English keyphrase generation, keyphrase generation for other languages remains vastly understudied, mostly due to the absence of datasets. To address this, we present a novel dataset called Papyrus, composed of 16427 pairs of abstracts and keyphrases. We release four versions of this dataset, corresponding to different subtasks. Papyrus-e considers only English keyphrases, Papyrus-f considers French keyphrases, Papyrus-m considers keyphrase generation in any language (mostly French and English), and Papyrus-a considers keyphrase generation in several languages. We train a state-of-the-art model on all four tasks and show that they lead to better results for non-English languages, with an average improvement of 14.2\% on keyphrase extraction and 2.0\% on generation. We also show an improvement of 0.4\% on extraction and 0.7\% on generation over English state-of-the-art results by concatenating Papyrus-e with the Kp20K training set.


Let's Use ChatGPT To Write Our Paper! Benchmarking LLMs To Write the Introduction of a Research Paper

Garg, Krishna, Shaik, Firoz, Bandyopadhyay, Sambaran, Caragea, Cornelia

arXiv.org Artificial Intelligence

As researchers increasingly adopt LLMs as writing assistants, generating high-quality research paper introductions remains both challenging and essential. We introduce Scientific Introduction Generation (SciIG), a task that evaluates LLMs' ability to produce coherent introductions from titles, abstracts, and related works. Curating new datasets from NAACL 2025 and ICLR 2025 papers, we assess five state-of-the-art models, including both open-source (DeepSeek-v3, Gemma-3-12B, LLaMA 4-Maverick, MistralAI Small 3.1) and closed-source GPT-4o systems, across multiple dimensions: lexical overlap, semantic similarity, content coverage, faithfulness, consistency, citation correctness, and narrative quality. Our comprehensive framework combines automated metrics with LLM-as-a-judge evaluations. Results demonstrate LLaMA-4 Maverick's superior performance on most metrics, particularly in semantic similarity and faithfulness. Moreover, three-shot prompting consistently outperforms fewer-shot approaches. These findings provide practical insights into developing effective research writing assistants and set realistic expectations for LLM-assisted academic writing. To foster reproducibility and future research, we will publicly release all code and datasets.



Pipelined Decoder for Efficient Context-Aware Text Generation

Huang, Zixian, Niu, Chenxu, Gu, Yu, Xiao, Gengyang, Huang, Xinwei, Cheng, Gong

arXiv.org Artificial Intelligence

As the basis of generative AI, an autoregressive model requires the generation of a new token depending on all the previously generated tokens, which brings high quality but also restricts the model to generate tokens one by one, forming a bottleneck limiting the generation speed. In this paper, we propose a new decoder architecture that efficiently generates text in parallel for context-aware generation tasks. Our proposed pipelined decoder initiates the generation of multiple subsequences simultaneously, and, at each time-step, it generates a new token for each subsequence to realize parallelism. Experiments on multiple text generation tasks, including question answering, text summarization, and keyphrase generation, show that our pipelined decoder significantly improves the generation speed without a significant loss of generation quality or additional memory consumption.


An Analysis of Datasets, Metrics and Models in Keyphrase Generation

Boudin, Florian, Aizawa, Akiko

arXiv.org Artificial Intelligence

Keyphrase generation refers to the task of producing a set of words or phrases that summarises the content of a document. Continuous efforts have been dedicated to this task over the past few years, spreading across multiple lines of research, such as model architectures, data resources, and use-case scenarios. Yet, the current state of keyphrase generation remains unknown as there has been no attempt to review and analyse previous work. In this paper, we bridge this gap by presenting an analysis of over 50 research papers on keyphrase generation, offering a comprehensive overview of recent progress, limitations, and open challenges. Our findings highlight several critical issues in current evaluation practices, such as the concerning similarity among commonly-used benchmark datasets and inconsistencies in metric calculations leading to overestimated performances. Additionally, we address the limited availability of pre-trained models by releasing a strong PLM-based model for keyphrase generation as an effort to facilitate future research.


ERU-KG: Efficient Reference-aligned Unsupervised Keyphrase Generation

Do, Lam Thanh, Bodke, Aaditya, Akash, Pritom Saha, Chang, Kevin Chen-Chuan

arXiv.org Artificial Intelligence

Unsupervised keyphrase prediction has gained growing interest in recent years. However, existing methods typically rely on heuristically defined importance scores, which may lead to inaccurate informativeness estimation. In addition, they lack consideration for time efficiency. To solve these problems, we propose ERU-KG, an unsupervised keyphrase generation (UKG) model that consists of an informativeness and a phraseness module. The former estimates the relevance of keyphrase candidates, while the latter generate those candidates. The informativeness module innovates by learning to model informativeness through references (e.g., queries, citation contexts, and titles) and at the term-level, thereby 1) capturing how the key concepts of documents are perceived in different contexts and 2) estimating informativeness of phrases more efficiently by aggregating term informativeness, removing the need for explicit modeling of the candidates. ERU-KG demonstrates its effectiveness on keyphrase generation benchmarks by outperforming unsupervised baselines and achieving on average 89\% of the performance of a supervised model for top 10 predictions. Additionally, to highlight its practical utility, we evaluate the model on text retrieval tasks and show that keyphrases generated by ERU-KG are effective when employed as query and document expansions. Furthermore, inference speed tests reveal that ERU-KG is the fastest among baselines of similar model sizes. Finally, our proposed model can switch between keyphrase generation and extraction by adjusting hyperparameters, catering to diverse application requirements.


Zero-Shot Keyphrase Generation: Investigating Specialized Instructions and Multi-Sample Aggregation on Large Language Models

Mohan, Jayanth, Chowdhury, Jishnu Ray, Malik, Tomas, Caragea, Cornelia

arXiv.org Artificial Intelligence

Keyphrases are the essential topical phrases that summarize a document. Keyphrase generation is a long-standing NLP task for automatically generating keyphrases for a given document. While the task has been comprehensively explored in the past via various models, only a few works perform some preliminary analysis of Large Language Models (LLMs) for the task. Given the impact of LLMs in the field of NLP, it is important to conduct a more thorough examination of their potential for keyphrase generation. In this paper, we attempt to meet this demand with our research agenda. Specifically, we focus on the zero-shot capabilities of open-source instruction-tuned LLMs (Phi-3, Llama-3) and the closed-source GPT-4o for this task. We systematically investigate the effect of providing task-relevant specialized instructions in the prompt. Moreover, we design task-specific counterparts to self-consistency-style strategies for LLMs and show significant benefits from our proposals over the baselines.


KAPPA: A Generic Patent Analysis Framework with Keyphrase-Based Portraits

Xia, Xin, Wang, Yujin, Zhou, Jun, Zhong, Guisheng, Cai, Linning, Zhang, Chen

arXiv.org Artificial Intelligence

Patent analysis highly relies on concise and interpretable document representations, referred to as patent portraits. Keyphrases, both present and absent, are ideal candidates for patent portraits due to their brevity, representativeness, and clarity. In this paper, we introduce KAPPA, an integrated framework designed to construct keyphrase-based patent portraits and enhance patent analysis. KAPPA operates in two phases: patent portrait construction and portrait-based analysis. To ensure effective portrait construction, we propose a semantic-calibrated keyphrase generation paradigm that integrates pre-trained language models with a prompt-based hierarchical decoding strategy to leverage the multi-level structural characteristics of patents. For portrait-based analysis, we develop a comprehensive framework that employs keyphrase-based patent portraits to enable efficient and accurate patent analysis. Extensive experiments on benchmark datasets of keyphrase generation, the proposed model achieves significant improvements compared to state-of-the-art baselines. Further experiments conducted on real-world patent applications demonstrate that our keyphrase-based portraits effectively capture domain-specific knowledge and enrich semantic representation for patent analysis tasks.


Self-Compositional Data Augmentation for Scientific Keyphrase Generation

Houbre, Mael, Boudin, Florian, Daille, Beatrice, Aizawa, Akiko

arXiv.org Artificial Intelligence

State-of-the-art models for keyphrase generation require large amounts of training data to achieve good performance. However, obtaining keyphrase-labeled documents can be challenging and costly. To address this issue, we present a self-compositional data augmentation method. More specifically, we measure the relatedness of training documents based on their shared keyphrases, and combine similar documents to generate synthetic samples. The advantage of our method lies in its ability to create additional training samples that keep domain coherence, without relying on external data or resources. Our results on multiple datasets spanning three different domains, demonstrate that our method consistently improves keyphrase generation. A qualitative analysis of the generated keyphrases for the Computer Science domain confirms this improvement towards their representativity property.