CiteBART: Learning to Generate Citations for Local Citation Recommendation
Çelik, Ege Yiğit, Tekir, Selma
arXiv.org Artificial Intelligence
Citations are essential building blocks of scientific writing, and the scientific community is looking for support in generating them. Citation generation involves two complementary subtasks: determining whether a context is citation-worthy and, if it is, proposing the best candidate papers for the citation placeholder. The latter subtask is called local citation recommendation (LCR). This paper proposes CiteBART, a custom BART pre-training scheme based on citation token masking that generates citations to perform LCR. In the base scheme, we mask the citation token in the local citation context and predict it. In the global scheme, we concatenate the citing paper's title and abstract to the local citation context and learn to reconstruct the citation token. CiteBART outperforms state-of-the-art approaches on the citation recommendation benchmarks except for FullTextPeerRead, the smallest dataset. The improvement is significant on the larger benchmarks, e.g., Refseer and ArXiv. We present a qualitative analysis and an ablation study to provide insights into the workings of CiteBART. Our analyses confirm that its generative nature brings about a zero-shot capability.
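The two masking schemes described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `<mask>` string, the "Author et al., Year" citation format, the concatenation order, and the choice of the citation token alone as the target are all assumptions made for illustration.

```python
MASK = "<mask>"  # assumed mask string; the paper's actual token may differ


def build_base_input(context: str, citation: str) -> tuple[str, str]:
    """Base scheme (sketch): mask the citation token in the local context.

    Returns (masked_context, target), where the target is the citation token
    the model would be trained to generate.
    """
    if citation not in context:
        raise ValueError("citation token not found in context")
    # Replace only the first occurrence, i.e. the citation placeholder.
    return context.replace(citation, MASK, 1), citation


def build_global_input(
    context: str, citation: str, title: str, abstract: str
) -> tuple[str, str]:
    """Global scheme (sketch): add the citing paper's title and abstract.

    The order and the single-space separator are assumptions.
    """
    masked, target = build_base_input(context, citation)
    return f"{masked} {title} {abstract}", target


if __name__ == "__main__":
    ctx = "Self-attention Vaswani et al., 2017 models long-range dependencies."
    print(build_base_input(ctx, "Vaswani et al., 2017"))
    print(build_global_input(ctx, "Vaswani et al., 2017",
                             "A Citing Paper", "Its abstract text."))
```

Under these assumptions, each (input, target) pair could be fed to a sequence-to-sequence model so that it learns to fill the mask with a citation.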
Dec-23-2024
- Country:
  - Africa > Middle East
    - Morocco (0.04)
  - Asia
    - China > Hong Kong (0.04)
    - Middle East
      - Jordan (0.04)
      - Republic of Türkiye > İzmir Province
        - İzmir (0.04)
  - Europe > Austria (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - United States
      - Colorado > Denver County
        - Denver (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
- Genre:
  - Research Report > New Finding (1.00)
- Technology: