DS@GT at CheckThat! 2025: Exploring Retrieval and Reranking Pipelines for Scientific Claim Source Retrieval on Social Media Discourse
Schofield, Jeanette; Tian, Shuyu; Truong, Hoang Thanh Thanh; Heil, Maximilian
arXiv.org Artificial Intelligence
Social media users often make scientific claims without citing their sources, which creates a need to verify those claims. This paper details work done by the DS@GT team for the CLEF 2025 CheckThat! Lab Task 4b, Scientific Claim Source Retrieval, which seeks to find relevant scientific papers based on implicit references in tweets. Our team explored 6 data augmentation techniques and 7 retrieval and reranking pipelines, and fine-tuned a bi-encoder. Our team achieved an MRR@5 of 0.58, an improvement of 0.15 over the BM25 baseline of 0.43, and ranked 16th out of 30 teams in Task 4b. Our code is available on GitHub at https://github.com/dsgt-arc/checkthat-2025-swd/tree/main/subtask-4b.
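For readers unfamiliar with the metric reported above, the following is a minimal sketch of how MRR@5 could be computed. It assumes each tweet has exactly one gold paper, and it is not the official task scorer; the function name `mrr_at_k` and the toy identifiers are hypothetical.

```python
from typing import Sequence


def mrr_at_k(
    ranked_ids: Sequence[Sequence[str]],
    gold_ids: Sequence[str],
    k: int = 5,
) -> float:
    """Mean reciprocal rank, looking only at the top-k candidates per query.

    Assumption: one gold document per query. A query whose gold document
    is missing from its top-k list contributes 0 to the mean.
    """
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if not gold_ids:
        return 0.0

    total = 0.0
    for candidates, gold in zip(ranked_ids, gold_ids):
        # Ranks are 1-based: a hit at position 1 scores 1.0, position 2 scores 0.5.
        for rank, doc_id in enumerate(candidates[:k], start=1):
            if doc_id == gold:
                total += 1.0 / rank
                break
    return total / len(gold_ids)


# Toy data (hypothetical): three tweets, each with a ranked list of paper IDs.
ranked = [
    ["a", "b", "c"],                 # gold "a" at rank 1
    ["x", "y", "z"],                 # gold "y" at rank 2
    ["p", "q", "r", "s", "t", "u"],  # gold "u" at rank 6, outside the top 5
]
gold = ["a", "y", "u"]
score = mrr_at_k(ranked, gold, k=5)
```

Under this definition, a score of 0.58 is consistent with the correct paper appearing, on average, between the first and second position of the returned list, although many different rank distributions can produce the same mean.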
Jul-10-2025
- Country:
- Europe
- France > Auvergne-Rhône-Alpes
- Italy > Emilia-Romagna
- Metropolitan City of Bologna > Bologna (0.04)
- Spain > Galicia
- Madrid (0.04)
- North America > United States
- Georgia > Fulton County
- Atlanta (0.14)
- New York > New York County
- New York City (0.04)
- Genre:
- Research Report (0.65)
- Industry:
- Health & Medicine > Therapeutic Area (0.49)
- Technology: