LLM, Reporting In! Medical Information Extraction Across Prompting, Fine-tuning and Post-correction

Belmadani, Ikram; Hashemi, Parisa Nazari; Sebbag, Thomas; Favre, Benoit; Fortier, Guillaume; Quiniou, Solen; Morin, Emmanuel; Dufour, Richard

Oct-7-2025, arXiv.org Artificial Intelligence

This work presents our participation in the EvalLLM 2025 challenge on biomedical Named Entity Recognition (NER) and health event extraction in French, in a few-shot setting. For NER, we propose three approaches combining large language models (LLMs), annotation guidelines, synthetic data, and post-processing: (1) in-context learning (ICL) with GPT-4.1, where the prompt includes 10 automatically selected examples and a summary of the annotation guidelines; (2) the universal NER system GLiNER, fine-tuned on a synthetic corpus, with its predictions then verified by an LLM in post-processing; and (3) the open LLM LLaMA-3.1-8B-Instruct, fine-tuned on the same synthetic corpus. Event extraction uses the same ICL strategy with GPT-4.1, reusing the guideline summary in the prompt. Results show that GPT-4.1 leads with a macro-F1 of 61.53% for NER and 15.02% for event extraction, highlighting the importance of well-crafted prompting to maximize performance in very low-resource scenarios.
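The ICL approach in (1) can be pictured as two steps: pick the 10 annotated examples most relevant to the input, then assemble a prompt from the guideline summary, those examples, and the text to annotate. The sketch below is only an illustration under stated assumptions. The abstract does not say how examples are selected, so token-overlap (Jaccard) similarity is used here as a stand-in, and the function names, prompt layout, and entity labels are all hypothetical rather than the authors' implementation.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercased word tokens; \\w is Unicode-aware, so accented French letters are kept."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_examples(query: str, pool: list[dict], k: int = 10) -> list[dict]:
    """Return the k pool examples whose text overlaps most with the query.

    Assumption: Jaccard similarity is an illustrative stand-in; the source
    does not specify the selection method.
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        pool,
        key=lambda ex: jaccard(query_tokens, tokenize(ex["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(guideline_summary: str, examples: list[dict], query: str) -> str:
    """Assemble a few-shot prompt: guideline summary, demonstrations, then the query.

    The "Text:" / "Entities:" layout is hypothetical.
    """
    parts = [guideline_summary.strip(), ""]
    for ex in examples:
        parts.append(f"Text: {ex['text']}")
        parts.append(f"Entities: {ex['entities']}")
        parts.append("")
    parts.append(f"Text: {query}")
    parts.append("Entities:")
    return "\n".join(parts)
```

The resulting string would then be sent to the LLM, and its completion parsed into entity spans. Keeping selection and prompt assembly separate makes it easy to swap in a different similarity measure (for example, embedding-based retrieval) without touching the prompt template.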

large language model, machine learning, natural language

Country:
- North America > United States (0.68)
- Europe > France (0.47)
- Asia > Middle East
  - UAE (0.28)

Genre:
- Research Report (0.70)

Industry:
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)