AITopics | Daille, Béatrice

Collaborating Authors

Daille, Béatrice

ACL-rlg: A Dataset for Reading List Generation

Aubert-Béduchaud, Julien, Boudin, Florian, Daille, Béatrice, Dufour, Richard

arXiv.org Artificial Intelligence, Dec-30-2024

Familiarizing oneself with a new scientific field and its existing literature can be daunting due to the large number of available articles. Curated lists of academic references, or reading lists, compiled by experts, offer a structured way to gain a comprehensive overview of a domain or a specific scientific challenge. In this work, we introduce ACL-rlg, the largest open expert-annotated reading list dataset. We also provide multiple baselines for evaluating reading list generation and formally define it as a retrieval task. Our qualitative study shows that traditional scholarly search engines and indexing methods perform poorly on this task, and that GPT-4o, despite showing better results, exhibits signs of potential data contamination.
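Since the abstract frames reading list generation as a retrieval task, one way to picture an evaluation is to compare a ranked list of retrieved papers against an expert reading list. The sketch below is illustrative only: the choice of recall@k, the function name `recall_at_k`, and the paper identifiers are assumptions, not the metrics or data reported in the paper.

```python
from typing import Iterable, Sequence


def recall_at_k(ranked_ids: Sequence[str], reading_list: Iterable[str], k: int) -> float:
    """Fraction of the expert reading list found in the top-k retrieved papers.

    ranked_ids   : paper identifiers returned by a retrieval system, best first
    reading_list : identifiers of the papers in the expert-curated list
    k            : cutoff rank (hypothetical parameter, chosen by the evaluator)
    """
    relevant = set(reading_list)
    if not relevant:
        raise ValueError("the reading list must contain at least one paper")
    if k <= 0:
        raise ValueError("k must be positive")
    hits = relevant & set(ranked_ids[:k])
    return len(hits) / len(relevant)


# Made-up identifiers, for illustration only.
ranked = ["p3", "p7", "p1", "p9"]
gold = {"p1", "p2", "p3"}
score = recall_at_k(ranked, gold, k=3)  # top 3 contains p3 and p1, so 2 of 3 gold papers
```

Recall is a natural starting point here because an expert reading list is a fixed set of references to recover; a real evaluation would likely report several cutoffs and additional ranking metrics.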

large language model, machine learning, natural language, (18 more...)

arXiv:2502.15692

Country:

Europe (1.00)
North America > United States > Texas (0.14)
Asia > Japan > Honshū (0.14)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains

Labrak, Yanis, Bazoge, Adrien, Dufour, Richard, Rouvier, Mickael, Morin, Emmanuel, Daille, Béatrice, Gourraud, Pierre-Antoine

arXiv.org Artificial Intelligence, May-4-2023

In recent years, pre-trained language models (PLMs) have achieved the best performance on a wide range of natural language processing (NLP) tasks. While the first models were trained on general-domain data, specialized ones have emerged to handle specific domains more effectively. In this paper, we propose an original study of PLMs in the medical domain for the French language. We compare, for the first time, the performance of PLMs trained on both public data from the web and private data from healthcare establishments. We also evaluate different learning strategies on a set of biomedical tasks. In particular, we show that we can take advantage of existing biomedical PLMs in a foreign language by further pre-training them on our targeted data. Finally, we release the first specialized PLMs for the biomedical field in French, called DrBERT, as well as the largest corpus of medical data under a free license on which these models are trained.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv:2304.00958

Country: Europe > France (0.29)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (0.88)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.48)
Health & Medicine > Health Care Technology > Medical Record (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)

FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain

Labrak, Yanis, Bazoge, Adrien, Dufour, Richard, Rouvier, Mickael, Morin, Emmanuel, Daille, Béatrice, Gourraud, Pierre-Antoine

arXiv.org Artificial Intelligence, Apr-9-2023

This paper introduces FrenchMedMCQA, the first publicly available Multiple-Choice Question Answering (MCQA) dataset in French for the medical domain. It is composed of 3,105 questions taken from real exams of the French medical specialization diploma in pharmacy, mixing single and multiple answers. Each instance of the dataset contains an identifier, a question, five possible answers and their manual correction(s). We also propose the first baseline models to automatically process this MCQA task in order to report on current performance and to highlight the difficulty of the task. A detailed analysis of the results shows that it is necessary to have representations adapted to the medical domain or to the MCQA task: in our case, English specialized models yielded better results than generic French ones, even though FrenchMedMCQA is in French. The corpus, models and tools are available online.
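The instance layout described in the abstract (an identifier, a question, five possible answers, and one or more corrections) could be modeled roughly as follows. This is a minimal sketch, not the dataset's actual schema: the class name `MCQAInstance`, the field names, the option keys "a" to "e", and the exact-match scoring rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MCQAInstance:
    identifier: str
    question: str
    answers: dict[str, str]   # five options; the keys "a".."e" are an assumed convention
    correct: frozenset[str]   # keys of the correct option(s), one or several

    def __post_init__(self) -> None:
        if len(self.answers) != 5:
            raise ValueError("expected exactly five possible answers")
        if not self.correct or not self.correct <= set(self.answers):
            raise ValueError("correct answers must be a non-empty subset of the options")


def exact_match(instance: MCQAInstance, predicted: set[str]) -> bool:
    """True only if the predicted option set equals the gold set exactly."""
    return set(predicted) == set(instance.correct)


# Invented placeholder content, not taken from the dataset.
demo = MCQAInstance(
    identifier="demo-1",
    question="Placeholder question with two correct options?",
    answers={"a": "option A", "b": "option B", "c": "option C",
             "d": "option D", "e": "option E"},
    correct=frozenset({"a", "c"}),
)
```

Because questions mix single and multiple answers, scoring on the full set of selected options (rather than one option at a time) is one strict way to treat both cases uniformly; partial-credit schemes are equally plausible.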

artificial intelligence, natural language, question answering, (16 more...)

arXiv:2304.04280

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.63)
