RoBiologyDataChoiceQA: A Romanian Dataset for improving Biology understanding of Large Language Models

Ghinea, Dragos-Dumitru, Corbeanu, Adela-Nicoleta, Dumitran, Adrian-Marius

Oct-1-2025–arXiv.org Artificial Intelligence

In recent years, large language models (LLMs) have demonstrated significant potential across various natural language processing (NLP) tasks. However, their performance in domain-specific applications and non-English languages remains less explored. This study introduces a novel Romanian-language dataset for multiple-choice biology questions, carefully curated to assess LLM comprehension and reasoning capabilities in scientific contexts. Containing approximately 14,000 questions, the dataset provides a comprehensive resource for evaluating and improving LLM performance in biology. We benchmark several popular LLMs, analyzing their accuracy, reasoning patterns, and ability to understand domain-specific terminology and linguistic nuances. Additionally, we perform comprehensive experiments to evaluate the impact of prompt engineering, fine-tuning, and other optimization techniques on model performance. Our findings highlight both the strengths and limitations of current LLMs in handling specialized knowledge tasks in low-resource languages, offering valuable insights for future research and development.
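The benchmarking step described in the abstract, scoring a model's answers to multiple-choice questions by accuracy, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the `Question` fields, the prompt wording, the letter-extraction rule, and the `predict` callable (standing in for any LLM call) are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Question:
    # Hypothetical record layout; the real dataset schema may differ.
    text: str
    options: dict[str, str]  # option label -> option text, e.g. {"A": "..."}
    answer: str              # gold option label


def format_prompt(q: Question) -> str:
    """Render a question and its options as a plain-text prompt."""
    lines = [q.text]
    lines += [f"{label}. {option}" for label, option in q.options.items()]
    lines.append("Answer with a single letter.")  # assumed instruction wording
    return "\n".join(lines)


def extract_choice(reply: str, labels: Iterable[str]) -> Optional[str]:
    """Return the first standalone option label found in the reply, if any."""
    pattern = r"\b(" + "|".join(re.escape(label) for label in labels) + r")\b"
    match = re.search(pattern, reply.upper())
    return match.group(1) if match else None


def accuracy(questions: list[Question], predict: Callable[[str], str]) -> float:
    """Fraction of questions whose extracted label matches the gold label."""
    if not questions:
        return 0.0
    correct = sum(
        extract_choice(predict(format_prompt(q)), q.options) == q.answer
        for q in questions
    )
    return correct / len(questions)
```

Passing a different `predict` function per model, or a different `format_prompt` per prompting strategy, gives comparable accuracy figures over the same question set. Replies with no recognizable label count as wrong here, which is one possible convention among several.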

large language model, machine learning, natural language

Country:
- North America (1.00)
- Europe > Romania (0.68)

Genre:
- Research Report > New Finding (0.66)

Industry:
- Health & Medicine
  - Pharmaceuticals & Biotechnology (0.93)
  - Therapeutic Area > Cardiology/Vascular Diseases (0.46)
- Education > Educational Setting
  - K-12 Education (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.32)
