Generative AI for Research Data Processing: Lessons Learnt From Three Use Cases
Mitra, Modhurita; de Vos, Martine G.; Cortinovis, Nicola; Ometto, Dawa
arXiv.org Artificial Intelligence
There has been enormous interest in generative AI since ChatGPT was launched in 2022. However, there are concerns about the accuracy and consistency of the outputs of generative AI. We have carried out an exploratory study on the application of this new technology in research data processing. We identified tasks for which rule-based or traditional machine learning approaches were difficult to apply, and then performed these tasks using generative AI. We demonstrate the feasibility of using the generative AI model Claude 3 Opus in three research projects involving complex data processing tasks: 1) Information extraction: We extract plant species names from historical seedlists (catalogues of seeds) published by botanical gardens. We share the lessons we learnt from these use cases: How to determine if generative AI is an appropriate tool for a given data processing task, and if so, how to maximise the accuracy and consistency of the results obtained. In this paper, we share our insights on the application of generative AI in research software engineering projects. Generative AI can potentially be used to perform a wide variety of research data processing tasks, such as interpreting documents, extracting information from them, and classifying text into categories. Since the tasks are specified through prompts in natural language, the barrier to entry is low. Therefore, this tool can be easily used by domain experts in a wide range of fields, with varying levels of programming skills and depth of knowledge of technical topics such as machine learning.
Apr-23-2025
- Country:
- Europe
- Austria
- Upper Austria > Linz (0.04)
- Vienna (0.14)
- Bulgaria > Sofia City Province
- Sofia (0.04)
- France (0.04)
- Netherlands (0.05)
- North America > United States
- District of Columbia > Washington (0.04)
- Oceania > Australia
- New South Wales > Sydney (0.04)
- Genre:
- Research Report
- Experimental Study (0.69)
- New Finding (0.46)
- Industry:
- Banking & Finance (0.68)
- Government > Regional Government (0.46)
- Health & Medicine (1.00)
- Information Technology (0.97)
- Media (0.67)
- Technology: