Towards an automated workflow in materials science for combining multi-modal simulative and experimental information using data mining and large language models

Katzer, Balduin, Klinder, Steffen, Schulz, Katrin

arXiv.org Artificial Intelligence 

To retrieve and compare scientific data from simulations and experiments in materials science, data must be easily accessible and machine-readable so that various materials science phenomena can be qualified and quantified. However, the majority of information is encoded within scientific documents, which limits the ability to find suitable literature as well as material properties. This manuscript showcases an automated workflow that unravels the encoded information from scientific literature into a machine-readable data structure of texts, figures, tables, equations, and metadata, using natural language processing as well as language and vision transformer models to generate a machine-readable database. The machine-readable database can be enriched with local data. The study shows that such an automated workflow accelerates information retrieval, proximate context detection, and material property extraction from multi-modal input data, shown by way of example for the research field of microstructural analyses of face-centered cubic single crystals. Ultimately, a Retrieval-Augmented Generation (RAG) based Large Language Model (LLM) enables a fast and efficient question-answering chatbot.

Introduction

Understanding physical processes in materials and material microstructures is of fundamental importance in facilitating their use in engineering applications. However, analyzing the growing body of scientific knowledge and extracting the information relevant to a given research project is a challenging task. In particular, combining information from experiments, simulations, and theory is of great significance, as each discipline considers different aspects that together ultimately form a holistic picture [1, 2, 3, 4]. Machine learning (ML) and artificial intelligence (AI) have recently been used as advanced computational tools to accelerate physical understanding in materials science research [3, 5, 6, 4, 7].
Recent progress in these computational methods has enabled AI-assisted models that can extrapolate beyond their underlying data and generate novel approaches in materials science, referred to as generative AI (genAI) [8, 9]. Applying genAI has led, for example, to the novel design of crystalline materials [10], of molecular properties [11], and of architected materials [12].
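
The retrieval step behind a RAG-based question-answering workflow over a machine-readable database of texts, figures, tables, and equations can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the manuscript: the `Record` schema, the function names, the bag-of-words cosine similarity (a stand-in for a learned embedding model), and the made-up placeholder records. The assembled prompt would be passed to an LLM, which is not part of the sketch.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Record:
    """One machine-readable entry extracted from a document (hypothetical schema)."""
    doc_id: str
    modality: str  # e.g. "text", "figure", "table", "equation"
    content: str   # plain-text body, caption, or equation source
    metadata: dict = field(default_factory=dict)


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words term counts (stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, records: list[Record], k: int = 2) -> list[Record]:
    """Return the k records whose content is most similar to the query."""
    q = tokenize(query)
    ranked = sorted(records, key=lambda r: cosine(q, tokenize(r.content)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[Record]) -> str:
    """Assemble retrieved context and the question into one prompt string."""
    lines = [f"[{r.doc_id}/{r.modality}] {r.content}" for r in context]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {query}"


# Made-up placeholder records, for illustration only.
records = [
    Record("doc1", "text", "Dislocation density evolution in face-centered cubic single crystals."),
    Record("doc1", "table", "Placeholder table caption listing elastic constants."),
    Record("doc2", "figure", "Placeholder figure caption showing a stress-strain curve."),
]

question = "How does dislocation density evolve in face-centered cubic crystals?"
top = retrieve(question, records, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Keeping the modality and document identifier on every record lets a retrieved figure caption or table be traced back to its source document, which is one plausible way to combine multi-modal entries in a single index.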
