Goto

Collaborating Authors

 Materials


LLMs4Life: Large Language Models for Ontology Learning in Life Sciences

arXiv.org Artificial Intelligence

Ontology learning in complex domains, such as life sciences, poses significant challenges for current Large Language Models (LLMs). Existing LLMs struggle to generate ontologies with multiple hierarchical levels, rich interconnections, and comprehensive class coverage due to constraints on the number of tokens they can generate and inadequate domain adaptation. To address these issues, we extend the NeOn-GPT pipeline for ontology learning using LLMs with advanced prompt engineering techniques and ontology reuse to enhance the generated ontologies' domain-specific reasoning and structural depth. Our work evaluates the capabilities of LLMs in ontology learning in the context of highly specialized and complex domains such as life science domains. To assess the logical consistency, completeness, and scalability of the generated ontologies, we use the AquaDiva ontology developed and used in the collaborative research center AquaDiva as a case study. Our evaluation shows the viability of LLMs for ontology learning in specialized domains, providing solutions to longstanding limitations in model performance and scalability.


Jailbreak Defense in a Narrow Domain: Limitations of Existing Methods and a New Transcript-Classifier Approach

arXiv.org Artificial Intelligence

Defending large language models against jailbreaks so that they never engage in a broadly-defined set of forbidden behaviors is an open problem. In this paper, we investigate the difficulty of jailbreak-defense when we only want to forbid a narrowly-defined set of behaviors. As a case study, we focus on preventing an LLM from helping a user make a bomb. We find that popular defenses such as safety training, adversarial training, and input/output classifiers are unable to fully solve this problem. In pursuit of a better solution, we develop a transcript-classifier defense which outperforms the baseline defenses we test. However, our classifier defense still fails in some circumstances, which highlights the difficulty of jailbreak-defense even in a narrow domain.


MiningGPT -- A Domain-Specific Large Language Model for the Mining Industry

arXiv.org Artificial Intelligence

Recent advancements of generative LLMs (Large Language Models) have exhibited human-like language capabilities but have shown a lack of domain-specific understanding. Therefore, the research community has started the development of domain-specific LLMs for many domains. In this work we focus on discussing how to build mining domain-specific LLMs, as the global mining industry contributes significantly to the worldwide economy. We report on MiningGPT, a mining domain-specific instruction-following 7B parameter LLM model which showed a 14\% higher mining domain knowledge test score as compared to its parent model Mistral 7B instruct.


T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs

arXiv.org Artificial Intelligence

The success of Multimodal Large Language Models (MLLMs) in the image domain has garnered wide attention from the research community. Drawing on previous successful experiences, researchers have recently explored extending the success to the video understanding realms. Apart from training from scratch, an efficient way is to utilize the pre-trained image-LLMs, leading to two mainstream approaches, i.e. zero-shot inference and further fine-tuning with video data. In this work, our study of these approaches harvests an effective data augmentation method. We first make a deeper inspection of the zero-shot inference way and identify two limitations, i.e. limited generalization and lack of temporal understanding capabilities. Thus, we further investigate the fine-tuning approach and find a low learning efficiency when simply using all the video data samples, which can be attributed to a lack of instruction diversity. Aiming at this issue, we develop a method called T2Vid to synthesize video-like samples to enrich the instruction diversity in the training corpus. Integrating these data enables a simple and efficient training scheme, which achieves performance comparable to or even superior to using full video datasets by training with just 15% the sample size. Meanwhile, we find that the proposed scheme can boost the performance of long video understanding without training with long video samples. We hope our study will spark more thinking about using MLLMs for video understanding and curation of high-quality data. The code is released at https://github.com/xjtupanda/T2Vid.


PAL -- Parallel active learning for machine-learned potentials

arXiv.org Artificial Intelligence

Constructing datasets representative of the target domain is essential for training effective machine learning models. Active learning (AL) is a promising method that iteratively extends training data to enhance model performance while minimizing data acquisition costs. However, current AL workflows often require human intervention and lack parallelism, leading to inefficiencies and underutilization of modern computational resources. In this work, we introduce PAL, an automated, modular, and parallel active learning library that integrates AL tasks and manages their execution and communication on shared- and distributed-memory systems using the Message Passing Interface (MPI). PAL provides users with the flexibility to design and customize all components of their active learning scenarios, including machine learning models with uncertainty estimation, oracles for ground truth labeling, and strategies for exploring the target space. We demonstrate that PAL significantly reduces computational overhead and improves scalability, achieving substantial speed-ups through asynchronous parallelization on CPU and GPU hardware. Applications of PAL to several real-world scenarios - including ground-state reactions in biomolecular systems, excited-state dynamics of molecules, simulations of inorganic clusters, and thermo-fluid dynamics - illustrate its effectiveness in accelerating the development of machine learning models. Our results show that PAL enables efficient utilization of high-performance computing resources in active learning workflows, fostering advancements in scientific research and engineering applications.


G-RAG: Knowledge Expansion in Material Science

arXiv.org Artificial Intelligence

In the field of Material Science, effective information retrieval systems are essential for facilitating research. Traditional Retrieval-Augmented Generation (RAG) approaches in Large Language Models (LLMs) often encounter challenges such as outdated information, hallucinations, limited interpretability due to context constraints, and inaccurate retrieval. To address these issues, Graph RAG integrates graph databases to enhance the retrieval process. Our proposed method processes Material Science documents by extracting key entities (referred to as MatIDs) from sentences, which are then utilized to query external Wikipedia knowledge bases (KBs) for additional relevant information. We implement an agent-based parsing technique to achieve a more detailed representation of the documents. Our improved version of Graph RAG called G-RAG further leverages a graph database to capture relationships between these entities, improving both retrieval accuracy and contextual understanding. This enhanced approach demonstrates significant improvements in performance for domains that require precise information retrieval, such as Material Science.


ChemTEB: Chemical Text Embedding Benchmark, an Overview of Embedding Models Performance & Efficiency on a Specific Domain

arXiv.org Artificial Intelligence

Recent advancements in language models have started a new era of superior information retrieval and content generation, with embedding models playing an important role in optimizing data representation efficiency and performance. While benchmarks like the Massive Text Embedding Benchmark (MTEB) have standardized the evaluation of general domain embedding models, a gap remains in specialized fields such as chemistry, which require tailored approaches due to domain-specific challenges. This paper introduces a novel benchmark, the Chemical Text Embedding Benchmark (ChemTEB), designed specifically for the chemical sciences. ChemTEB addresses the unique linguistic and semantic complexities of chemical literature and data, offering a comprehensive suite of tasks on chemical domain data. Through the evaluation of 34 open-source and proprietary models using this benchmark, we illuminate the strengths and weaknesses of current methodologies in processing and understanding chemical information. Our work aims to equip the research community with a standardized, domain-specific evaluation framework, promoting the development of more precise and efficient NLP models for chemistry-related applications. Furthermore, it provides insights into the performance of generic models in a domain-specific context. ChemTEB comes with open-source code and data, contributing further to its accessibility and utility.


AI-driven inverse design of materials: Past, present and future

arXiv.org Artificial Intelligence

The discovery of advanced materials is the cornerstone of human technological development and progress. The structures of materials and their corresponding properties are essentially the result of a complex interplay of multiple degrees of freedom such as lattice, charge, spin, symmetry, and topology. This poses significant challenges for the inverse design methods of materials. Humans have long explored new materials through a large number of experiments and proposed corresponding theoretical systems to predict new material properties and structures. With the improvement of computational power, researchers have gradually developed various electronic structure calculation methods, such as the density functional theory and high-throughput computational methods. Recently, the rapid development of artificial intelligence technology in the field of computer science has enabled the effective characterization of the implicit association between material properties and structures, thus opening up an efficient paradigm for the inverse design of functional materials. A significant progress has been made in inverse design of materials based on generative and discriminative models, attracting widespread attention from researchers. Considering this rapid technological progress, in this survey, we look back on the latest advancements in AI-driven inverse design of materials by introducing the background, key findings, and mainstream technological development routes. In addition, we summarize the remaining issues for future directions. This survey provides the latest overview of AI-driven inverse design of materials, which can serve as a useful resource for researchers.


Digital Twin in Industries: A Comprehensive Survey

arXiv.org Artificial Intelligence

Industrial networks are undergoing rapid transformation driven by the convergence of emerging technologies that are revolutionizing conventional workflows, enhancing operational efficiency, and fundamentally redefining the industrial landscape across diverse sectors. Amidst this revolution, Digital Twin (DT) emerges as a transformative innovation that seamlessly integrates real-world systems with their virtual counterparts, bridging the physical and digital realms. In this article, we present a comprehensive survey of the emerging DT-enabled services and applications across industries, beginning with an overview of DT fundamentals and its components to a discussion of key enabling technologies for DT. Different from literature works, we investigate and analyze the capabilities of DT across a wide range of industrial services, including data sharing, data offloading, integrated sensing and communication, content caching, resource allocation, wireless networking, and metaverse. In particular, we present an in-depth technical discussion of the roles of DT in industrial applications across various domains, including manufacturing, healthcare, transportation, energy, agriculture, space, oil and gas, as well as robotics. Throughout the technical analysis, we delve into real-time data communications between physical and virtual platforms to enable industrial DT networking. Subsequently, we extensively explore and analyze a wide range of major privacy and security issues in DT-based industry. Taxonomy tables and the key research findings from the survey are also given, emphasizing important insights into the significance of DT in industries. Finally, we point out future research directions to spur further research in this promising area.


Grasping and Rolling In-plane Manipulation Using Deployable Tape spring Appendages

arXiv.org Artificial Intelligence

Rigid multi-link robotic arms face a tradeoff between their overall reach distance (the workspace), and how compactly they can be collapsed (the storage volume). Increasing the workspace of a robot arm requires longer links, which adds weight to the system and requires a larger storage volume. However, the tradeoff between workspace and storage volume can be resolved by the use of deployable structures with high extensibility. In this work we introduce a bidirectional tape spring based structure that can be stored in a compact state and then extended to perform manipulation tasks, allowing for a large manipulation workspace and low storage volume. Bidirectional tape springs are demonstrated to have large buckling strength compared to single tape springs, while maintaining the ability to roll into a compact storage volume. Two tape spring structures are integrated into a bimanual manipulator robot called GRIP-tape that allows for object Grasping and Rolling In Planar configurations (GRIP). In demonstrations we show that the continuum kinematics of the tape springs enable novel manipulation capabilities such as simultaneous translation-rotation and multi-object conveyance. Furthermore, the dual mechanical properties of stiffness and softness in the tape springs enables inherent safety from unintended collisions within the workspace and soft-contact with objects. Our system demonstrates new opportunities for extensible manipulators that may benefit manipulation in remote environments such as space and the deep sea.