Goto

Collaborating Authors

 Materials


Large Language Model-Guided Prediction Toward Quantum Materials Synthesis

arXiv.org Artificial Intelligence

The synthesis of inorganic crystalline materials is essential for modern technology, especially in quantum materials development. However, designing efficient synthesis workflows remains a significant challenge due to the precise experimental conditions and extensive trial and error. Here, we present a framework using large language models (LLMs) to predict synthesis pathways for inorganic materials, including quantum materials. Our framework contains three models: LHS2RHS, predicting products from reactants; RHS2LHS, predicting reactants from products; and TGT2CEQ, generating full chemical equations for target compounds. Fine-tuned on a text-mined synthesis database, our model raises accuracy from under 40% with pretrained models, to under 80% using conventional fine-tuning, and further to around 90% with our proposed generalized Tanimoto similarity, while maintaining robust to additional synthesis steps. Our model further demonstrates comparable performance across materials with varying degrees of quantumness quantified using quantum weight, indicating that LLMs offer a powerful tool to predict balanced chemical equations for quantum materials discovery.


Retrieval-Retro: Retrieval-based Inorganic Retrosynthesis with Expert Knowledge

arXiv.org Artificial Intelligence

While inorganic retrosynthesis planning is essential in the field of chemical science, the application of machine learning in this area has been notably less explored compared to organic retrosynthesis planning. In this paper, we propose Retrieval-Retro for inorganic retrosynthesis planning, which implicitly extracts the precursor information of reference materials that are retrieved from the knowledge base regarding domain expertise in the field. Specifically, instead of directly employing the precursor information of reference materials, we propose implicitly extracting it with various attention layers, which enables the model to learn novel synthesis recipes more effectively. Moreover, during retrieval, we consider the thermodynamic relationship between target material and precursors, which is essential domain expertise in identifying the most probable precursor set among various options. Extensive experiments demonstrate the superiority of Retrieval-Retro in retrosynthesis planning, especially in discovering novel synthesis recipes, which is crucial for materials discovery. The source code for Retrieval-Retro is available at https://github.com/HeewoongNoh/Retrieval-Retro.


Asteroid Mining: ACT&Friends' Results for the GTOC 12 Problem

arXiv.org Artificial Intelligence

Global Trajectory Optimization Competitions (GTOC) [1] represent a biennial cornerstone event within the international aerospace community, dedicated to addressing the intricacies of interplanetary trajectory optimization. The 12th edition of this well established competition, held in June-July 2023, proposed a challenging design of a "sustainable asteroid mining" mission. The problem demanded the concurrent extraction of resources from a set A of 60,000 target asteroids, to be accomplished during a fixed 15 years wide window (from 2035-Jan-01 to 2050-Jan-01) by multiple spacecraft. The participating spacecraft, dispatched from Earth and possibly flying by Venus and Mars, had to be meticulously designed to maximize the quantity of mined material returned to our home planet. A comprehensive exposition of the mathematical intricacies underpinning the problem definition can be found in [2], while in this paper we will primarily provide essential definitions and selectively reference these mathematical foundations. For the purpose of clarity, we shall employ the term'ship' interchangeably with'spacecraft.' In the context of the multi-spacecraft asteroid mining mission presented in GTOC12, each ship possesses the capability to deploy a specified number of mining devices onto the asteroids' surface. Furthermore, these ships have the capacity to collect mined resources if a mining device is already in place on the visited asteroid. Importantly, each ship is not confined to gathering material exclusively from asteroids where it initially deposited a miner; it can collect resources from asteroids where miners were deployed by other ships.


TALE-teller: Tendon-Actuated Linked Element Robotic Testbed for Investigating Tail Functions

arXiv.org Artificial Intelligence

Tails serve various functions in both robotics and biology, including expression, grasping, and defense. The vertebrate tails associated with these functions exhibit diverse patterns of vertebral lengths, but the precise mechanisms linking form to function have not yet been established. Vertebrate tails are complex musculoskeletal structures, making both direct experimentation and computational modeling challenging. This paper presents Tendon-Actuated Linked-Element (TALE), a modular robotic test bed to explore how tail morphology influences function. By varying 3D printed bones, silicone joints, and tendon configurations, TALE can match the morphology of extant, extinct, and even theoretical tails. We first characterized the stiffness of our joint design empirically and in simulation before testing the hypothesis that tails with different vertebral proportions curve differently. We then compared the maximum bending state of two common vertebrate proportions and one theoretical morphology. Uniform bending of joints with different vertebral proportions led to substantial differences in the location of the tail tip, suggesting a significant influence on overall tail function. Future studies can introduce more complex morphologies to establish the mechanisms of diverse tail functions. With this foundational knowledge, we will isolate the key features underlying tail function to inform the design for robotic tails. Images and videos can be found on TALE's project page: https://www.embirlab.com/tale.


Deep Learning-Driven Microstructure Characterization and Vickers Hardness Prediction of Mg-Gd Alloys

arXiv.org Artificial Intelligence

In the field of materials science, exploring the relationship between composition, microstructure, and properties has long been a critical research focus. The mechanical performance of solid-solution Mg-Gd alloys is significantly influenced by Gd content, dendritic structures, and the presence of secondary phases. To better analyze and predict the impact of these factors, this study proposes a multimodal fusion learning framework based on image processing and deep learning techniques. This framework integrates both elemental composition and microstructural features to accurately predict the Vickers hardness of solid-solution Mg-Gd alloys. Initially, deep learning methods were employed to extract microstructural information from a variety of solid-solution Mg-Gd alloy images obtained from literature and experiments. This provided precise grain size and secondary phase microstructural features for performance prediction tasks. Subsequently, these quantitative analysis results were combined with Gd content information to construct a performance prediction dataset. Finally, a regression model based on the Transformer architecture was used to predict the Vickers hardness of Mg-Gd alloys. The experimental results indicate that the Transformer model performs best in terms of prediction accuracy, achieving an R^2 value of 0.9. Additionally, SHAP analysis identified critical values for four key features affecting the Vickers hardness of Mg-Gd alloys, providing valuable guidance for alloy design. These findings not only enhance the understanding of alloy performance but also offer theoretical support for future material design and optimization.


Addressing the Pitfalls of Image-Based Structural Health Monitoring: A Focus on False Positives, False Negatives, and Base Rate Bias

arXiv.org Artificial Intelligence

This study explores the limitations of image-based structural health monitoring (SHM) techniques in detecting structural damage. Leveraging machine learning and computer vision, image-based SHM offers a scalable and efficient alternative to manual inspections. However, its reliability is impacted by challenges such as false positives, false negatives, and environmental variability, particularly in low base rate damage scenarios. The Base Rate Bias plays a significant role, as low probabilities of actual damage often lead to misinterpretation of positive results. This study uses both Bayesian analysis and a frequentist approach to evaluate the precision of damage detection systems, revealing that even highly accurate models can yield misleading results when the occurrence of damage is rare. Strategies for mitigating these limitations are discussed, including hybrid systems that combine multiple data sources, human-in-the-loop approaches for critical assessments, and improving the quality of training data. These findings provide essential insights into the practical applicability of image-based SHM techniques, highlighting both their potential and their limitations for real-world infrastructure monitoring.


Gradient Rewiring for Editable Graph Neural Network Training

arXiv.org Artificial Intelligence

Deep neural networks are ubiquitously adopted in many applications, such as computer vision, natural language processing, and graph analytics. However, well-trained neural networks can make prediction errors after deployment as the world changes. \textit{Model editing} involves updating the base model to correct prediction errors with less accessible training data and computational resources. Despite recent advances in model editors in computer vision and natural language processing, editable training in graph neural networks (GNNs) is rarely explored. The challenge with editable GNN training lies in the inherent information aggregation across neighbors, which can lead model editors to affect the predictions of other nodes unintentionally. In this paper, we first observe the gradient of cross-entropy loss for the target node and training nodes with significant inconsistency, which indicates that directly fine-tuning the base model using the loss on the target node deteriorates the performance on training nodes. Motivated by the gradient inconsistency observation, we propose a simple yet effective \underline{G}radient \underline{R}ewiring method for \underline{E}ditable graph neural network training, named \textbf{GRE}. Specifically, we first store the anchor gradient of the loss on training nodes to preserve the locality. Subsequently, we rewire the gradient of the loss on the target node to preserve performance on the training node using anchor gradient. Experiments demonstrate the effectiveness of GRE on various model architectures and graph datasets in terms of multiple editing situations. The source code is available at \url{https://github.com/zhimengj0326/Gradient_rewiring_editing}


Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs)

arXiv.org Artificial Intelligence

Pre-trained large language models (LLMs) have been reliably integrated with visual input for multimodal tasks. We study gender bias in 22 popular open-source VLAs with respect to personality traits, skills, and occupations. Our results show that VLAs replicate human biases likely present in the data, such as real-world occupational imbalances. Similarly, they tend to attribute more skills and positive personality traits to women than to men, and we see a consistent tendency to associate negative personality traits with men. To eliminate the gender bias in these models, we find that finetuning-based debiasing methods achieve the best tradeoff between debiasing and retaining performance on downstream task. We argue for pre-deploying gender bias assessment in VLAs and motivate further development of debiasing strategies to ensure equitable societal outcomes. Rapid progress in large language models (LLMs) has sparked a wave of innovation fusing visual encoding modules with LLMs, which ...


Prediction of Final Phosphorus Content of Steel in a Scrap-Based Electric Arc Furnace Using Artificial Neural Networks

arXiv.org Artificial Intelligence

The scrap-based electric arc furnace process is expected to capture a significant share of the steel market in the future due to its potential for reducing environmental impacts through steel recycling. However, managing impurities, particularly phosphorus, remains a challenge. This study aims to develop a machine learning model to estimate the steel phosphorus content at the end of the process based on input parameters. Data were collected over two years from a steel plant, focusing on the chemical composition and weight of the scrap, the volume of oxygen injected, and process duration. After preprocessing the data, several machine learning models were evaluated, with the artificial neural network (ANN) emerging as the most effective. The best ANN model included four hidden layers. The model was trained for 500 epochs with a batch size of 50. The best model achieves a mean square error (MSE) of 0.000016, a root-mean-square error (RMSE) of 0.0049998, a coefficient of determination (R2) of 99.96%, and a correlation coefficient (r) of 99.98%. Notably, the model achieved a 100% hit rate for predicting phosphorus content within +-0.001 wt% (+-10 ppm). These results demonstrate that the optimized ANN model offers accurate predictions for the steel final phosphorus content.


HyperspectralViTs: General Hyperspectral Models for On-board Remote Sensing

arXiv.org Artificial Intelligence

On-board processing of hyperspectral data with machine learning models would enable unprecedented amount of autonomy for a wide range of tasks, for example methane detection or mineral identification. This can enable early warning system and could allow new capabilities such as automated scheduling across constellations of satellites. Classical methods suffer from high false positive rates and previous deep learning models exhibit prohibitive computational requirements. We propose fast and accurate machine learning architectures which support end-to-end training with data of high spectral dimension without relying on hand-crafted products or spectral band compression preprocessing. We evaluate our models on two tasks related to hyperspectral data processing. With our proposed general architectures, we improve the F1 score of the previous methane detection state-of-the-art models by 27% on a newly created synthetic dataset and by 13% on the previously released large benchmark dataset. We also demonstrate that training models on the synthetic dataset improves performance of models finetuned on the dataset of real events by 6.9% in F1 score in contrast with training from scratch. On a newly created dataset for mineral identification, our models provide 3.5% improvement in the F1 score in contrast to the default versions of the models. With our proposed models we improve the inference speed by 85% in contrast to previous classical and deep learning approaches by removing the dependency on classically computed features. With our architecture, one capture from the EMIT sensor can be processed within 30 seconds on realistic proxy of the ION-SCV 004 satellite.